Deploying Fine-Tuned Falcon 40B with QLoRA on SageMaker: Inference Error

I didn’t get 7B working with the TGI 0.8.2 container image (763104351884.dkr.ecr.eu-west-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.0.0-tgi0.8.2-gpu-py39-cu118-ubuntu20.04).

Building the latest TGI container image for SageMaker (GitHub - huggingface/text-generation-inference at v0.9.3), combined with the other instructions I described above, is what made 7B work for me.
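In case it helps, here is a rough sketch of how building and pushing that image could look (untested; the `<account-id>` and `<region>` placeholders and the `tgi-sagemaker` repository name are my own, and the `--target sagemaker` build stage name should be checked against the v0.9.3 Dockerfile):

```shell
# Sketch (untested): build TGI v0.9.3 with its SageMaker entrypoint
# and push it to your own ECR repository.
# <account-id> and <region> are placeholders -- substitute your own values.
git clone --branch v0.9.3 https://github.com/huggingface/text-generation-inference
cd text-generation-inference

# Build the SageMaker-specific image stage (assumed stage name: "sagemaker")
docker build --target sagemaker -t tgi-sagemaker:0.9.3 .

# Log in to ECR, then tag and push the image
aws ecr get-login-password --region <region> \
  | docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com
docker tag tgi-sagemaker:0.9.3 <account-id>.dkr.ecr.<region>.amazonaws.com/tgi-sagemaker:0.9.3
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/tgi-sagemaker:0.9.3
```

The resulting ECR image URI can then be passed as the `image_uri` when creating the SageMaker model, instead of the prebuilt 0.8.2 image above.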

I don’t have the time right now to train a 40B model with my instructions.

If you have time, maybe you could try my instructions with 7B or 40B to validate them?
