Deploying Fine-Tuned Falcon 40B with QLoRA on SageMaker: Inference Error

Hello,

We updated the training script and requirements so that the weights are merged and saved in the safetensors format during training. This means no conversion is needed on the LLM inference container side.

If you update to the new script and requirements, deploying after training should work.
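
For reference, here is a minimal sketch of what a merge-and-save step like this can look like with `peft` and `transformers`. It is not the exact code from the updated script. The function name `merge_and_save` and the two directory arguments are hypothetical, and it assumes the tokenizer was saved next to the adapter during training.

```python
def merge_and_save(adapter_dir: str, output_dir: str) -> None:
    """Merge LoRA adapter weights into the base model and save as safetensors.

    Hypothetical helper for illustration; not the updated training script itself.
    """
    # Imported lazily so the heavy dependencies are only needed when the
    # function is actually called.
    import torch
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    # Load the base model together with the trained adapter in half precision.
    model = AutoPeftModelForCausalLM.from_pretrained(
        adapter_dir,
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
        trust_remote_code=True,  # assumption: the Falcon checkpoint ships custom model code
    )

    # Fold the adapter weights into the base weights and drop the PEFT wrappers.
    merged = model.merge_and_unload()

    # safe_serialization=True writes *.safetensors files instead of *.bin.
    merged.save_pretrained(output_dir, safe_serialization=True, max_shard_size="2GB")

    # Assumption: the tokenizer files were saved into adapter_dir during training.
    tokenizer = AutoTokenizer.from_pretrained(adapter_dir)
    tokenizer.save_pretrained(output_dir)
```

In a SageMaker training job you would typically call something like this at the end of training with `output_dir` pointing at the model output directory, so the merged safetensors files end up in the model artifact.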
