I fine-tuned the bloomz-3b model with LoRA using Hugging Face's peft library. After successfully merging the LoRA weights into the base model, I ran into a problem when uploading the merged model back to the Hugging Face Hub: its size ballooned to 12 GB instead of the expected 6 GB, which exceeds the size limit for the Hugging Face Inference API. The model also times out when I use it through the LangChain Hugging Face Hub integration, presumably because of its size. How do I proceed?
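For what it's worth, a quick back-of-the-envelope calculation (assuming the model has roughly 3 billion parameters, as the bloomz-3b name suggests) shows why I suspect the merged checkpoint ended up in float32 rather than float16:

```python
# Checkpoint size is roughly (number of parameters) x (bytes per parameter).
# Assumption: ~3B parameters, consistent with the bloomz-3b model name.
n_params = 3_000_000_000
fp32_gb = n_params * 4 / 1e9  # float32: 4 bytes/param -> ~12 GB
fp16_gb = n_params * 2 / 1e9  # float16: 2 bytes/param -> ~6 GB
print(f"fp32: {fp32_gb} GB, fp16: {fp16_gb} GB")  # fp32: 12.0 GB, fp16: 6.0 GB
```

The 12 GB figure matches float32 storage almost exactly, while the expected 6 GB matches float16, so the doubling looks like a dtype issue during merging or saving rather than duplicated weights.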