Merged model doubles in size (12 GB instead of 6 GB) after fine-tuning BLOOMZ-3b with peft/LoRA

I fine-tuned the bloomz-3b model with LoRA using Hugging Face's peft library. Training itself went fine, but after merging the LoRA weights into the base model and uploading the result back to the Hub, the uploaded checkpoint is 12 GB instead of the expected ~6 GB, which makes it too big for the Hugging Face Inference API. It also times out when I call it through LangChain's HuggingFaceHub integration. How should I proceed from here?
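A back-of-envelope check makes me suspect (this is only my guess) that the merged checkpoint was saved in float32 rather than float16, which would exactly double the on-disk size for a ~3B-parameter model:

```python
# Rough on-disk size estimate for a ~3B-parameter checkpoint.
PARAMS = 3_000_000_000  # approximate parameter count of bloomz-3b

def checkpoint_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate checkpoint size in GB (1 GB = 1e9 bytes), weights only."""
    return num_params * bytes_per_param / 1e9

fp32_gb = checkpoint_gb(PARAMS, 4)  # float32: 4 bytes per parameter
fp16_gb = checkpoint_gb(PARAMS, 2)  # float16: 2 bytes per parameter
print(fp32_gb, fp16_gb)  # 12.0 6.0 -- matches the sizes I'm seeing
```

So 12 GB is exactly what a float32 dump of 3B parameters would come to, and 6 GB is the float16 size. Is the fix simply to load/merge the model in half precision before calling `save_pretrained`, or is something else going on?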