Merged model doubles in size (12 GB instead of 6 GB) after fine-tuning BLOOMZ-3b with peft/LoRA

I fine-tuned the bloomz-3b model with LoRA using Hugging Face's peft library. Training itself went fine, but after merging the LoRA weights into the base model and uploading the result back to the Hub, the uploaded checkpoint is 12 GB instead of the expected ~6 GB, which makes it too big for the Hugging Face Inference API. It also times out when I call it through LangChain's HuggingFaceHub integration. How should I proceed from here?
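A back-of-envelope check makes me suspect (this is only my guess) that the merged checkpoint was saved in float32 rather than float16, which would exactly double the on-disk size for a ~3B-parameter model:

```python
# Rough on-disk size estimate for a ~3B-parameter checkpoint.
PARAMS = 3_000_000_000  # approximate parameter count of bloomz-3b

def checkpoint_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate checkpoint size in GB (1 GB = 1e9 bytes), weights only."""
    return num_params * bytes_per_param / 1e9

fp32_gb = checkpoint_gb(PARAMS, 4)  # float32: 4 bytes per parameter
fp16_gb = checkpoint_gb(PARAMS, 2)  # float16: 2 bytes per parameter
print(fp32_gb, fp16_gb)  # 12.0 6.0 -- matches the sizes I'm seeing
```

So 12 GB is exactly what a float32 dump of 3B parameters would come to, and 6 GB is the float16 size. Is the fix simply to load/merge the model in half precision before calling `save_pretrained`, or is something else going on?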