Model size doubles after fine-tuning BLOOMZ-3b with the peft library

I fine-tuned the bloomz-3b model with LoRA using Hugging Face's peft library. Everything worked, but after merging the LoRA weights into the base model and uploading it back to the Hugging Face Hub, the uploaded model is 12 GB instead of the expected 6 GB, which makes it too big for the hosted Inference API. Requests through the LangChain HuggingFaceHub wrapper also time out, which does not happen with the bloomz-3b base model.
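My guess is a dtype issue: the jump from 6 GB to 12 GB is exactly what you'd expect if the merged model were serialized in float32 rather than the float16 the base checkpoint ships in. A quick back-of-the-envelope check (pure arithmetic, no model needed, assuming the weights dominate the file size):

```python
# Rough checkpoint sizes for a 3B-parameter model at different dtypes.
params = 3_000_000_000
fp16_gb = params * 2 / 1e9  # 2 bytes per float16 weight
fp32_gb = params * 4 / 1e9  # 4 bytes per float32 weight
print(f"fp16: {fp16_gb:.0f} GB, fp32: {fp32_gb:.0f} GB")  # fp16: 6 GB, fp32: 12 GB
```

If that is indeed the cause, would loading the base model with `torch_dtype=torch.float16` before merging (or casting back to half precision before `save_pretrained`) bring the checkpoint back down to ~6 GB?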