Saved model is much larger than the pretrained checkpoint downloaded from the Hub

Hi, I have noticed that when I download a model, say Falcon 1B, the downloaded .bin file is 2.63 GB, but when I save the same model using model.save_pretrained() the result is about 5 GB. Any idea why?
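One likely explanation, offered as a sketch rather than a confirmed diagnosis: if the Hub checkpoint stores 16-bit weights and the model is loaded as float32 (the default in many transformers versions when no dtype is requested), save_pretrained() writes 4 bytes per parameter instead of 2, which roughly doubles the file. The arithmetic below infers the parameter count from the 2.63 GB figure. That the downloaded file is 16-bit is an assumption, and the helper names are made up for illustration.

```python
# Bytes used per parameter for common dtypes.
BYTES_PER_PARAM = {"float16": 2, "bfloat16": 2, "float32": 4}


def estimate_params(file_size_gb: float, dtype: str) -> float:
    """Estimate the parameter count from a checkpoint size in decimal GB."""
    return file_size_gb * 1e9 / BYTES_PER_PARAM[dtype]


def estimate_size_gb(num_params: float, dtype: str) -> float:
    """Estimate the checkpoint size in decimal GB for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: the 2.63 GB file from the Hub holds 16-bit weights.
params = estimate_params(2.63, "float16")
print(f"estimated parameters: {params / 1e9:.2f}B")
print(f"same weights saved as float32: {estimate_size_gb(params, 'float32'):.2f} GB")
```

Under that assumption the float32 estimate is twice the download, close to the ~5 GB observed. Non-weight data in the file is ignored here.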

After full fine-tuning (no LoRA), the saved model is also slower than the original downloaded model.

When I load the model directly onto CUDA with from_pretrained, the issue goes away: it loads and saves at the same size as the original, with the same memory use and tokens-per-second speed.

The problem arises when you load without a device map (the default) and then move the model to CUDA with model.to(device="cuda").
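Here is a minimal sketch of how one might check this and keep the saved size in line with the download. It assumes the cause is the dtype rather than the device: .to("cuda") moves tensors but does not change their dtype, so a model loaded as float32 stays float32 after the move. The helpers describe and load_and_save are hypothetical names, and torch_dtype=torch.float16 is an assumption about the checkpoint's precision (newer transformers releases may prefer the dtype= spelling).

```python
import torch


def describe(model: torch.nn.Module) -> dict:
    """Report the dtype, device type, and total parameter bytes of a model."""
    first = next(model.parameters())
    n_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
    return {"dtype": first.dtype, "device": first.device.type, "bytes": n_bytes}


def load_and_save(model_id: str, out_dir: str, device: str = "cuda") -> dict:
    """Load in 16-bit, move to the device, and save; returns describe() output."""
    # Imported lazily so describe() works without transformers installed.
    from transformers import AutoModelForCausalLM

    # Assumption: the checkpoint is float16; requesting it avoids a float32 upcast.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
    model.to(device)  # changes the device only, not the dtype
    info = describe(model)
    model.save_pretrained(out_dir)  # written in the model's current dtype
    return info
```

Calling describe(model) right before save_pretrained() shows whether the weights are float32. If training has to run in float32, casting with model.half() just before saving is another option, at the cost of reduced precision in the saved weights.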
