Pushing a quantized (4-bit) model to the Hub

Hi. I trained a model in 4-bit. After training, I tried to push the model to the Hub and got this error message:

NotImplementedError: You are calling save_pretrained on a 4-bit converted model. This is currently not supported.

Although the error message says ‘save_pretrained’, my code calls ‘push_to_hub’:
base_model.push_to_hub(/)

When will this function be supported? Thank you very much!!


push_to_hub uses save_pretrained to write the model to disk first (I suspect) and then uploads the saved files to the cloud (the HF Hub). That would explain why the error mentions save_pretrained.
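
To illustrate that suspicion, here is a minimal toy sketch of the delegation. It is not the transformers implementation: `ToyModel`, `try_push`, the `weights.bin` file name and the repo name are all hypothetical, and the "upload" step only lists the saved files. It shows why an error raised inside save_pretrained would surface from a push_to_hub call.

```python
import tempfile
from pathlib import Path


class ToyModel:
    """Hypothetical stand-in for a model class; NOT the transformers code."""

    def __init__(self, is_4bit=False):
        self.is_4bit = is_4bit

    def save_pretrained(self, save_directory):
        # Mimics the behaviour reported in this thread: saving a 4-bit model is refused.
        if self.is_4bit:
            raise NotImplementedError(
                "You are calling save_pretrained on a 4-bit converted model."
            )
        Path(save_directory, "weights.bin").write_bytes(b"\x00")

    def push_to_hub(self, repo_id):
        # Step 1: serialize to a temporary local directory via save_pretrained.
        with tempfile.TemporaryDirectory() as tmp_dir:
            self.save_pretrained(tmp_dir)
            # Step 2: "upload" the files (this toy version only lists them).
            return sorted(p.name for p in Path(tmp_dir).iterdir())


def try_push(model):
    """Call push_to_hub and report either the uploaded files or the error."""
    try:
        files = model.push_to_hub("user/toy-repo")  # hypothetical repo name
        return f"uploaded {files}"
    except NotImplementedError as err:
        return f"failed: {err}"


print(try_push(ToyModel(is_4bit=False)))
print(try_push(ToyModel(is_4bit=True)))
```

In this toy version the second call fails with the save_pretrained message even though only push_to_hub was called, which matches the symptom described above.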

Thanks for the comment, but that still doesn’t solve the issue. I tried again and got the same error message. I’m not sure whether 4-bit models are still unsupported (I doubt it). If they are supported, what am I doing wrong?

Can someone please provide an answer or some insight into this issue?

Thanks!

It is supported for sure. My problem was solved after I used peft 0.4.0, I think. Cheers

I have peft==0.4.0 and I get the same error message when trying to save a 4-bit converted model.
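
Since replies here depend on which versions are installed, a quick way to report them is sketched below. It uses only the standard library. The package list is an assumption: peft comes from this thread, while transformers and bitsandbytes are the packages typically involved in 4-bit loading. `installed_versions` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("peft", "transformers", "bitsandbytes")):
    """Return {package name: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

Posting this output alongside the error makes it easier to tell whether a difference in behaviour comes from the library versions or from the code.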


The docs aren’t entirely clear, but my read is that 8-bit is possible but 4-bit is not:

Note that once a model has been loaded in 4-bit it is currently not possible to push the quantized weights on the Hub. Note also that you cannot train 4-bit weights as this is not supported yet. However you can use 4-bit models to train extra parameters, this will be covered in the next section.
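
If I read the quoted note correctly, the route it points to is training extra parameters (for example LoRA adapters via peft) on top of the 4-bit base model and pushing only those adapters. Below is a hedged sketch of that idea, not a confirmed fix for the error above. `push_adapter_only` and both of its arguments are hypothetical, the LoRA values are illustrative, and it assumes a recent peft and transformers, bitsandbytes, a GPU, and a base architecture for which peft can pick default target modules.

```python
def push_adapter_only(base_model_id, adapter_repo_id):
    """Load a base model in 4-bit, attach LoRA adapters, and push only the adapters.

    Both arguments are hypothetical placeholders supplied by the caller.
    """
    # Imported lazily so this function can be defined without the heavy dependencies.
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # Load the frozen base model with 4-bit quantized weights.
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    )
    base_model = prepare_model_for_kbit_training(base_model)

    # Add trainable LoRA parameters on top (illustrative values, an assumption).
    lora_config = LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM")
    peft_model = get_peft_model(base_model, lora_config)

    # ... train peft_model here ...

    # Push the PEFT wrapper rather than base_model: this uploads the adapter
    # weights and config, not the 4-bit base weights.
    peft_model.push_to_hub(adapter_repo_id)
```

The design choice here is to leave the quantized base weights alone and publish only the small adapter, which can later be loaded on top of the original base model.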
