Push 4-bit converted model to hub

Hey there,

I am quantizing a model to 4-bit using bitsandbytes, and when I try to push the model to the Hub I get the following error:

> You are calling `save_pretrained` on a 4-bit converted model. This is currently not supported
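For reference, here is a minimal sketch that reproduces the situation. It assumes `transformers` (with `BitsAndBytesConfig`) and `bitsandbytes` are installed; the model id and repo name are placeholders, not the ones I actually used:

```python
# Minimal reproduction sketch. "facebook/opt-350m" and "my-user/opt-350m-4bit"
# are placeholder ids for illustration only.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Standard 4-bit (NF4) loading config for bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=bnb_config,
)

# This is the call that fails with the error above, because serializing
# 4-bit converted weights was not supported at the time of writing:
model.push_to_hub("my-user/opt-350m-4bit")
```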

I’ve pushed a 4-bit converted model to the Hub before after finetuning it with PEFT, but I was wondering whether I can do it without going through that path.

I know GPTQ gives me more options, but it only works for text models.

Alternatively, I could go through finetuning and use methods like pruning or distillation to make the model smaller, but I am wondering if there is a workaround for pushing a 4-bit converted model to the Hub.

The following PRs handle this issue:

I will just wait until they are merged :slight_smile:

