I fine-tuned a model using QLoRA (4-bit) and pushed it to the Hub. When I try to test it with the Inference API, I get the following error:
No package metadata was found for bitsandbytes
The model:
- base model is meta-llama/Llama-2-7b-chat-hf
- fine-tuned on a custom dataset of 50 samples (I am just experimenting)
- I made the model public; it can be found here
- this is the Colab notebook I used to train it. Note that, after the QLoRA training, I merged the adapter model with the base model and THEN pushed the result to the Hub (see the sketch after this list). So the model on the Hub is a plain transformers model. Specifically, a
transformers.models.llama.modeling_llama.LlamaForCausalLM
- this is a Colab notebook that can be used for testing. Note that the test works for the base model
meta-llama/Llama-2-7b-chat-hf
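For concreteness, the merge-and-push step looks roughly like this (a sketch, not the exact notebook code; the repo names are placeholders):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-chat-hf"
adapter_id = "your-username/your-qlora-adapter"  # placeholder

# Reload the base model in half precision for merging
# (the 4-bit quantized weights themselves cannot be merged directly)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_id)
merged = model.merge_and_unload()  # fold the LoRA deltas into the base weights

tokenizer = AutoTokenizer.from_pretrained(base_id)
merged.push_to_hub("your-username/your-merged-model")  # placeholder
tokenizer.push_to_hub("your-username/your-merged-model")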
My suspicion is that the Docker container behind the Inference API does not know that it needs to install bitsandbytes. Is there a way to “tell it”? Maybe a tag in the README?
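For reference, this is roughly what the metadata block at the top of the model card looks like (the fields shown are illustrative; I don't know whether any of them actually triggers a bitsandbytes install):

---
library_name: transformers
pipeline_tag: text-generation
tags:
- llama
- qlora
---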
Getting the same error. Have you found anything related to the issue?
Unfortunately not. For now I've just given up. But let me know if you find something.
Yeah, same. I tried including a requirements.txt with every pip install listed in that file, but I still got the same error. I don't know how to figure it out.
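For context, the file was something along these lines (versions and exact package set illustrative):

transformers
accelerate
bitsandbytes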
I have the same problem. Might this be an issue on the server side? I hope someone from HF looks into it.
Facing the same issue here.
Did anyone find a solution for this?
The same here. Any advice?
Hello, I had a similar issue recently.
First, make sure you have a requirements file with the needed packages listed.
Also, check that your torch version is compatible with bitsandbytes.
Finally, if you are working with Docker, you can either specify it in the requirements file or directly in the Dockerfile by writing something like:
RUN pip install -U bitsandbytes
You can check locally by building the image with docker build and running it with docker run to see how the build goes.
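For instance, a minimal Dockerfile along those lines might look like this (base image, file names, and entry point are illustrative assumptions):

FROM python:3.10-slim

WORKDIR /app

# Install the pinned dependencies first, then make sure bitsandbytes is present
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install -U bitsandbytes

COPY . .
CMD ["python", "app.py"]

Then build and run it locally to see whether the install succeeds:

docker build -t my-inference-image .
docker run --rm my-inference-image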