I have fine-tuned a model using QLoRA (4-bit) and pushed it to the Hub. Then I tried to test it with the Inference API, but I get the following error:
No package metadata was found for bitsandbytes
The model:
base model is meta-llama/Llama-2-7b-chat-hf
fine-tuned on a custom dataset of 50 samples (I am just experimenting)
this is the colab notebook I used to train it. Note that, after the QLoRA training, I merged the adapter model with the base model and THEN pushed the result to the hub, so the model on the hub is a plain transformers model, specifically a transformers.models.llama.modeling_llama.LlamaForCausalLM (see the sketch after this list)
this is a colab notebook that can be used for testing. Note that the test works for the base model meta-llama/Llama-2-7b-chat-hf
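For context, the merge-and-push step looks roughly like this (a minimal sketch assuming the adapter was trained with PEFT; the adapter and merged repo names are placeholders, not my actual repos):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-chat-hf"
adapter_id = "my-user/my-qlora-adapter"        # placeholder: repo with the trained LoRA adapter
merged_id = "my-user/llama-2-7b-chat-merged"   # placeholder: target repo for the merged model

# Load the base model in half precision (not 4-bit) so the merge produces plain weights
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_id)

# Fold the LoRA weights into the base model and push the resulting LlamaForCausalLM
merged = model.merge_and_unload()
merged.push_to_hub(merged_id)

tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.push_to_hub(merged_id)
```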
My suspicion is that the Docker container behind the Inference API does not know that it needs to install bitsandbytes. Is there a way to "tell" it? Maybe a tag in the README?
First, make sure you have a requirements file that lists the needed packages.
Also, check that your torch version is compatible with bitsandbytes.
Finally, if you are working with Docker, you can either add it to the requirements file or install it directly in the Dockerfile with something like:
RUN pip install -U bitsandbytes
You can check locally by building the image with docker build and then starting a container with docker run to verify that everything installs correctly.
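As a rough sketch of the two options and the local check (the file contents and image tag here are assumptions, not what the Inference API actually uses):

```text
# requirements.txt (hypothetical contents)
torch
transformers
accelerate
bitsandbytes
```

```bash
# Build the image from your Dockerfile, then run a one-off container
# to confirm that bitsandbytes is actually importable inside it.
docker build -t my-inference-image .
docker run --rm my-inference-image python -c "import bitsandbytes; print(bitsandbytes.__version__)"
```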
I went through the exact same process (fine-tune, merge, push).
Same problem here and same error using the Inference API.
I have also tried to deploy an Inference Endpoint and I get another error: Calling `cuda()` is not supported for `4-bit` or `8-bit` quantized models.
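For what it's worth, that second error is raised when a quantized model gets moved with .cuda() or .to("cuda") after loading; a 4-bit model is normally placed on the GPU via device_map at load time instead. A minimal sketch of that loading pattern (the repo name is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "my-user/llama-2-7b-chat-merged"  # placeholder for the fine-tuned repo

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# device_map places the quantized weights on the GPU at load time;
# do NOT call model.cuda() or model.to("cuda") afterwards.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```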
Same issue, same message on the Inference API (No package metadata was found for bitsandbytes). I tried adding transformers to requirements.txt, but that did not work either.