Tensors not on the same device after using BitsAndBytesConfig

Hello, I am a beginner with LLMs and I was trying to use QLoRA to fine-tune the llama3-7B model. I am working on a text classification problem, so I load the model with `AutoModelForSequenceClassification.from_pretrained(model_name, quantization_config=bnb_config, num_labels=100)`, and I use DeepSpeed ZeRO-2 to reduce the memory used on each GPU. When I remove the `quantization_config` argument, the code runs perfectly, but when I add it back, I get an error like: `RuntimeError: Expected all tensors to be on the same device`. Does anyone know what is going on here? Thanks for your time!
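For reference, here is roughly how I load the model. This is a minimal sketch: the exact `bnb_config` values shown are a typical 4-bit QLoRA setup (I am not certain these match every setup), and `model_name` is a placeholder for the actual checkpoint path.

```python
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig

# Placeholder for the llama3-7B checkpoint path or hub ID used above
model_name = "path/to/llama3-7b"

# Assumed 4-bit QLoRA-style quantization settings
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Works without quantization_config; fails with the device error when it is passed
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    num_labels=100,
)
```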