Hello everyone😊,
I’d like to test the model in the free CPU environment; do you have any suggestions?
I’m encountering an error when trying to deploy the Qwen1.5-0.5B-Chat model in my Hugging Face Space running on the free, CPU-only hardware.
MyQwen1.5 0.5B Chat - a Hugging Face Space by funme
Thank you
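For context, here is a minimal sketch of how I understand CPU-only loading should look. This is an assumption on my part: the traceback suggests the app points at a GPTQ-quantized checkpoint (e.g. Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int4), which needs a GPU, whereas the plain Qwen/Qwen1.5-0.5B-Chat checkpoint should load on CPU. The `load_chat_model` helper below is my own illustrative wrapper, not code from the Space:

```python
def load_chat_model(model_id: str = "Qwen/Qwen1.5-0.5B-Chat"):
    """Load tokenizer and model on CPU, full precision, no quantizer involved."""
    # Lazy imports so the helper can be defined without transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float32,  # CPU-safe dtype; avoids any GPU requirement
        device_map="cpu",
    )
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_chat_model()
    # Build a chat prompt and generate a short reply on CPU.
    prompt = tokenizer.apply_chat_template(
        [{"role": "user", "content": "Hello!"}],
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

If the quantized checkpoint is genuinely needed, I understand GPTQ models cannot run on the free CPU tier at all, so switching to the unquantized model (or GGUF via llama.cpp) seems to be the usual workaround.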
Here is the full log:
tokenizer_config.json: 0%| | 0.00/1.29k [00:00<?, ?B/s]
tokenizer_config.json: 100%|██████████| 1.29k/1.29k [00:00<00:00, 7.24MB/s]
vocab.json: 0%| | 0.00/2.78M [00:00<?, ?B/s]
vocab.json: 100%|██████████| 2.78M/2.78M [00:00<00:00, 27.1MB/s]
merges.txt: 0%| | 0.00/1.67M [00:00<?, ?B/s]
merges.txt: 100%|██████████| 1.67M/1.67M [00:00<00:00, 31.1MB/s]
tokenizer.json: 0%| | 0.00/7.03M [00:00<?, ?B/s]
tokenizer.json: 100%|██████████| 7.03M/7.03M [00:00<00:00, 58.3MB/s]
config.json: 0%| | 0.00/1.26k [00:00<?, ?B/s]
config.json: 100%|██████████| 1.26k/1.26k [00:00<00:00, 7.28MB/s]
Traceback (most recent call last):
File "/home/user/app/app.py", line 9, in <module>
model = AutoModelForCausalLM.from_pretrained(
File "/usr/local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 571, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 309, in _wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4389, in from_pretrained
hf_quantizer.validate_environment(
File "/usr/local/lib/python3.10/site-packages/transformers/quantizers/quantizer_gptq.py", line 65, in validate_environment
raise RuntimeError("GPU is required to quantize or run quantize model.")
RuntimeError: GPU is required to quantize or run quantize model.