Error in Autotrain Training

Hello everyone, I am very new and I'm experimenting with the Hugging Face AutoTrain UI, but I'm having a little trouble getting training started. I am trying to train a meta-llama/Llama-3.1-8b-Instruct model with an example dataset that I found,
alpaca1k.csv
which I uploaded as a local file.
I have not changed any of the other parameters. When I then click Start Training, I get this error:

ERROR | 2025-05-08 07:39:20 | autotrain.trainers.common:wrapper:215 - train has failed due to an exception: Traceback (most recent call last):
File "/app/env/lib/python3.10/site-packages/autotrain/trainers/common.py", line 212, in wrapper
    return func(*args, **kwargs)
File "/app/env/lib/python3.10/site-packages/autotrain/trainers/clm/main.py", line 28, in train
    train_sft(config)
File "/app/env/lib/python3.10/site-packages/autotrain/trainers/clm/train_clm_sft.py", line 27, in train
    model = utils.get_model(config, tokenizer)
File "/app/env/lib/python3.10/site-packages/autotrain/trainers/clm/utils.py", line 943, in get_model
    model = AutoModelForCausalLM.from_pretrained(
File "/app/env/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
File "/app/env/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3620, in from_pretrained
    hf_quantizer.validate_environment(
File "/app/env/lib/python3.10/site-packages/transformers/quantizers/quantizer_bnb_4bit.py", line 83, in validate_environment
    validate_bnb_backend_availability(raise_exception=True)
File "/app/env/lib/python3.10/site-packages/transformers/integrations/bitsandbytes.py", line 559, in validate_bnb_backend_availability
    return _validate_bnb_cuda_backend_availability(raise_exception)
File "/app/env/lib/python3.10/site-packages/transformers/integrations/bitsandbytes.py", line 537, in _validate_bnb_cuda_backend_availability
    raise RuntimeError(log_msg)
RuntimeError: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at Installation Guide
RuntimeError: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at Installation Guide

ERROR | 2025-05-08 07:39:20 | autotrain.trainers.common:wrapper:216 - CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at Installation Guide
INFO | 2025-05-08 07:39:20 | autotrain.trainers.common:pause_space:156 - Pausing space…

I'm not sure how I can fix this. Any help is appreciated.


In some cases, the problem can be resolved by installing bitsandbytes as indicated in the error message. However, in other cases, reinstalling PyTorch and the CUDA Toolkit may be necessary.
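Before reinstalling anything, it helps to confirm whether the machine actually has a CUDA GPU at all. Here is a minimal sketch (my own, not part of AutoTrain) of such a check; the function name is illustrative:

```python
# Heuristic environment check: the default bitsandbytes build needs a
# CUDA GPU, so first confirm one is visible on this machine at all.
import shutil


def gpu_driver_on_path() -> bool:
    """Rough check: the NVIDIA driver ships the `nvidia-smi` CLI, so its
    absence from PATH usually means there is no usable CUDA GPU."""
    return shutil.which("nvidia-smi") is not None


if __name__ == "__main__":
    # If PyTorch is installed, `torch.cuda.is_available()` is the
    # authoritative check; on a CPU-only Space it returns False.
    print("nvidia-smi on PATH:", gpu_driver_on_path())
```

If this reports no GPU, installing a different bitsandbytes build will not help; the training configuration itself has to avoid CUDA-only features.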

I found a solution myself. I'm using the free plan, so there is only a CPU available and no GPU. I had to change some of the parameters. This is what I did, for anyone who is wondering:
- Distributed Backend: from ddp to deepspeed
- Mixed precision: from fp16 to none
- PEFT/LoRA: from true to false

I'm not exactly sure which change did the trick, but it's training now.
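For what it's worth, the fp16 and PEFT changes are the ones that touch CUDA-only code paths: fp16 mixed precision needs GPU support, and the PEFT/LoRA path is what pulls in 4-bit bitsandbytes quantization. A small sketch of that reasoning (the dict keys are illustrative, not AutoTrain's exact parameter schema):

```python
# Sketch of the CPU-safe parameter combination described above.
# Key names are illustrative placeholders, not AutoTrain's real schema.
cpu_params = {
    "mixed_precision": "none",  # fp16 requires GPU support
    "peft": False,              # the QLoRA path loads CUDA-only bitsandbytes
    "distributed_backend": "deepspeed",
}


def cuda_only_settings(params: dict) -> list[str]:
    """Return the settings in `params` that would require a CUDA GPU."""
    problems = []
    if params.get("mixed_precision") == "fp16":
        problems.append("fp16 mixed precision requires a GPU")
    if params.get("peft"):
        problems.append("PEFT/LoRA with 4-bit quantization needs CUDA bitsandbytes")
    return problems


print(cuda_only_settings(cpu_params))  # an empty list: nothing here needs CUDA
```

So disabling PEFT/LoRA is most likely what made the bitsandbytes error go away, since that removed the 4-bit quantization step that the traceback failed in.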


This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.