Training stops while fine-tuning Llama2-7B with AutoTrain Advanced

Hello all, I am in the process of fine-tuning Llama2-7b-hf using HuggingFace’s AutoTrain Advanced for a text-to-text generation problem.

Each time, the project is created successfully, which suggests that the input data is at least roughly okay (I followed the guidelines on dataset format).
However, once that’s done, the AutoTrain process transitions from “Queued” to “Starting” to “Processing” to “Training”, but just a few seconds into the “Training” step it abruptly stops without providing any explanation or logs whatsoever.

Could someone please guide me in the right direction?
I’ve been trying to resolve this issue for almost a week now, and I’m running out of ideas.
Is there a way to find out more about what is going wrong? Has anyone else encountered a similar issue before?
Any assistance would be immensely appreciated! Thank you :slight_smile: