I am fine-tuning a Llama-2-7b-hf model on my custom dataset. However, the train and eval losses are different every time I re-run the training with the Hugging Face Trainer. I set the seed prior to model training using the set_seed function and also passed the same seed as an argument to the Trainer.
I tested the same code with a Mistral model and could not reproduce this behavior there. Any idea what could cause this difference?