Chapter 7 questions

I have a question regarding fine-tuning a causal LM (Llama 2) with the Trainer. I set the seed to 42, but the training and evaluation losses differ across runs when fine-tuning Llama 2 on my dataset. This behavior only shows up with Llama 2 (with Mistral the losses were always identical). Could this be related to the model's generation_config? Is there any non-deterministic behavior within the training loop?
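
For context, here is a minimal sketch of the kind of seeding setup I mean (the dataset variables are placeholders for my tokenized splits; `enable_full_determinism` and `data_seed` are the Trainer-side knobs I'm referring to):

```python
# Minimal sketch of a fully seeded Trainer run.
# Placeholders: my_train_dataset / my_eval_dataset stand in for my tokenized dataset.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from transformers.trainer_utils import enable_full_determinism

# enable_full_determinism(42) goes beyond set_seed(42): it also switches PyTorch/CUDA
# into deterministic mode (torch.use_deterministic_algorithms, cuBLAS workspace config).
enable_full_determinism(42)

checkpoint = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

args = TrainingArguments(
    output_dir="llama2-finetune",
    seed=42,       # RNG for model init, dropout, etc. inside the Trainer
    data_seed=42,  # RNG for data shuffling / sampling order
    per_device_train_batch_size=4,
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=my_train_dataset,  # placeholder
    eval_dataset=my_eval_dataset,    # placeholder
)
trainer.train()
```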