Zero loss when SFT fine-tuning meta-llama/Llama-2-7b-hf after increasing my batch size

Hello, when I fine-tune the llama2-7b model with SFT, I get zero loss if I set my train batch size to 16. However, when I change my batch size to 2, the training loss takes proper values. What could be the reason? My training data has sequences of about 500 tokens.