Hi, I am trying to fine-tune meta-llama/Llama-3.2-1B-Instruct. I loaded the model in 4-bit precision with the Transformers library and applied LoRA using the PEFT library together with TRL. The problem appears as soon as the training step starts: I consistently run out of memory, and I don't know why.
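For context, this is roughly how I load the model and attach the adapters. It is a simplified sketch: the quantization settings and LoRA hyperparameters (lora_alpha, dropout) are illustrative, but r=8 on the q/k/v projections is consistent with the trainable-parameter count I mention at the end.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-3.2-1B-Instruct"

# Load the base model with 4-bit (NF4) quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach LoRA adapters to the attention projections
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports 1,179,648 trainable params with this config
```

These are my training arguments: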
```python
training_args = SFTConfig(
    output_dir='/content/results',
    num_train_epochs=5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    bf16=True,
    logging_steps=50,
    eval_strategy='steps',
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,
    warmup_steps=100,
    weight_decay=0.01,
    logging_dir="/content/logs",
    packing=True,
    report_to="none",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=templated_dataset['train'],
    eval_dataset=templated_dataset['test'],
    args=training_args,
    tokenizer=tokenizer,
)
```
The sequence length is 2048 and there are 1,179,648 trainable (LoRA) parameters. By my calculation the training should need around 3.57 GB, but with the 15 GB I have available I am still running out of memory. I don't know if something is wrong with my training-argument configuration. Can you help me, please? Thanks in advance.
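For reference, the parameter-related side of the memory budget can be counted roughly like this (a back-of-the-envelope sketch with approximate numbers; it deliberately leaves out activation memory, which is the part I am least sure how to estimate):

```python
# Rough count of parameter-related memory only. Assumptions: ~1.24B base
# parameters, 4-bit weights at ~0.5 bytes/param, LoRA weights and gradients
# in fp32 plus two fp32 AdamW moments per trainable parameter.
base_params = 1_235_814_400   # approximate parameter count of Llama-3.2-1B
lora_params = 1_179_648       # trainable LoRA parameters

base_gb = base_params * 0.5 / 1024**3          # ~0.58 GB of quantized weights
lora_gb = lora_params * (4 + 4 + 8) / 1024**3  # ~0.02 GB of weight + grad + Adam states

print(f"base weights: {base_gb:.2f} GB, LoRA training states: {lora_gb:.3f} GB")
# Not counted here: activations for the packed 2048-token sequences (all kept
# for the backward pass when gradient checkpointing is off) and the logits
# over the ~128k-token vocabulary, both of which grow with sequence length.
```

Even with a generous margin on top of this I expected to stay well under 15 GB.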