Memory requirements

Hi, I am trying to fine-tune meta-llama/Llama-3.2-1B-Instruct. I loaded the model in 4-bit precision with the Transformers library and applied LoRA using the PEFT library and TRL. The issue comes when I start the training step: I constantly run out of memory, and I don’t know why. These are my training arguments:

training_args = SFTConfig(
    output_dir='/content/results',
    num_train_epochs=5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    bf16=True,
    logging_steps=50,
    eval_strategy='steps',
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,
    warmup_steps=100,
    weight_decay=0.01,
    logging_dir="/content/logs",
    packing=True,
    report_to="none"
)

trainer = SFTTrainer(
    model=model,
    train_dataset=templated_dataset['train'],
    eval_dataset=templated_dataset['test'],
    args=training_args,
    tokenizer=tokenizer,
)
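For reference, the 4-bit + LoRA setup looks roughly like this. It is a sketch: the exact BitsAndBytesConfig and LoraConfig values below (quant type, r, alpha, dropout, target modules) are assumptions, not necessarily the ones I used.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit quantization config (NF4 values are an assumption)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")

# Casts norms/embeddings appropriately and enables input grads for k-bit training
model = prepare_model_for_kbit_training(model)

# LoRA config (hyperparameters here are illustrative)
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```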

The sequence length is 2048, and there are 1,179,648 trainable (LoRA) parameters. I estimated that I would need around 3.57 GB, but I run out of memory even with the 15 GB I have. I don’t know if there is something wrong with my training argument configuration. Can you help me, please? Thanks in advance.
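To sanity-check my estimate, here is the back-of-the-envelope arithmetic in plain Python. All multipliers are rough assumptions (0.5 byte/param for 4-bit weights, fp32 Adam states for the LoRA params, and a crude per-layer activation multiplier), so treat the numbers as orders of magnitude, not measurements:

```python
# Rough memory estimate for QLoRA fine-tuning; every multiplier
# below is a back-of-the-envelope assumption, not a measured value.

base_params = 1.24e9     # ~1.24B parameters in Llama-3.2-1B
lora_params = 1_179_648  # trainable LoRA parameters from the post

# 4-bit quantized base weights: ~0.5 byte per parameter
base_gb = base_params * 0.5 / 1e9

# LoRA params: bf16 weights (2) + bf16 grads (2) + fp32 Adam m/v (4+4)
lora_gb = lora_params * (2 + 2 + 4 + 4) / 1e9

# Activations scale with batch * seq_len * hidden_size * num_layers;
# "12 bf16 tensors per layer" is a crude stand-in multiplier.
batch, seq_len, hidden, layers = 1, 2048, 2048, 16
act_gb = batch * seq_len * hidden * layers * 12 * 2 / 1e9

total_gb = base_gb + lora_gb + act_gb
print(f"weights ~{base_gb:.2f} GB, LoRA states ~{lora_gb:.3f} GB, "
      f"activations ~{act_gb:.2f} GB, total ~{total_gb:.2f} GB")
```

Even under these assumptions the LoRA optimizer states are tiny; the activation term dominates and grows quickly without gradient checkpointing, which may be where my 3.57 GB estimate falls short.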
