SFTTrainer too slow during the build (or ingestion) phase

I am using Unsloth to fine-tune Llama 3.1 8B on Colab with the ELI5 dataset. The trainer is far too slow to even build: this step should finish in a couple of minutes, but it has been running for hours. Here is my SFTTrainer code:

from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=ds,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    packing=False,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=10,
        max_steps=100,
        learning_rate=1e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)

I am getting about 15 examples/sec at this stage, and training has not even started yet.
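
For reference, the model and dataset are set up before the trainer roughly as follows. The checkpoint name, the dataset id, and the formatting step follow the usual Unsloth notebook pattern, so treat the exact names and field accesses as assumptions rather than my literal code:

from unsloth import FastLanguageModel
from datasets import load_dataset

max_seq_length = 2048

# Load the 4-bit Llama 3.1 8B checkpoint through Unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-bnb-4bit",  # assumed checkpoint name
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)

# ELI5-style dataset; the dataset id and field names are assumptions
ds = load_dataset("eli5_category", split="train")

def to_text(example):
    # Collapse each question and its first answer into the single "text"
    # column that dataset_text_field="text" expects
    return {"text": example["title"] + "\n" + example["answers"]["text"][0]}

ds = ds.map(to_text)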