Hello,
I am currently trying to perform full fine-tuning of the ai-forever/mGPT model (1.3B parameters) on a single A100 GPU (40GB VRAM) in Google Colab. However, training is very slow: ~0.06 it/s.
I was wondering whether this is the expected training speed or whether there is some issue with my code. And if it is an issue, what would a possible fix be?
Here is my code:
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# model and tokenizer (ai-forever/mGPT) are loaded earlier; that part is not shown here

# Lithuanian ("lt") subset of C4
dataset = load_dataset("allenai/c4", "lt")
train_dataset = dataset["train"]
eval_dataset = dataset["validation"]

# keep only a slice of each split
train_dataset = train_dataset.take(10000)
eval_dataset = eval_dataset.take(1000)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = train_dataset,
    eval_dataset = eval_dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(
        gradient_accumulation_steps = 4,
        gradient_checkpointing = True,
        num_train_epochs = 3,
        learning_rate = 2e-4,
        per_device_train_batch_size = 4,
        per_device_eval_batch_size = 4,
        seed = 99,
        output_dir = "./checkpoints",
        save_strategy = "steps",
        eval_strategy = "steps",
        save_steps = 0.1,
        eval_steps = 0.1,
        logging_steps = 0.1,
        load_best_model_at_end = True,
    ),
)

trainer_stats = trainer.train()
And the trainer output:
It says it will take ~10 hours to process the 10k examples from the C4 dataset. Is this normal?
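For context, here is a rough back-of-the-envelope check of that estimate, assuming the progress bar's "it/s" counts optimizer steps (i.e. steps after gradient accumulation), which I believe is how the Trainer reports it:

# Sanity check of the reported ETA (assumption: "it/s" counts optimizer steps)
examples_per_step = 4 * 4                      # per_device_train_batch_size * gradient_accumulation_steps
steps_per_epoch = 10_000 // examples_per_step  # 625
total_steps = steps_per_epoch * 3              # 1875 optimizer steps for 3 epochs
eta_hours = total_steps / 0.06 / 3600          # ~8.7 h, roughly matching the reported ~10 h
print(total_steps, round(eta_hours, 1))

So the ETA itself is consistent with the measured step rate; what I am unsure about is whether ~0.06 optimizer steps/s is a reasonable rate for this setup in the first place.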