Hi, I am new to Transformers so this might be a dumb question. I am trying to fine-tune a BERT model for multiple choice, and these are my training arguments:
TrainingArguments(**
{"warmup_ratio": 0.1,
"lr_scheduler_type": "cosine",
"optim": "adafactor",
"gradient_accumulation_steps": 1,
"learning_rate": 5e-4,
"per_device_train_batch_size": 2,
"per_device_eval_batch_size": 2,
"num_train_epochs": 2,
"output_dir": ".",
"overwrite_output_dir": true,
"load_best_model_at_end": true,
"fp16": true,
"seed": 42,
"save_total_limit": 1,
"logging_steps": 100,
"eval_steps": 100,
"evaluation_strategy": "steps",
"save_strategy": "steps",
"metric_for_best_model": "MAP@3",
"report_to": "wandb"}
)
Trainer(
model=model,
args=training_args,
tokenizer=tokenizer,
data_collator=DataCollatorForMultipleChoice(tokenizer=tokenizer),
train_dataset=tokenized_train,
eval_dataset=tokenized_test,
compute_metrics=compute_metrics,
callbacks=[EarlyStoppingCallback(early_stopping_patience=3)]
)
Where MAP@3 is a custom metric. What I noticed from my wandb logs is that the model loaded at the end has the second-best MAP@3 on the test set, not the best. This has happened in 3 different runs. Any idea why this is happening?
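In case it matters, my compute_metrics computes MAP@3 roughly like this (a simplified sketch, the function names and exact tensor shapes here are illustrative, not my exact code):

```python
import numpy as np

def map_at_3(predictions, labels):
    """Mean Average Precision at 3: each example scores 1/rank if the
    true choice is among the top-3 predictions, and 0 otherwise."""
    top3 = np.argsort(-predictions, axis=1)[:, :3]  # indices of the 3 highest logits
    scores = []
    for ranked, label in zip(top3, labels):
        hit = np.where(ranked == label)[0]          # position of the true label, if present
        scores.append(1.0 / (hit[0] + 1) if hit.size else 0.0)
    return float(np.mean(scores))

def compute_metrics(eval_pred):
    logits, labels = eval_pred                      # Trainer passes (predictions, label_ids)
    return {"MAP@3": map_at_3(np.asarray(logits), np.asarray(labels))}
```

So higher MAP@3 is better, and the key "MAP@3" matches metric_for_best_model above.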
Thank you so much