Validation VS Test with Transformers Trainer

Hi, newbie here. I’m fine-tuning RoBERTa-base with the code attached below and have some questions:

  1. Have I understood correctly that the training process used here will contaminate the dataset used for evaluation? Or could the validation data here be considered test data, so that I could simply do an 80/20 split? I’ve read that I need a validation set when doing hyperparameter tuning; is such tuning done behind the scenes (when calculating the loss)?
  2. When I run trainer.evaluate, will it automatically use the evaluation dataset? For final testing, should I specify the last part of the dataset, in this case split='train[90%:]'?

A lot of tutorials call the evaluation dataset “test data”, which confused me a bit. Few tutorials go through the process of first validating and then testing.

import datasets
from transformers import Trainer, TrainingArguments

train_data = datasets.load_dataset('csv', data_files = 'datasets/all_shuffled.csv', split='train[:80%]')
vali_data = datasets.load_dataset('csv', data_files = 'datasets/all_shuffled.csv', split='train[80%:90%]')

training_args = TrainingArguments(
    output_dir = 'roberta',
    per_device_train_batch_size = 4,
    gradient_accumulation_steps = 16,
    per_device_eval_batch_size = 8,
    evaluation_strategy = 'no',
    save_strategy = 'no',
    disable_tqdm = False,
    logging_steps = 8,
    fp16 = False,
    dataloader_num_workers = 8,
    run_name = 'roberta-classification',
)

trainer = Trainer(
    model = model,  # model (and tokenization) defined earlier, not shown
    args = training_args,
    train_dataset = train_data,
    eval_dataset = vali_data,
)

Just like you, all these questions came to my mind when I was trying to fine-tune my first model. I was using the test dataset as a validation dataset, but this is totally wrong. You have to split the data into three datasets and make sure that the test dataset stays unseen, to guarantee that there is no data leak.

So, let’s make sure that the validation dataset is not the same as the test dataset.
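As a plain-Python sketch of the idea (independent of the Trainer API, and assuming the 80/10/10 proportions from your code), shuffling once and cutting the data into three disjoint parts looks like this:

```python
import random

def three_way_split(rows, seed=42):
    """Shuffle once, then split 80/10/10 into train/validation/test."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = int(0.8 * len(rows))
    n_val = int(0.9 * len(rows))
    # Validation is used during tuning; test is held out until the very end.
    return rows[:n_train], rows[n_train:n_val], rows[n_val:]

train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))  # 80 10 10
```

With datasets.load_dataset this is what the three slices 'train[:80%]', 'train[80%:90%]' and 'train[90%:]' give you, as long as the CSV was shuffled beforehand (your all_shuffled.csv).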

The evaluate method takes the tokenized test dataset as a parameter. I tried this with BERT, but it should work the same for your model.
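To make the dataset-selection behaviour concrete, here is a simplified sketch (not the real Trainer implementation) of the pattern it follows: called with no argument, evaluate falls back to the eval_dataset given at construction; passing a dataset explicitly overrides it, which is how you run the final test on the held-out split.

```python
class TrainerSketch:
    """Toy stand-in illustrating how Trainer.evaluate picks its dataset."""

    def __init__(self, eval_dataset=None):
        self.eval_dataset = eval_dataset

    def evaluate(self, eval_dataset=None):
        # Explicit argument wins; otherwise use the dataset from __init__.
        dataset = eval_dataset if eval_dataset is not None else self.eval_dataset
        return f"evaluating on {dataset}"

t = TrainerSketch(eval_dataset="validation")
print(t.evaluate())        # evaluating on validation
print(t.evaluate("test"))  # evaluating on test
```

So with the real Trainer you would call trainer.evaluate() during tuning and trainer.evaluate(eval_dataset=test_data) exactly once at the end.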

I hope I answered your question.