Technical clarification on the validation data vs. the training data in the Trainer API

Hi,

I am using the Trainer API to fine-tune my models, but I realized I wanted to clarify something about how the training and evaluation datasets are used, as they appear in:

from transformers import Trainer

trainer = Trainer(
    model,
    args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['test'],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
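
For reference, I am assuming an args along these lines (the values are purely illustrative, and the parameter may be called eval_strategy rather than evaluation_strategy in newer transformers versions):

from transformers import TrainingArguments

# Illustrative values only; evaluation_strategy="epoch" is what I believe
# triggers an evaluation pass over eval_dataset at the end of each epoch.
args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    num_train_epochs=3,
    per_device_train_batch_size=16,
)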

My understanding is the following:

For each batch of data from the training set:

  1. The loss is computed ONLY for that batch. Then gradient descent (or another optimization algorithm) tweaks the current parameters to make the loss smaller at the next iteration (batch).
  2. The loop moves on to the next batch of the training data.
  3. At the end of the epoch, the current model (with the weights updated by steps 1 and 2 over the whole epoch) is applied to the full eval_dataset, predictions are computed, and metrics (say “accuracy” or “precision”) are printed to the console (see the sketch after the next paragraph).

In other words, the eval_dataset is NEVER used for training. Its only purpose is to provide (at the cost of “consuming” some of the data) a rough measure of the out-of-sample error rate. Training stops only when the specified number of epochs has been completed.
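
To make my mental model concrete, here is a minimal sketch in plain PyTorch of what I believe is happening (this is NOT the actual Trainer internals; model, optimizer, train_loader, eval_loader, and num_epochs are assumed to be defined elsewhere):

import torch

for epoch in range(num_epochs):
    # Training: gradients and weight updates come from train batches only.
    model.train()
    for batch in train_loader:
        outputs = model(**batch)
        outputs.loss.backward()   # loss and gradients from THIS batch only
        optimizer.step()          # tweak parameters to reduce the loss
        optimizer.zero_grad()

    # Evaluation: forward passes only, no gradients, no weight updates.
    model.eval()
    with torch.no_grad():
        for batch in eval_loader:
            outputs = model(**batch)  # predictions used only for metrics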

Is that 100% correct?
Thanks!

Of course, I know that usually the validation data is not used for training. I just want to be sure that this is the case here as well. Using the Trainer API is a bit more opaque than using my own splits…
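
For comparison, this is what I mean by my own splits; a sketch assuming tokenized_datasets['train'] is a datasets.Dataset (the test_size and seed values are arbitrary):

# Make the train/validation split explicit before handing anything to Trainer.
split = tokenized_datasets['train'].train_test_split(test_size=0.1, seed=42)

trainer = Trainer(
    model,
    args,
    train_dataset=split['train'],   # used for gradient updates
    eval_dataset=split['test'],     # used only for metrics
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)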

Any clarification would be greatly welcome! Thanks