Is Eval and Validation same in Trainer API?

moma1820 · September 13, 2021, 7:01pm

Hi, I am a bit confused about whether the eval_dataset parameter is used during training.

# The Trainer itself (model, args, datasets, tokenizer, etc. are defined earlier).
from transformers import Trainer

trainer = Trainer(
    model,
    args,
    train_dataset=tokenized_datasets_train,
    eval_dataset=tokenized_datasets_val,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
    data_collator=data_collator_,
)

Is the eval_dataset only used when we call trainer.evaluate()?

sgugger · September 14, 2021, 12:42am

Yes, it’s the default dataset used for that method (and that method is also called during training if you pass an eval_strategy to evaluate every epoch or every n steps).
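As a rough sketch of the two evaluation schedules mentioned here: this assumes a recent transformers release where the argument is named eval_strategy (older releases called it evaluation_strategy). The output directory, the step count, and the helper build_training_args are placeholders for illustration, not taken from the thread.

```python
# Keyword arguments for TrainingArguments; all values are illustrative placeholders.
eval_every_epoch = {
    "output_dir": "out",        # hypothetical output directory
    "eval_strategy": "epoch",   # evaluate on eval_dataset at the end of each epoch
}

eval_every_n_steps = {
    "output_dir": "out",
    "eval_strategy": "steps",   # evaluate on eval_dataset every `eval_steps` training steps
    "eval_steps": 500,          # placeholder value for n
}


def build_training_args(kwargs):
    """Hypothetical helper: build TrainingArguments from a kwargs dict.

    transformers is imported lazily so the dicts above can be inspected
    without the library installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(**kwargs)


# Assumed usage with the Trainer from the first post:
# args = build_training_args(eval_every_epoch)
# trainer = Trainer(model, args, train_dataset=..., eval_dataset=...)
# trainer.train()     # evaluates on eval_dataset at the configured interval
# trainer.evaluate()  # evaluates on eval_dataset once, on demand
```

With the default settings (no eval_strategy passed), eval_dataset is only touched when trainer.evaluate() is called explicitly.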

moma1820 · September 14, 2021, 8:16am

Okay, perfect, then I will not put my test set there. Thanks, @sgugger!

I do want to get the F1 score of my model on the test set, though. Do you know if there is a metric API from Hugging Face I could use?

If so, could you please link me to a small script?

sgugger · September 14, 2021, 12:12pm

You can look at any of the examples or the course section on the Trainer to see how to do this using a compute_metrics function.
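A minimal sketch of what such a compute_metrics function could look like for F1. It assumes a binary classification task where label 1 is the positive class and the model outputs one row of logits per example. It is written in plain Python for illustration, and the toy inputs are made up.

```python
def compute_metrics(eval_pred):
    """Binary F1 for the positive class (label 1), computed from logits and labels."""
    logits, labels = eval_pred
    # Predicted class = index of the largest logit in each row.
    preds = [max(range(len(row)), key=lambda i: row[i]) for row in logits]

    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y != 1)
    fn = sum(1 for p, y in zip(preds, labels) if p != 1 and y == 1)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"f1": f1}


# Toy check with hand-written logits (not real model output).
toy_logits = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
toy_labels = [1, 0, 0, 1]
toy_result = compute_metrics((toy_logits, toy_labels))  # {"f1": 0.5}
```

If a function like this is passed as compute_metrics when building the Trainer, calling trainer.predict(tokenized_datasets_test) (a hypothetical tokenized test set) returns an object whose metrics dictionary should include the score under the key test_f1, since predict prefixes metric names with test_ by default.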

moma1820 · September 14, 2021, 12:36pm

Huge thanks again!

Have a nice day
