Use Trainer API with two validation sets

Hi everyone,
right now the Trainer API accepts a single eval_dataset. I am wondering: is it somehow possible to provide two different validation sets that are both evaluated during training? For example, I might want to track my validation loss on two sets: one previously sampled from my training data, which therefore shares its distribution, and one sampled from a dataset with a presumably different distribution (e.g., stemming from a different time period). The idea comes from “Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks”.
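
To make the goal concrete: after training I can evaluate both sets manually along these lines (the dataset names are placeholders and I haven’t tested this), but I would like the equivalent pair of metrics to be logged at every evaluation step during training:

```python
# Manual evaluation after training; the goal is to get these two metric
# traces (eval_in_domain_*, eval_ood_*) logged periodically during training.
in_domain_metrics = trainer.evaluate(
    eval_dataset=in_domain_val_set,   # sampled from the training data
    metric_key_prefix="eval_in_domain",
)
ood_metrics = trainer.evaluate(
    eval_dataset=ood_val_set,         # from the presumably different distribution
    metric_key_prefix="eval_ood",
)
```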

Thanks in advance :slight_smile:
Simon

@patrickvonplaten this is a common use case in research (including in mine). I don’t see any examples in the docs, but the docs (Callbacks) suggest there should be a way to write a callback to achieve this. Looking at the code very briefly, I imagine the callback would simply call evaluate again for each additional eval dataset at the end of the evaluation loop, optionally changing metric_key_prefix so that the logger displays the metrics measured on the individual datasets as separate traces.
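
Roughly what I have in mind is the sketch below (untested; the class name, the constructor arguments, and the way the callback gets hold of the trainer are all guesses on my part):

```python
from transformers import TrainerCallback

class ExtraEvalCallback(TrainerCallback):
    """After each regular evaluation, also evaluate on a second validation set."""

    def __init__(self, extra_dataset, metric_key_prefix="eval_ood"):
        self.extra_dataset = extra_dataset
        self.metric_key_prefix = metric_key_prefix
        self.trainer = None    # would need to be set by hand after the Trainer is built
        self._running = False  # evaluate() fires on_evaluate again, so guard against recursion

    def on_evaluate(self, args, state, control, **kwargs):
        if self.trainer is None or self._running:
            return
        self._running = True
        try:
            # Logs e.g. eval_ood_loss next to the usual eval_loss
            self.trainer.evaluate(
                eval_dataset=self.extra_dataset,
                metric_key_prefix=self.metric_key_prefix,
            )
        finally:
            self._running = False

# Hypothetical wiring:
# callback = ExtraEvalCallback(ood_val_set)
# trainer = Trainer(..., eval_dataset=in_domain_val_set, callbacks=[callback])
# callback.trainer = trainer
```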

I’d have to look at the callback API in more detail to say anything more concrete, but that’s my current thinking.

UPDATE: I had a look, but I can’t see how the other datasets could be passed to the callback, so I assume this is not possible. To keep the discussion productive, I raised a feature request summarising the issue and a potential workaround here.
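
For anyone landing here later, one workaround in that spirit (not necessarily the one from the feature request, and untested) is to subclass Trainer and have evaluate() loop over a dict of validation sets, so the periodic evaluation during training covers all of them:

```python
from transformers import Trainer

class MultiEvalTrainer(Trainer):
    """Trainer whose evaluate() runs over a dict of validation sets.

    Pass eval_datasets={"in_domain": ds_a, "ood": ds_b}; each set is logged
    under its own prefix (eval_in_domain_loss, eval_ood_loss, ...).
    """

    def __init__(self, *args, eval_datasets=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.eval_datasets = eval_datasets or {}

    def evaluate(self, eval_dataset=None, ignore_keys=None, metric_key_prefix="eval"):
        # Explicit calls (or no extra datasets) keep the default behaviour.
        if eval_dataset is not None or not self.eval_datasets:
            return super().evaluate(eval_dataset, ignore_keys, metric_key_prefix)
        metrics = {}
        for name, dataset in self.eval_datasets.items():
            metrics.update(
                super().evaluate(
                    eval_dataset=dataset,
                    ignore_keys=ignore_keys,
                    metric_key_prefix=f"{metric_key_prefix}_{name}",
                )
            )
        return metrics
```

Note that the metric names then change (e.g. eval_in_domain_loss instead of eval_loss), so settings such as metric_for_best_model would need to point at one of the prefixed keys.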