Evaluating your model on more than one dataset


The Trainer and TrainingArguments classes in Transformers allow only one dataset to be used for evaluation. Is there a simple way to add another one, so that after each epoch of training I could evaluate my model on both the training and development datasets and get metrics for both in one output? I know I could alter training_args.py or trainer.py, but I am pretty sure I would only mess things up…

I think the easiest way to do this is to use the new TrainerCallback system and write a callback that runs an additional evaluation on your other datasets during the on_evaluate event.

Is it possible to provide an example?
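
Here is a minimal sketch of what such a callback might look like; treat it as a starting point rather than a tested recipe. The class name ExtraEvalCallback and its extra_datasets argument are made up for illustration. It assumes the callback holds a reference to the trainer and that trainer.evaluate accepts eval_dataset and metric_key_prefix. Because trainer.evaluate fires on_evaluate again, the sketch uses a flag to avoid recursing into itself.

```python
try:
    from transformers import TrainerCallback
except ImportError:
    # Fallback only so the sketch can be read and smoke-tested without the library.
    TrainerCallback = object


class ExtraEvalCallback(TrainerCallback):
    """Hypothetical callback: after each regular evaluation, evaluate extra datasets too."""

    def __init__(self, trainer, extra_datasets):
        self.trainer = trainer
        # Mapping of a short name to a dataset, e.g. {"train": train_dataset}.
        self.extra_datasets = extra_datasets
        # trainer.evaluate() triggers on_evaluate again, so guard against re-entry.
        self._running = False

    def on_evaluate(self, args, state, control, **kwargs):
        if self._running:
            return
        self._running = True
        try:
            for name, dataset in self.extra_datasets.items():
                # Metrics are reported under keys starting with e.g. "eval_train_".
                self.trainer.evaluate(
                    eval_dataset=dataset,
                    metric_key_prefix=f"eval_{name}",
                )
        finally:
            self._running = False
```

To use it, create the Trainer as usual with your development set as eval_dataset, then register the callback with trainer.add_callback(ExtraEvalCallback(trainer, {"train": train_dataset})). Each scheduled evaluation should then log metrics for both datasets, distinguished by their prefixes. Note that more recent releases of Transformers also accept a dict of datasets as eval_dataset, which may make a custom callback unnecessary.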