How to use Trainer when eval dataset has multiple references

adenhaus · September 22, 2023, 9:12pm

Hi

I want to fine-tune mT5 on the TaTA dataset. Most examples in the dataset have multiple references, so the target column is a list of possible references. For the training set, I can just explode these into their own rows and treat them as separate examples. However, the Trainer also expects an eval_dataset, and it can't handle examples with multiple references. Is there a way I can do this?

Cheers
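
To make the setup concrete, here is a minimal sketch of what I mean by exploding the training set, plus one possible way to score a prediction against a list of references at eval time. The column names (`input`, `references`, `target`) and the helper names are hypothetical, not the actual TaTA schema or any Trainer API, and exact match is only a stand-in for a real metric.

```python
from typing import Callable, Dict, List


def explode_references(
    rows: List[Dict],
    source_key: str = "input",        # hypothetical column name
    refs_key: str = "references",     # hypothetical column name
    target_key: str = "target",       # hypothetical column name
) -> List[Dict]:
    """Turn each (input, [ref1, ref2, ...]) row into one row per reference."""
    exploded = []
    for row in rows:
        for ref in row[refs_key]:
            exploded.append({source_key: row[source_key], target_key: ref})
    return exploded


def exact_match(prediction: str, reference: str) -> float:
    """Stand-in metric: 1.0 if the strings match after stripping, else 0.0."""
    return float(prediction.strip() == reference.strip())


def best_score_over_references(
    prediction: str,
    references: List[str],
    score_fn: Callable[[str, str], float],
) -> float:
    """Score a prediction against every reference and keep the best score."""
    return max(score_fn(prediction, ref) for ref in references)


# Toy data standing in for the real dataset.
train_rows = [
    {"input": "table A", "references": ["ref one", "ref two"]},
    {"input": "table B", "references": ["ref three"]},
]

# Training: one row per reference.
train_exploded = explode_references(train_rows)

# Evaluation: keep the references grouped and score against all of them.
score = best_score_over_references("ref two", ["ref one", "ref two"], exact_match)
```

The idea would be to leave the eval set unexploded and do the multi-reference comparison outside the model's forward pass, but I don't know how to fit that into the Trainer.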
