Hi all, I’m training a summarization model with the `Seq2SeqTrainer` API. Once the model is trained, I would like to generate summaries with specific values for `top_k`, `top_p`, `temperature`, etc.
The `evaluate()` method only allows certain parameters to be passed on to the `generate()` method, namely `num_beams` — see here: transformers/trainer_seq2seq.py at master · huggingface/transformers · GitHub
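For context, here is roughly what that restriction looks like in practice. This is a sketch, not runnable on its own — it assumes you already have a configured `Seq2SeqTrainer` instance (`trainer`) with an eval dataset, and that `predict_with_generate=True` was set in the training arguments:

```python
# Sketch: Seq2SeqTrainer.evaluate() only forwards beam-search-style
# arguments (num_beams, max_length) on to generate(); sampling
# parameters such as top_k, top_p, or temperature are not accepted here.
metrics = trainer.evaluate(
    num_beams=4,      # forwarded to generate()
    max_length=128,   # forwarded to generate()
)
# trainer.evaluate(top_k=50) would raise a TypeError instead.
```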
I’m wondering whether this leads to a model that doesn’t perform well at the target task. I imagine the model will be optimised for certain values of `top_k`, `top_p`, and `temperature`, which won’t be the same as the ones I would like to use.
(1) Is this a valid concern, or am I overthinking it?
(2) If it is a valid concern: how would I go about changing the values of those parameters? Would I have to change the configuration of the model before training, since the `generate()` function reads the values for these parameters from there (see transformers/generation_utils.py at c4d4e8bdbd25d9463d41de6398940329c89b7fb6 · huggingface/transformers · GitHub)?
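To make question (2) concrete, here is one way I imagine it could work (this is my assumption from reading `generation_utils.py`, not something I’ve confirmed): since `generate()` falls back to `model.config` for any argument not passed explicitly, the defaults could be set on the config, or simply overridden per call after training. A sketch, assuming `model` and `tokenizer` are an already-trained seq2seq model and its tokenizer:

```python
# Option A: set defaults on the model config, which generate() reads
# when no explicit argument is given. Sampling must be enabled for
# top_k / top_p / temperature to have any effect.
model.config.do_sample = True
model.config.top_k = 50
model.config.top_p = 0.95
model.config.temperature = 0.8

# Option B: override per call -- explicit arguments to generate()
# take precedence over the config values.
inputs = tokenizer("long article text ...", return_tensors="pt")
summary_ids = model.generate(
    **inputs,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=0.8,
    max_length=128,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

If Option B works, that would suggest the values baked into the config at training time don’t lock anything in for later inference — which is really the heart of question (1).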
Any insights appreciated