Hi all, I’m training a summarization model with the `Seq2SeqTrainer` API. Once the model is trained, I would like to generate summaries with specific values for `top_k`, `top_p`, `temperature`, etc.
The `evaluate()` method only allows certain parameters to be passed on to the `generate()` method, namely `num_beams` — see here: transformers/trainer_seq2seq.py at master · huggingface/transformers · GitHub
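For context, here is roughly what that restriction looks like in practice. This is a sketch, not runnable on its own — it assumes you already have a configured `Seq2SeqTrainer` instance (`trainer`) with an eval dataset, and that `predict_with_generate=True` was set in the training arguments:

```python
# Sketch: Seq2SeqTrainer.evaluate() only forwards beam-search-style
# arguments (num_beams, max_length) on to generate(); sampling
# parameters such as top_k, top_p, or temperature are not accepted here.
metrics = trainer.evaluate(
    num_beams=4,      # forwarded to generate()
    max_length=128,   # forwarded to generate()
)
# trainer.evaluate(top_k=50) would raise a TypeError instead.
```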
I’m wondering whether this leads to a model that doesn’t perform well at the target task. I imagine the model will be optimised for certain values of `top_k`, `top_p`, and `temperature`, which won’t be the same as the ones I would like to use.
(1) Is this a valid concern, or am I overthinking it?
(2) If it is a valid concern: how would I go about changing the values of those parameters? Would I have to change the configuration of the model before training, since the `generate()` function reads the values for these parameters from there (see transformers/generation_utils.py at c4d4e8bdbd25d9463d41de6398940329c89b7fb6 · huggingface/transformers · GitHub)?
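To make question (2) concrete, here is one way I imagine it could work (this is my assumption from reading `generation_utils.py`, not something I’ve confirmed): since `generate()` falls back to `model.config` for any argument not passed explicitly, the defaults could be set on the config, or simply overridden per call after training. A sketch, assuming `model` and `tokenizer` are an already-trained seq2seq model and its tokenizer:

```python
# Option A: set defaults on the model config, which generate() reads
# when no explicit argument is given. Sampling must be enabled for
# top_k / top_p / temperature to have any effect.
model.config.do_sample = True
model.config.top_k = 50
model.config.top_p = 0.95
model.config.temperature = 0.8

# Option B: override per call -- explicit arguments to generate()
# take precedence over the config values.
inputs = tokenizer("long article text ...", return_tensors="pt")
summary_ids = model.generate(
    **inputs,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=0.8,
    max_length=128,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

If Option B works, that would suggest the values baked into the config at training time don’t lock anything in for later inference — which is really the heart of question (1).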
Any insights appreciated