Trainer.evaluate() with text generation

cgawron · December 31, 2021, 9:42am

Is there any update regarding this topic?
I would like to train a VisionEncoderDecoderModel for image captioning and measure the BLEU score during evaluation. The EvalPrediction object I get in compute_metrics contains only the logits, not the generated tokens or texts (i.e. the result of a beam search). I would assume that computing metrics on the output of generate is not uncommon.
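
To make the request concrete, here is a minimal sketch of the kind of compute_metrics I have in mind. It assumes that `predictions` already holds generated token IDs rather than logits, which is exactly the part I do not get today. The pad ID of 0, the helper names, and the simplified BLEU over token IDs (unigrams and bigrams, one reference per sample) are my own assumptions for illustration; a real setup would decode with the tokenizer and use a proper BLEU implementation.

```python
from collections import Counter
from math import exp, log
from types import SimpleNamespace

IGNORE_ID = -100  # label positions conventionally ignored by the loss
PAD_ID = 0        # hypothetical pad token ID; depends on the tokenizer


def ngrams(tokens, n):
    """Count all n-grams of length n in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def strip_special(ids):
    """Drop padding and ignored positions before scoring."""
    return [t for t in ids if t not in (IGNORE_ID, PAD_ID)]


def simple_bleu(hypotheses, references, max_n=2):
    """Simplified corpus-level BLEU with a single reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            h, r = ngrams(hyp, n), ngrams(ref, n)
            # Clipped counts: an n-gram is credited at most as often
            # as it occurs in the reference.
            matches[n - 1] += sum(min(c, r[g]) for g, c in h.items())
            totals[n - 1] += sum(h.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len >= ref_len else exp(1 - ref_len / hyp_len)
    return brevity_penalty * exp(log_precision)


def compute_metrics(eval_pred):
    # Assumption: eval_pred.predictions are generated token IDs, not logits.
    hyps = [strip_special(list(p)) for p in eval_pred.predictions]
    refs = [strip_special(list(l)) for l in eval_pred.label_ids]
    return {"bleu": simple_bleu(hyps, refs)}


# Stand-in for an EvalPrediction, just to exercise the function.
example = SimpleNamespace(
    predictions=[[5, 6, 7, 8, 0, 0]],
    label_ids=[[5, 6, 7, 8, -100, -100]],
)
print(compute_metrics(example))
```

With logits in `predictions`, the best I can do is an argmax per position, which reflects teacher-forced decoding and not what beam search would actually produce.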

The PR mentioned in this thread seems to be stale, and there have been quite a few changes to Trainer since it was proposed.
