I'm fine-tuning seq2seq models and have a question about validating the generated output. As far as I understand, when using beam search for text generation, it is possible to get a sequences_scores value from the generate() function, as described here. Is it right that this can be treated as the model's confidence in the generated output, and that taking sequences_scores would give me an interpretable value from which I can draw conclusions about the quality of the results?
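
To make the question concrete, here is a minimal sketch of how I understand such a beam score to be formed. It assumes the score is the sum of the token log-probabilities divided by the sequence length raised to length_penalty; the exact length used may differ between library versions. The function name beam_score and the token probabilities are made up for illustration and are not part of any library.

```python
import math


def beam_score(token_logprobs, length_penalty=1.0):
    """Length-normalized sum of token log-probabilities (assumed formula)."""
    return sum(token_logprobs) / (len(token_logprobs) ** length_penalty)


# Hypothetical per-token probabilities for one generated sequence.
token_probs = [0.9, 0.8, 0.95, 0.7]
token_logprobs = [math.log(p) for p in token_probs]

# The score lives in log space, so it is zero or negative.
score = beam_score(token_logprobs)

# With length_penalty=1.0, exp(score) is the geometric mean of the
# token probabilities, i.e. a value between 0 and 1.
pseudo_confidence = math.exp(score)

print(f"score={score:.4f}, exp(score)={pseudo_confidence:.4f}")
```

If this reading is correct, exp(score) is an average per-token probability rather than the probability of the whole sequence, so it may be more useful for comparing outputs with each other than as a calibrated confidence.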