How to convert model output logits into string sentences during training to check what the model is outputting?

It seems to me that these are the outputs of the base model. To get token predictions, you need the output of the LM head (often a linear projection of hidden_dim → vocab_size). If you have logits of shape (bs, seqlen, vocab_size), you can simply take the argmax over that last dimension (softmax is monotonic, so it is not needed for top-1), then decode the resulting token ids with the tokenizer. This is not as much manual work as you might expect, but you do need the outputs of the LM head.
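A minimal sketch of that decode step, using a toy vocabulary and random logits as stand-ins (with a real model you would take the LM head's logits and call `tokenizer.batch_decode` instead):

```python
import torch

vocab = ["<pad>", "hello", "world", "foo", "bar"]  # toy vocabulary
bs, seqlen, vocab_size = 2, 4, len(vocab)

torch.manual_seed(0)
logits = torch.randn(bs, seqlen, vocab_size)  # stand-in for LM head output

# Softmax is monotonic, so argmax over raw logits already gives top-1.
pred_ids = logits.argmax(dim=-1)  # shape: (bs, seqlen)

# Map each sequence of token ids back to a string.
decoded = [" ".join(vocab[i] for i in seq) for seq in pred_ids.tolist()]
print(decoded)
```

With a Hugging Face model, `pred_ids = outputs.logits.argmax(dim=-1)` followed by `tokenizer.batch_decode(pred_ids)` does the same thing.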

As you said, you’d typically generate with `model.generate()`, so I am not sure I fully understand your use case.