@BramVanroy thank you for getting back to me!!
The comment `# shape: (batch size, sequence length, hidden dimension e.g. 768)` is incorrect. Sorry about that, that's my bad. One of the fields, the hidden states, has that shape, but the logits have shape `(batch size, sequence length, vocab size e.g. 50000)`.
You're right that I should've been clearer about my use case. What I want is, during training, to log pairs of (the sentence that goes into the model, the sentence that the model generates). Since the model outputs logits over the vocabulary, there isn't immediately a notion of "sentences that the model generates."
Two approaches might be to take the logits, convert them into a probability distribution, and then (1) take the argmax or (2) sample from that distribution. Maybe others have thought of much cleverer approaches; my question is really about what HuggingFace recommends.
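For concreteness, here is a rough sketch of those two options in PyTorch. The `vocab` list and `decode` helper are toy stand-ins I made up for illustration; in practice the logits would come from the model's output and decoding would go through the tokenizer's `batch_decode`:

```python
import torch

# Hypothetical toy vocabulary; a real run would use the model's tokenizer.
vocab = ["<pad>", "hello", "world", "the", "cat", "sat"]
batch_size, seq_len, vocab_size = 2, 4, len(vocab)

# Stand-in for outputs.logits, shape: (batch size, sequence length, vocab size)
logits = torch.randn(batch_size, seq_len, vocab_size)

# (1) Greedy: argmax over the vocabulary dimension.
greedy_ids = logits.argmax(dim=-1)  # shape: (batch size, sequence length)

# (2) Sampling: softmax to a distribution, then draw one token per position.
probs = torch.softmax(logits, dim=-1)
sampled_ids = torch.multinomial(probs.view(-1, vocab_size), num_samples=1)
sampled_ids = sampled_ids.view(batch_size, seq_len)

def decode(ids: torch.Tensor) -> list[str]:
    # Stand-in for tokenizer.batch_decode(ids, skip_special_tokens=True).
    return [" ".join(vocab[t] for t in row) for row in ids.tolist()]

greedy_sentences = decode(greedy_ids)
sampled_sentences = decode(sampled_ids)
```

Note this decodes each position independently from a single forward pass, which is not the same as autoregressive generation, so the logged "generated sentences" during teacher-forced training should be read with that caveat.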
Perhaps you could also confirm that the `.generate()` method isn't relevant for converting the model's output logits into string sentences during training?