Hello! I’m using T5-base for my model, and it seems to generate something reasonable when I call model.generate. But my question is: how?
The decoder part of this model needs a starting token to start decoding, doesn’t it? How does it figure out what the very first token is supposed to be?
Or am I doing the training wrong, and should I have included a start token?