- This is a great question, and the topic is super confusing.
- You can get a good snapshot of my understanding here. I don’t think that will clear everything up, but if you could read it and let me know what you still don’t understand, that would be helpful.
- If I encounter empirical evidence that I should change `decoder_start_token_id` for any model, I will do so.