brando
Yes, this is what I was going to do, because I'm fine-tuning on code, where syntax matters.
But I still need the code; I haven't had time to write it yet. When I do, I will share it here. To clarify, this is what I plan to do:
In the collate function, for every sequence in the batch, switch the final mask to 1 at the position of the first EOS token.
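
A minimal sketch of that plan, not a tested implementation. It assumes PyTorch, that the pad token is the same as the EOS token, and that the mask was built as `input_ids != pad_token_id`, so every EOS (including the real one) starts out masked. The name `unmask_first_eos` is hypothetical and not part of any library.

```python
import torch

def unmask_first_eos(input_ids, attention_mask, eos_token_id):
    """Return a copy of attention_mask with the first EOS in each row set to 1.

    Hypothetical helper: meant to be called inside a collate function after
    the batch has been padded. Rows without any EOS are left unchanged.
    """
    is_eos = input_ids == eos_token_id                 # (batch, seq_len) bool
    # The running count of EOS tokens equals 1 only from the first EOS up to
    # (not including) the second, so AND-ing with is_eos keeps just the first.
    first_eos = is_eos & (is_eos.cumsum(dim=1) == 1)
    fixed = attention_mask.clone()                     # do not modify the input
    fixed[first_eos] = 1
    return fixed

# Toy batch where the pad id equals the EOS id (2 is an assumed id).
eos_id = 2
input_ids = torch.tensor([[5, 6, 2, 2, 2],
                          [7, 8, 9, 2, 2]])
attention_mask = (input_ids != eos_id).long()          # masks every EOS
fixed_mask = unmask_first_eos(input_ids, attention_mask, eos_id)
print(fixed_mask.tolist())  # [[1, 1, 1, 0, 0], [1, 1, 1, 1, 0]]
```

If the labels are also masked wherever the pad token appears (for example set to -100), the same boolean `first_eos` could be used to restore the label at that position, so the model is still trained to emit EOS.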