I tried to fine-tune GPT-2,
but after a few iterations I started getting the same sequence every time, for example:
prompt: Prime number is
model output: <|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|>
By the way, my text is split into sentences without [EOS] tokens; in other words, every item is a separate sentence.
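For clarity, this is the kind of preprocessing I mean — a minimal sketch (assuming GPT-2's standard `<|endoftext|>` token string; the helper name is made up) that appends the EOS token to each sentence before tokenization:

```python
# Sketch: append GPT-2's end-of-text marker to every training sentence
# so the model sees explicit sequence boundaries during fine-tuning.
# "<|endoftext|>" is assumed to be the EOS string, as in the
# Hugging Face GPT-2 tokenizer.
EOS = "<|endoftext|>"

def add_eos(sentences):
    """Append EOS to each sentence that does not already end with it."""
    return [s if s.endswith(EOS) else s + EOS for s in sentences]

dataset = [
    "Prime number is a natural number greater than 1.",
    "It has no positive divisors other than 1 and itself.",
]
print(add_eos(dataset))
```

Right now my items have no such marker at all, which is why I suspect the model may be collapsing onto `<|endoftext|>`.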
Update:
It's reproducible even without fine-tuning.
On the top screenshot the output is correct;
on the bottom it's the EOS sequence,
but the code is generally the same.