I have trained a GPT-2 model on my dataset. When I generate text from it, sometimes it produces output like “�����������������������”, going on and on. Other times it generates normal text at first, but then switches to emitting only “�” characters. Does anybody have a clue why it is doing this?
The problem was that when training on TPUs, the weights of the input embedding layer and the output embedding layer don't get tied automatically. The HuggingFace Trainer API handles this for you, but in a custom training script you have to tie them manually.
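A minimal sketch of the manual fix, assuming the `transformers` library and a GPT-2 model built from a config (the small config values here are arbitrary, just to keep the example lightweight):

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Arbitrary small config for illustration; use your own model/config.
config = GPT2Config(n_layer=2, n_head=2, n_embd=64, vocab_size=100)
model = GPT2LMHeadModel(config)

# In a custom training loop, call tie_weights() yourself so the output
# projection (lm_head) shares the input embedding matrix (transformer.wte).
# The HuggingFace Trainer does this automatically; a hand-written TPU
# training script may not.
model.tie_weights()

# Both layers should now point at the same underlying tensor storage.
assert model.lm_head.weight.data_ptr() == model.transformer.wte.weight.data_ptr()
```

If the weights are left untied, the randomly initialized output projection maps hidden states to essentially arbitrary token ids, which is consistent with the garbage “�” output described above.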