I’ve been working on research related to the Commonsense Explanations Dataset (cos-e), whose code uses transformers at commit e14c6b52e37876ee642ffde49367c51b0d374f41. I decided to update the Salesforce code to the latest version of transformers and ran into an issue: the perplexity of the text generated by GPT was massive.
Long story short, I realized that the older version of the library didn’t implicitly shift the language modeling labels (for the causal language modeling that GPT uses), while the newer version does. As a result, after migrating to the new transformers library, my labels were all shifted over by one position. It would be helpful if this change were added to the migration documentation.
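
To illustrate what I mean, here is a minimal sketch in plain Python (no transformers import). The helper `internal_shift_pairs` is a hypothetical stand-in for the shift that I understand the newer LM head models apply internally when computing the loss, and the token ids are made up. It only shows which label each position ends up being scored against, not the actual loss computation.

```python
def internal_shift_pairs(input_ids, labels):
    """Hypothetical stand-in for the implicit shift: the prediction made at
    position i is scored against labels[i + 1]. Returns (input, target) pairs."""
    return list(zip(input_ids[:-1], labels[1:]))


tokens = [10, 11, 12, 13, 14]  # made-up token ids for illustration

# New-style usage: pass labels identical to input_ids and let the model shift.
correct = internal_shift_pairs(tokens, tokens)

# Old-style usage: shift by hand. If this is kept after migrating,
# the labels get shifted a second time inside the model.
manual_inputs = tokens[:-1]
manual_labels = tokens[1:]
double_shifted = internal_shift_pairs(manual_inputs, manual_labels)

print(correct)         # each token paired with the token that follows it
print(double_shifted)  # each token paired with the token two steps ahead
```

With the second setup every target is one position too far ahead, which would explain the very high perplexity. Assuming the implicit shift works as sketched, the fix on my side was to stop shifting manually and pass the labels unshifted.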