How to calculate perplexity properly

Hey guys, I’m trying to evaluate my model through its perplexity on my test set, and I started reading this guide: Perplexity of fixed-length models — transformers 4.11.3 documentation

However, I don’t understand why joining the texts like this would not damage my model’s predictions:

from datasets import load_dataset
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained('gpt2')  # tokenizer from the guide
test = load_dataset('wikitext', 'wikitext-2-raw-v1', split='test')
encodings = tokenizer('\n\n'.join(test['text']), return_tensors='pt')

How can my model predict properly if my contexts are mixed up?
I’m dealing with short sentences, though.

It allows the model to generalize across sentence or document boundaries, which is typically what you want in generative models. This is not a requirement, by the way, but combined with a strided window it is quite powerful.
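To make the strided-window idea concrete, here is a minimal sketch (pure Python, no model or tokenizer) of how a long concatenated token sequence gets split into overlapping windows. The function name and parameters are illustrative, not from the guide; the guide does the same bookkeeping with tensors before feeding each window to the model.

```python
def strided_windows(n_tokens, max_length=8, stride=4):
    """Yield (begin, end, n_scored) spans over a sequence of n_tokens tokens.

    Each window covers max_length tokens and advances by `stride`.
    Only the last `n_scored` tokens of a window are counted toward the
    loss, so every scored token keeps at least (max_length - stride)
    tokens of left context, even across joined-document boundaries.
    """
    windows = []
    prev_end = 0
    for begin in range(0, n_tokens, stride):
        end = min(begin + max_length, n_tokens)
        n_scored = end - prev_end  # only newly covered tokens are scored
        windows.append((begin, end, n_scored))
        prev_end = end
        if end == n_tokens:
            break
    return windows

print(strided_windows(20))
# → [(0, 8, 8), (4, 12, 4), (8, 16, 4), (12, 20, 4)]
```

Note that the scored-token counts sum to the full sequence length, so each token contributes to the perplexity exactly once while still being predicted with plenty of context.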


Hey Bram! Thanks for your reply. I got it.