Hey guys, I’m trying to evaluate my model through its perplexity on my test set, and started reading this guide: Perplexity of fixed-length models — transformers 4.11.3 documentation
However, I don’t understand why joining our texts like this would not damage my model’s predictions:
from datasets import load_dataset
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained('gpt2')  # the tokenizer from the guide
test = load_dataset('wikitext', 'wikitext-2-raw-v1', split='test')
encodings = tokenizer('\n\n'.join(test['text']), return_tensors='pt')
How can my model predict properly if my contexts are mixed together like this?
I’m dealing with short sentences, though.