Hey guys, I’m trying to evaluate my model through its perplexity on my test set, and started reading this guide: Perplexity of fixed-length models — transformers 4.11.3 documentation
However, I don’t understand why joining our texts like this would not damage my model’s predictions:
from datasets import load_dataset
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained('gpt2')  # the tokenizer from the guide
test = load_dataset('wikitext', 'wikitext-2-raw-v1', split='test')
encodings = tokenizer('\n\n'.join(test['text']), return_tensors='pt')
How can my model predict properly if my contexts are mixed together like this?
I’m dealing with short sentences, though.