Perplexity Calculation in run_clm.py

I am trying to evaluate the perplexity of a model on WikiText-2.

The three code sources I am using are:

  1. yxli2123/LoftQ
  2. horseee/LLM-Pruner ([NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models; supports LLaMA, Llama-2, BLOOM, Vicuna, Baichuan, etc.)
  3. locuslab/wanda (a simple and effective LLM pruning approach)

Sources 2 and 3 agree with each other, but source 1, which is based on the run_clm.py script from transformers (examples/pytorch/language-modeling/run_clm.py), gives a significantly different result.

For example, when evaluating Llama-2 13B I get the following respective perplexities (using seq_length 1024):

  1. 12.02
  2. 5.43
  3. 5.43

Does anyone know why the value obtained from source 1 is so different from the other two?
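
For reference, this is roughly how I understand sources 2 and 3 to compute perplexity: the WikiText-2 test split is concatenated, tokenized once, split into non-overlapping windows of seq_length tokens, and perplexity is the exponential of the token-averaged negative log-likelihood. The sketch below is simplified and not the repos' exact code; the model id and seq_len are just the values I used.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-13b-hf"  # model I evaluated
seq_len = 1024

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

# Concatenate the whole test split and tokenize it once.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
enc = tokenizer("\n\n".join(test["text"]), return_tensors="pt")
input_ids = enc.input_ids

# Non-overlapping windows of seq_len tokens; remainder is dropped.
n_chunks = input_ids.size(1) // seq_len
nlls = []
with torch.no_grad():
    for i in range(n_chunks):
        chunk = input_ids[:, i * seq_len : (i + 1) * seq_len].to(model.device)
        # The model shifts labels internally; .loss is the mean NLL per token.
        loss = model(chunk, labels=chunk).loss
        # Scale back to a per-window total NLL (the repos use this seq_len scaling).
        nlls.append(loss.float() * seq_len)

ppl = torch.exp(torch.stack(nlls).sum() / (n_chunks * seq_len))
print(f"perplexity: {ppl.item():.2f}")
```

This is what produces the ~5.43 numbers for me, whereas the run_clm.py-based evaluation in source 1 reports exp(eval_loss) from its own tokenization and block-grouping pipeline.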