Evaluate Model on Test dataset (PPL)

Hi guys,

I am kinda new to Hugging Face and have a question regarding perplexity (PPL).
I fine-tuned a model, and at the end of training I get the PPL
for the dev dataset by doing:

import math

eval_results = trainer.evaluate()
print(f"Perplexity: {math.exp(eval_results['eval_loss']):.2f}")

However, my DatasetDict has train, dev, and test splits, and I also want to get the PPL
for the test set. Is there any simple way to get this done?
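For context, the perplexity here is just the exponential of the mean eval loss, so a minimal helper (split names are mine) would look like:

```python
import math

# Perplexity is exp of the mean cross-entropy loss that
# trainer.evaluate() reports under the "eval_loss" key.
def perplexity(eval_loss: float) -> float:
    return math.exp(eval_loss)

print(f"{perplexity(0.0):.2f}")  # loss 0.0 -> PPL 1.00
```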

Best regards

Hey Chris, I’m a beginner as well, but I think you could try using EvalPrediction. You pass your test set and its labels, and then you should get your loss on that specific set.
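Roughly what I mean (untested sketch; the stand-in class below just mirrors `transformers.EvalPrediction`, and the accuracy metric is only an example):

```python
from typing import Any, NamedTuple

# Stand-in mirroring transformers.EvalPrediction: the Trainer calls your
# compute_metrics function with an object holding the model's predictions
# (logits) and the label ids after running the prediction loop.
class EvalPredictionSketch(NamedTuple):
    predictions: Any
    label_ids: Any

def compute_metrics(p: EvalPredictionSketch) -> dict:
    # Example metric: accuracy over the argmax of the logits.
    preds = [max(range(len(row)), key=row.__getitem__) for row in p.predictions]
    correct = sum(int(a == b) for a, b in zip(preds, p.label_ids))
    return {"accuracy": correct / len(p.label_ids)}

print(compute_metrics(EvalPredictionSketch([[0.1, 0.9], [0.8, 0.2]], [1, 0])))
# -> {'accuracy': 1.0}
```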

Hi @Felipehonorato ,

thanks for your response. I am not sure I understood it correctly.
The doc you sent says it is used to compute the metrics.
So if I get it right, it just returns an object of the class EvalPrediction.

However, what I would like to have is the loss itself.
I found I could use trainer.predict(test_dataset), but it always gives me an error:

RuntimeError: CUDA out of memory. Tried to allocate

So I don’t really get why trainer.evaluate() works but predict does not. Any ideas?

It would also be useful if anyone could send me a link where the difference between
trainer.predict(), trainer.prediction_loop(), and trainer.prediction_step() is explained in detail.

Did you try decreasing the batch size?
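Something like this (sketch only, I haven’t run it on your setup; parameter names are from TrainingArguments):

```python
# Sketch, assuming the usual TrainingArguments / Trainer setup:
#
#   from transformers import TrainingArguments
#
#   args = TrainingArguments(
#       output_dir="out",
#       per_device_eval_batch_size=2,   # default is 8; smaller uses less GPU memory
#       eval_accumulation_steps=10,     # move predictions to CPU every 10 steps
#   )
#
# Also, trainer.evaluate() accepts an eval_dataset argument, so you can get
# the test PPL the same way as the dev PPL, without predict():
#
#   import math
#   test_results = trainer.evaluate(eval_dataset=datasets["test"])
#   print(f"Test PPL: {math.exp(test_results['eval_loss']):.2f}")
#
# As for why predict() runs out of memory where evaluate() doesn't:
# predict() gathers and returns all the logits, while evaluate() runs
# loss-only when no compute_metrics is set, so it never accumulates them.
```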