Evaluate Model on Test dataset (PPL)

Hi guys,

I am kind of new to Hugging Face and have a question regarding perplexity (PPL).
I fine-tuned a model, and at the end of training I get the PPL
for the dev dataset by doing:

import math

eval_results = trainer.evaluate()
print(f"Perplexity: {math.exp(eval_results['eval_loss']):.2f}")

However, my DatasetDict has train, dev, and test splits, and I would also like to get the PPL
for the test set. Is there a simple way to do it?
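What I was hoping for is something like this (just a sketch of what I imagine should work, assuming my splits live in a DatasetDict called datasets):

import math

# evaluate() accepts an eval_dataset argument, so I assume the test
# split can be passed explicitly instead of the default dev set:
test_results = trainer.evaluate(eval_dataset=datasets["test"])
print(f"Test perplexity: {math.exp(test_results['eval_loss']):.2f}")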

Best regards
Chris

Hey Chris, I’m a beginner as well, but I think you could try using EvalPrediction. You pass your test set and its labels, and then you should get your loss for that specific set.
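Something like this, maybe (a rough sketch: the compute_metrics function you pass to the Trainer receives an EvalPrediction with .predictions and .label_ids; the metric below is just a placeholder, not what you'd use for PPL):

import numpy as np

def compute_metrics(eval_pred):
    # eval_pred is an EvalPrediction: logits and labels for the whole set
    logits, labels = eval_pred.predictions, eval_pred.label_ids
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": float((preds == labels).mean())}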

Hi @Felipehonorato ,

thanks for your response. I am not sure I understood it correctly.
The doc you sent says it’s used to compute the metrics.
So if I get it right, it just returns an object of the class EvalPrediction.

However, what I would like to have is the loss itself.
I found I could use trainer.predict(test_dataset), but it always gives me an error:

RuntimeError: CUDA out of memory. Tried to allocate

So I don’t really get why trainer.evaluate() works but predict() does not. Any ideas?

It would also be useful if anyone could send me a link where the difference between
trainer.predict(), trainer.prediction_loop(), and trainer.prediction_step() is explained in detail.

Did you try decreasing the batch size?
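For example (a sketch of a config fragment; per_device_eval_batch_size controls the evaluation batch size, and eval_accumulation_steps moves the accumulated predictions off the GPU every N steps, which can help because predict() keeps the logits for the whole set):

training_args = TrainingArguments(
    output_dir="out",
    per_device_eval_batch_size=4,  # smaller eval batches
    eval_accumulation_steps=1,     # offload predictions to CPU every step
)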