Trainer output much better than output from loaded model

I fine-tuned a model using Hugging Face Transformers, Datasets, and the Trainer. However, I'm now running into a curious issue: the output from the QA Trainer is excellent, but loading the trained model into a pipeline gives me terrible results.
During training I got a ROUGE-2 F1 of around 0.89. I checkpointed the model, loaded it from the best checkpoint, and saved the model files. But when I now run the loaded model with the same tokenizer in a pipeline, performance is terrible.

I used the QA Trainer script from HF's repo with only very minor modifications. Any thoughts?
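For reference, this is roughly how I'm reloading the checkpoint into a pipeline. The checkpoint path and the helper names here are placeholders, not my actual setup; the key point is that the tokenizer and model are both loaded from the same saved checkpoint directory, and that I normalize answers before comparing them against the Trainer's predictions so formatting differences don't get mistaken for a quality drop:

```python
# Sketch of reloading a saved QA checkpoint into a pipeline.
# The checkpoint path ("./checkpoints/best") is a hypothetical placeholder.

def load_qa_pipeline(checkpoint_dir: str):
    """Load the tokenizer and model from the same checkpoint directory,
    so the pipeline uses exactly the weights that were saved."""
    from transformers import (AutoModelForQuestionAnswering,
                              AutoTokenizer, pipeline)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint_dir)
    model = AutoModelForQuestionAnswering.from_pretrained(checkpoint_dir)
    return pipeline("question-answering", model=model, tokenizer=tokenizer)


def normalize_answer(text: str) -> str:
    """Lowercase and collapse whitespace before comparing a pipeline
    answer with a Trainer prediction."""
    return " ".join(text.lower().split())


if __name__ == "__main__":
    qa = load_qa_pipeline("./checkpoints/best")  # hypothetical path
    out = qa(
        question="Who wrote Hamlet?",
        context="Hamlet was written by William Shakespeare.",
    )
    print(normalize_answer(out["answer"]))
```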