Hello,
I have a trained multi-label text classifier that I want to use for inference, so I loaded it with `pipeline("text-classification", "./model")`. When calculating performance in the downstream application, I noticed that the metrics were noticeably worse than those I obtained during training (both calculated on the same held-out test set).
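For reference, this is roughly how I load and run it (variable names are simplified, and I pass `top_k=None` so I get a score for every label):

```python
from transformers import pipeline

# Load the fine-tuned multi-label classifier from the local checkpoint.
# top_k=None makes the pipeline return a score for every label,
# not just the highest-scoring one.
pipe = pipeline("text-classification", model="./model", top_k=None)

# test_texts is the held-out test set (placeholder name).
predictions = pipe(test_texts)
```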
Here is a histogram of the difference in outputs between the two models:
Only about a quarter of the measurements in the bar at 0 are exactly equal to 0.
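In case it matters, this is roughly how I computed those differences (assuming a sigmoid over the `trainer.predict` logits on the trainer side, and aligning the pipeline scores via `label2id`):

```python
import numpy as np
import torch

# Trainer-side probabilities: multi-label, so sigmoid over the raw logits.
logits = trainer.predict(test_dataset).predictions
trainer_probs = torch.sigmoid(torch.tensor(logits)).numpy()

# Pipeline-side probabilities, reordered to match the label order above
# (the pipeline sorts its output by score, not by label index).
label2id = pipe.model.config.label2id
pipe_probs = np.array(
    [
        [d["score"] for d in sorted(out, key=lambda d: label2id[d["label"]])]
        for out in pipe(test_texts)
    ]
)

# Per-label differences that went into the histogram.
diff = (trainer_probs - pipe_probs).flatten()
```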
I have been scratching my head over this for the whole day, and any help would be greatly appreciated.
Edit: I tested running the pipeline on the GPU and got the same results. I also checked the parameters of the models in both the trainer and the pipeline, and they are identical.
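For completeness, this is more or less how I compared the parameters (a sketch, using the same `trainer` and `pipe` objects as above):

```python
import torch

# Compare every weight tensor between the trainer's model and the pipeline's.
trainer_state = trainer.model.state_dict()
pipe_state = pipe.model.state_dict()

assert trainer_state.keys() == pipe_state.keys()
for name in trainer_state:
    # allclose rather than strict equality, in case of dtype/device round-trips
    if not torch.allclose(trainer_state[name].cpu().float(),
                          pipe_state[name].cpu().float()):
        print(f"mismatch in {name}")
```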