Model results differ after creating pipeline with same model

I'm using a pre-trained tokenizer and a fine-tuned model.
When I create a pipeline with the same model, my results differ each time.

from transformers import AutoTokenizer, TextClassificationPipeline

tokenizer = AutoTokenizer.from_pretrained("huggingface/CodeBERTa-small-v1")
# model: my fine-tuned model, loaded earlier
mymodelclf = TextClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=True)
print(mymodelclf(negative_samples))  # negative_samples: my list of input texts

The same code is run twice, but 2 different results are obtained. Is this expected?


Are you sure your model is in evaluation mode?


Oh, thank you. Apologies, I did not look at the examples carefully enough and did not set eval mode.
A quick search confirms that eval mode is necessary for reproducible results.
I'm getting the expected results now. Thank you very much for the quick help.
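For anyone else hitting this: here is a minimal PyTorch sketch (a toy model, not the fine-tuned model from this thread) showing why eval mode matters. A model left in training mode keeps dropout active, so two forward passes on the same input can disagree; calling `model.eval()` disables dropout (and switches batch norm to inference statistics), making the forward pass deterministic.

```python
import torch
import torch.nn as nn

# Toy classifier with a dropout layer -- dropout is the usual source
# of run-to-run variation when a model is left in training mode.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5), nn.Linear(4, 2))
x = torch.ones(1, 4)

model.train()  # training mode: dropout randomly zeroes activations
train_out1 = model(x)
train_out2 = model(x)
# train_out1 and train_out2 will usually differ here

model.eval()  # inference mode: dropout is a no-op
eval_out1 = model(x)
eval_out2 = model(x)
print(torch.equal(eval_out1, eval_out2))  # True: deterministic forward pass
```

Hugging Face models loaded via `from_pretrained` are put in eval mode for you, but a model you fine-tuned yourself may still be in training mode, so call `model.eval()` before running the pipeline.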



I ran into the same issue using deepset/gelectra-base-germanquad.
However, I did not quite understand what is meant by eval mode.
Could you give an example, so I can understand what is meant by that?