Different outputs when using pipeline

Hi folks!

I’m currently working on my master’s thesis and need to implement a text-classification model, so I’m fine-tuning a base BERT model on my data. So far so good. To evaluate the fine-tuned model, I pass my test data to it for predictions. For this I’m using the “pipeline” API provided by Hugging Face with a sigmoid function (since I’m dealing with a multi-label classification problem). It works, and I get the probabilities for each class as output. However, if instead of using the pipeline I tokenize the test data and pass it directly to the model (i.e. model(tokenized_data)), I get back the logits for each class. But when I transform these logits into probabilities with a sigmoid function, they are completely different from the ones output by the pipeline. How is this possible?
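Here is roughly what I’m doing, reduced to a minimal sketch (the checkpoint path and input text are placeholders for my actual setup):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Placeholder for my fine-tuned checkpoint
model_path = "./my-finetuned-bert"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)

text = "example document to classify"

# Approach 1: pipeline with an explicit sigmoid over the logits
clf = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    function_to_apply="sigmoid",
    top_k=None,  # return scores for every label, not just the top one
)
print(clf(text))

# Approach 2: tokenize manually and call the model directly
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(torch.sigmoid(logits))
```

I would expect both printouts to contain the same per-class probabilities, but they don’t.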

By the way, I double-checked everything (tokenized inputs, parameters, model loading, etc.). I think I’m missing something, but I don’t know what. Any help is much appreciated!!

Thanks in advance!

Jack

I am facing the same issue that you describe here. Did you figure out what the problem was? I’ve even run the pipeline without an activation function, and the output is still different from what I get using trainer.predict.
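For context, this is roughly how I get the predictions I’m comparing against (assuming trainer is the Trainer used for fine-tuning and test_dataset is the tokenized test split; both are placeholders):

```python
import torch

# trainer.predict returns raw logits in .predictions
output = trainer.predict(test_dataset)
probs = torch.sigmoid(torch.tensor(output.predictions))
print(probs)
```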

I had the same issue. I had forgotten to include problem_type in the model config. Adding “multi_label_classification” fixed it.
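For reference, here’s a minimal sketch of how I set it when loading the model (the base checkpoint and label count are placeholders):

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=5,  # placeholder: use your actual number of labels
    problem_type="multi_label_classification",
)
```

With problem_type set this way, the Trainer uses BCEWithLogitsLoss during fine-tuning, and the text-classification pipeline applies a sigmoid to the logits by default instead of a softmax, which should make the two outputs line up.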