BERT and RoBERTa giving the same outputs

Hi All.

I tried using the RoBERTa model inside two different models. In both of them, I faced the same problem: the model gives the same output for different test inputs during evaluation.

Earlier, I thought it might be due to some implementation problem, so I took a small dataset, overfit on it, and predicted outputs for that same data. I still got the same problem: RoBERTa was still giving the same output for different records.

I replaced RoBERTa with BERT and still hit the same issue.

Is there a bug in the latest transformers version, 4.10.2 (which I believe is very unlikely), or do you have any other suggestions I can try? I used transformers 4.2.1 earlier and didn't face this problem.

Also, I keep getting this warning during training and evaluation:

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

I checked online and I suspect this isn't an issue, but I'm still not sure what it actually means. Could this be the problem?
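For reference, this is roughly the line in my code that triggers the warning (a simplified sketch; my actual models add extra layers on top of the encoder):

```python
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Loading the bare encoder drops the pre-training heads listed in the warning
# (cls.predictions.* for the masked-LM head, cls.seq_relationship.* for the
# next-sentence-prediction head), since BertModel does not contain them.
model = BertModel.from_pretrained("bert-base-uncased")
```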


@theguywithblacktie, have you figured out what was wrong with your code? I am facing the same issue while using RoBERTa from the transformers library.

Any solutions?

Having the same issue +1

If I understood the problem correctly, the fine-tuned model always outputs the same value (e.g. the same class for a classification task). I had the same issue and tried to overcome it by tuning hyperparameters. The only hyperparameter that worked was the number of epochs.
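In case it's useful, this is the kind of change I mean (a rough sketch assuming a Trainer-based setup; every value except num_train_epochs is a placeholder):

```python
from transformers import TrainingArguments

# Rough sketch: the only change that mattered for me was training for more epochs.
training_args = TrainingArguments(
    output_dir="out",
    num_train_epochs=10,              # try raising this from the default of 3
    learning_rate=2e-5,               # placeholder
    per_device_train_batch_size=16,   # placeholder
    evaluation_strategy="epoch",
)
```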

I hope it helps,
Christoforos Spartalis

Anyone fix this?

It's difficult to diagnose the issue without seeing your training script.
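As a first sanity check, you could run the bare pretrained encoder on two clearly different sentences in eval mode and confirm that both the token ids and the embeddings actually differ; if they do, the collapse is happening in your own head/training code rather than in the encoder. A rough sketch (BERT shown, but the same idea works for RoBERTa):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

sentences = ["The movie was great.", "Stock prices fell sharply today."]
enc = tokenizer(sentences, padding=True, return_tensors="pt")
with torch.no_grad():
    out = model(**enc)

# The [CLS] embeddings of two different inputs should NOT be identical.
cls_embeddings = out.last_hidden_state[:, 0, :]
print(enc["input_ids"])                                      # token ids should differ
print(torch.allclose(cls_embeddings[0], cls_embeddings[1]))  # expect False
```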