Getting the log probability of a sentence with BERT

fmorales · November 7, 2021, 9:50pm

Hi all,

I recently came across LM-critic, whose main idea is to assess the grammaticality of two similar sentences. Since LM-critic uses Hugging Face's GPT2LMHeadModel, I decided to experiment with BertLMHeadModel instead, but the results are very poor (~60%) compared to those of GPT2 (~90%).

Without going deeper into the details of my comparison (I’m planning to share a link to the code soon), I was wondering whether BERT’s poor performance on this task could be explained by the different training objectives of the two models.
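To make the question about training objectives concrete, here is a minimal sketch of the two ways a sentence score can be computed. A causal model such as GPT2 is trained to predict each token from its left context, so the sentence log probability follows from the chain rule. BERT is pretrained to predict masked tokens from both sides, so the closest analogue is a sum of masked-token log probabilities (often called a pseudo-log-likelihood), which is not a true sentence probability. Everything in the block is an assumption for illustration: `next_token_dist` and `masked_token_dist` are hypothetical callables standing in for a real model, and the uniform toy distributions exist only so the sketch runs without downloading anything.

```python
import math
from typing import Callable, Dict, List

Dist = Dict[str, float]   # token -> probability (hypothetical model output)
MASK = "[MASK]"

def causal_log_prob(tokens: List[str],
                    next_token_dist: Callable[[List[str]], Dist]) -> float:
    """Chain rule: sum over i of log P(t_i | t_0 .. t_{i-1})."""
    total = 0.0
    for i, tok in enumerate(tokens):
        dist = next_token_dist(tokens[:i])   # left context only
        total += math.log(dist[tok])
    return total

def pseudo_log_likelihood(tokens: List[str],
                          masked_token_dist: Callable[[List[str], int], Dist]) -> float:
    """Sum over i of log P(t_i | all other tokens), masking one position at a time."""
    total = 0.0
    for i, tok in enumerate(tokens):
        masked = list(tokens)
        masked[i] = MASK                     # hide only position i
        dist = masked_token_dist(masked, i)  # sees left AND right context
        total += math.log(dist[tok])
    return total

# Toy stand-ins for real models: uniform over a 4-word vocabulary (assumption).
VOCAB = ["the", "cat", "sat", "mat"]

def uniform_next(context: List[str]) -> Dist:
    return {w: 1.0 / len(VOCAB) for w in VOCAB}

def uniform_masked(masked_tokens: List[str], position: int) -> Dist:
    return {w: 1.0 / len(VOCAB) for w in VOCAB}

if __name__ == "__main__":
    sentence = ["the", "cat", "sat"]
    # With uniform toy models, both scores equal 3 * log(0.25).
    print(causal_log_prob(sentence, uniform_next))
    print(pseudo_log_likelihood(sentence, uniform_masked))
```

With real models the two scores generally differ, because each masked prediction conditions on the right context as well. Note also that, as far as I understand, BertLMHeadModel is meant to be used with `is_decoder=True` for left-to-right prediction, whereas the pretrained BERT checkpoints were trained with the masked objective, so scoring them left to right may be a mismatch.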

I’ll be happy to read your thoughts.
