I am experimenting with the various `BertFor*` model classes with pre-trained models. Using `bert-base-uncased` with `BertForNextSentencePrediction`, I have the following code, which seems to work nicely:
```python
from transformers import BertTokenizer, BertForNextSentencePrediction
import torch

# Load pre-trained tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForNextSentencePrediction.from_pretrained('bert-base-uncased').eval()

# Tokenize the sentence pair
sentence_a = "The cat sat on the mat."
sentence_b = "It was a nice day outside."
encoded = tokenizer.encode_plus(sentence_a, sentence_b, return_tensors='pt')

# Get model predictions
with torch.no_grad():
    outputs = model(**encoded)
logits = outputs[0]
probs = torch.softmax(logits, dim=1)
is_next_sentence = torch.argmax(probs, dim=1).bool().item()
```
`is_next_sentence` is either `True` or `False`.
The only problem is that I can't seem to find any pair of sentences that returns `True`. (Does anybody have any examples? Is the code above wrong?)
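In case it helps, here is a minimal sketch of how I have been checking candidate pairs, reusing the tokenizer and model loaded above and printing the raw probabilities for both labels alongside the flag (the sentence pairs here are just made-up placeholders):

```python
# Sketch: check a few candidate pairs and print both label probabilities
# together with the argmax-based flag from the code above.
pairs = [
    ("The cat sat on the mat.", "It started to purr."),        # plausible continuation
    ("I bought some groceries.", "Quantum physics is hard."),  # unrelated sentences
]
for sentence_a, sentence_b in pairs:
    encoded = tokenizer.encode_plus(sentence_a, sentence_b, return_tensors='pt')
    with torch.no_grad():
        logits = model(**encoded)[0]
    probs = torch.softmax(logits, dim=1).squeeze()
    flag = torch.argmax(probs).bool().item()
    print(f"{flag}  probs={probs.tolist()}  | {sentence_a} / {sentence_b}")
```

Whatever pairs I try, the flag comes out `False`.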
I am guessing that maybe the base model is just not able to do this task and I need to use a fine-tuned variant?
Does anyone have a `bert-base` variant on the HF Hub that is fine-tuned on NSP?