I want to use the “facebook/bart-large-mnli” model for an NLI task.
I have a dataset with premise and hypothesis columns and labels [0, 1, 2].
How can I use this model for that NLI task?
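For reference, a row of the dataset looks roughly like this (the text values here are made up just for illustration):

{'premise': 'A man is playing a guitar on stage.',
 'hypothesis': 'A person is performing music.',
 'label': 0}  # label is one of 0, 1, 2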
I wrote the following code:
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
nli_model = AutoModelForSequenceClassification.from_pretrained('facebook/bart-large-mnli')
tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-mnli')
nli_model.to(device)
i = 0  # check the first example
premise = tokenized_datasets['TRAIN'][i]['premise']
hypothesis = tokenized_datasets['TRAIN'][i]['hypothesis']
x = tokenizer.encode(premise, hypothesis, return_tensors='pt', truncation='only_first')
logits = nli_model(x.to(device))[0]
entail_contradiction_logits = logits[:,[0,2]]
probs = entail_contradiction_logits.softmax(dim=1)
probs
but I got only 2 values: tensor([[8.8793e-05, 9.9991e-01]], device='cuda:0', grad_fn=<SoftmaxBackward0>) instead of 3 values (contradiction, neutral, entailment).
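My guess is that the logits[:, [0, 2]] slicing (which I believe comes from the zero-shot classification example on the model card) is what throws away the third class. If I simply take the softmax over all three logits, I assume I would get three probabilities, but I'm not sure which index corresponds to which label:

probs = logits.softmax(dim=1)   # softmax over all 3 logits instead of only [0, 2]
pred = probs.argmax(dim=1)      # predicted class index, but which index is which label?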
How can I use this model for NLI, i.e. predict the correct one of the 3 labels?