Hi guys! First of all, I'd like to thank you for all the hard work you do for us.
But here’s the issue:
I'm trying to reproduce the results of joeddav's xlm-roberta-large-xnli model as shown on its HuggingFace model page, but it seems impossible: the values I get from my Python code are always different from the ones on that page. I also tried his notebook, and those results differ too.
It's not worth listing a specific sentence and labels, since every string and every set of labels I tried, in any language, produced results different from that page.
I even tried passing a hypothesis_template, which made the difference in values even larger.
Here's the code I used:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers import pipeline
import torch
device = 0 if torch.cuda.is_available() else -1
tokenizer = AutoTokenizer.from_pretrained("joeddav/xlm-roberta-large-xnli")
model = AutoModelForSequenceClassification.from_pretrained("joeddav/xlm-roberta-large-xnli")
sequence_to_classify = "Con la mia macchina del caffe e la capsula prima esce solo acqua, poi si sente che la capsula viene bucata e infine esce il caffè. Ma intanto l' acqua è nella tazzina. Voglio un rimborso per tutti i soldi che ho speso."
candidate_labels = ["tecnologia", "cibo", "bevande", "finanza", "cinema", "giochi"]
classifier = pipeline("zero-shot-classification",
model=model, tokenizer=tokenizer, device=device)
hypothesis_template = "This text is about {}."
result = classifier(sequence_to_classify, candidate_labels,
                    hypothesis_template=hypothesis_template, multi_label=False)
print(result)
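For context on why the template matters so much: as I understand it, the zero-shot pipeline turns each candidate label into an NLI hypothesis by substituting the label into the `{}` placeholder of the template (the pipeline's default is, I believe, "This example is {}."), so a different template on the model page widget would naturally give different scores. A minimal sketch of that substitution step, with no model download needed (the template and labels here are just illustrative):

```python
# Sketch of how the zero-shot pipeline builds one NLI hypothesis per label:
# each candidate label is formatted into the "{}" slot of the template.
template = "This text is about {}."  # illustrative template, same as in my code above
labels = ["tecnologia", "cibo", "bevande"]

hypotheses = [template.format(label) for label in labels]
print(hypotheses)
# → ['This text is about tecnologia.', 'This text is about cibo.', 'This text is about bevande.']
```

So unless the model page uses exactly the same template (and the same multi-label setting), the entailment scores will not match.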