Hi @Abe, if I understand correctly you’d like to go from an input string like “I love this movie!” to a set of predicted labels and their confidence scores (i.e. probabilities).
The simplest way to achieve that would be to wrap your model and tokenizer in a TextClassificationPipeline with return_all_scores=True:
from transformers import TextClassificationPipeline
model = ...
tokenizer = ...
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=True)
# returns one list of dicts per input, e.g. [[{'label': 'NEGATIVE', 'score': 0.0001223755971295759}, {'label': 'POSITIVE', 'score': 0.9998776316642761}]]
pipe("I love this movie!")
The above also works for multiple inputs by feeding a list of examples instead of a single string:
pipe(["I love this movie!", "I hate this movie!"])
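If you only care about the top prediction for each input, you could reduce the pipeline output with a small helper. This is just a sketch: best_predictions is a hypothetical name (not part of transformers), and it assumes the nested list-of-lists shape shown in the comment above, using that same sample output as data.

```python
def best_predictions(all_scores):
    """Return the highest-scoring {'label', 'score'} dict for each input."""
    return [max(scores, key=lambda d: d["score"]) for scores in all_scores]


# Sample output copied from the pipeline comment above (one input, two labels)
outputs = [[
    {"label": "NEGATIVE", "score": 0.0001223755971295759},
    {"label": "POSITIVE", "score": 0.9998776316642761},
]]

best = best_predictions(outputs)
print(best[0]["label"], round(best[0]["score"], 4))  # prints: POSITIVE 0.9999
```

With a list of inputs you would get one entry in best per example, in the same order as the inputs.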
If you want human-readable labels like “positive” and “negative”, you can configure the id2label and label2id attributes of your model’s config class; see “Change label names on inference API” (post #3 by lewtun).
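As a rough sketch, the two mappings can be built from a plain list of label names. The names and their order here are an assumption for illustration (index 0 = "negative", index 1 = "positive"), so check them against the class indices your model was actually trained with before assigning them to the config.

```python
# Hypothetical label names, listed in class-index order (an assumption)
labels = ["negative", "positive"]

# id2label maps class index -> name; label2id is the inverse mapping
id2label = {i: label for i, label in enumerate(labels)}
label2id = {label: i for i, label in id2label.items()}

print(id2label)  # {0: 'negative', 1: 'positive'}
print(label2id)  # {'negative': 0, 'positive': 1}

# With a loaded model, these would then be set on its config, e.g.:
# model.config.id2label = id2label
# model.config.label2id = label2id
```

After that, the pipeline should report these names in the 'label' field instead of the defaults.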
HTH!