As was requested in #5226, model outputs are now more informative than plain tuples (without breaking changes); PyTorch models now return the appropriate subclass of ModelOutput. Here is an example with a sequence-classification model:
from transformers import BertTokenizer, BertForSequenceClassification
import torch
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
labels = torch.tensor([1]).unsqueeze(0) # Batch size 1
outputs = model(**inputs, labels=labels)
Then outputs will be a SequenceClassifierOutput object, which has the returned elements as attributes. The old syntax
loss, logits = outputs[:2]
will still work, but you can also do
loss = outputs.loss
logits = outputs.logits
or
loss = outputs["loss"]
logits = outputs["logits"]
Under the hood, outputs is a dataclass with optional fields that may be set to None if they are not returned by the model (like attentions in our example). If you index by integer or by slice, the None fields are skipped (for backward compatibility). If you try to access an attribute that is set to None by its key (for instance outputs["attentions"] here), it will raise an error.
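To make this behavior concrete, here is a minimal, self-contained sketch of how such an output object can work. The class MiniOutput and its fields are hypothetical stand-ins for illustration; the real implementation lives in the transformers library.

```python
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class MiniOutput:
    # Hypothetical stand-in for a ModelOutput subclass; fields default to None
    loss: Optional[float] = None
    logits: Optional[list] = None
    attentions: Optional[list] = None

    def _present(self):
        # The non-None fields, in declaration order
        return [getattr(self, f.name) for f in fields(self)
                if getattr(self, f.name) is not None]

    def __getitem__(self, key):
        if isinstance(key, str):
            value = getattr(self, key)
            if value is None:
                # Key access to a None field raises, as described above
                raise KeyError(key)
            return value
        # Integer or slice indexing skips the None fields
        return self._present()[key]

out = MiniOutput(loss=0.5, logits=[0.1, 0.9])  # attentions stays None
loss, logits = out[:2]        # slicing skips the None `attentions` field
assert out.loss == 0.5        # attribute access
assert out["logits"] == [0.1, 0.9]  # key access works for set fields
```

The key point is that positional indexing only ever sees the fields that were actually returned, which is what keeps the old `loss, logits = outputs[:2]` syntax working.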
You can convert those outputs to a regular tuple or dict with outputs.to_tuple() or outputs.to_dict().
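As a rough sketch of what these conversions do, both can simply filter out the None fields. The class and method bodies below are illustrative assumptions, not the actual transformers implementation:

```python
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class MiniOutput:
    # Hypothetical stand-in for a ModelOutput subclass
    loss: Optional[float] = None
    logits: Optional[list] = None
    attentions: Optional[list] = None

    def to_tuple(self):
        # Tuple of all non-None fields, in declaration order
        return tuple(getattr(self, f.name) for f in fields(self)
                     if getattr(self, f.name) is not None)

    def to_dict(self):
        # Dict of all non-None fields, keyed by field name
        return {f.name: getattr(self, f.name) for f in fields(self)
                if getattr(self, f.name) is not None}

out = MiniOutput(loss=0.5, logits=[0.1, 0.9])
assert out.to_tuple() == (0.5, [0.1, 0.9])
assert out.to_dict() == {"loss": 0.5, "logits": [0.1, 0.9]}
```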
You can revert to the old behavior of getting tuples by setting return_tuple=True in the config you pass to your model, when you instantiate your model, or when you call your model on some inputs. If you're using torchscript (and the config you passed to your model has config.torchscript = True), this will automatically be the case, because the jit only handles tuples as outputs.
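The precedence between these switches can be sketched as follows. ToyConfig, ToyModel, and the forward body are hypothetical names invented for illustration; only the return_tuple and torchscript flags come from the text above:

```python
from dataclasses import dataclass

@dataclass
class ToyConfig:
    # Hypothetical config mirroring the two flags described above
    return_tuple: bool = False
    torchscript: bool = False

class ToyModel:
    # Hypothetical model showing how the flags could interact
    def __init__(self, config):
        self.config = config

    def forward(self, x, return_tuple=None):
        # A call-time argument overrides the config;
        # torchscript forces tuples, since the jit only handles tuples
        if return_tuple is None:
            return_tuple = self.config.return_tuple or self.config.torchscript
        loss, logits = 0.5, [x]
        if return_tuple:
            return (loss, logits)
        return {"loss": loss, "logits": logits}  # stands in for a ModelOutput

model = ToyModel(ToyConfig())
assert isinstance(model.forward(1), dict)                   # rich output by default
assert isinstance(model.forward(1, return_tuple=True), tuple)  # per-call override
assert isinstance(ToyModel(ToyConfig(torchscript=True)).forward(1), tuple)
```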
Hope you like this new feature!