How to structure labels for token classification?

You should be able to do something like this:

from transformers import AutoConfig, AutoModelForTokenClassification

config = AutoConfig.from_pretrained("bert-base-cased", num_labels=3)
model = AutoModelForTokenClassification.from_pretrained("bert-base-cased", config=config)

Note that your example has three possible labels: with o, with p, and with neither. If you set num_labels to 2, you will get the error you described.
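One way to avoid this kind of mismatch is to derive num_labels from an explicit label map and check your label ids against it before training. This is only a rough sketch: the label names and the check_label_ids helper are hypothetical, not part of transformers.

```python
# Hypothetical label names for the three classes; substitute your own.
label2id = {"NEITHER": 0, "WITH_O": 1, "WITH_P": 2}
id2label = {i: name for name, i in label2id.items()}
num_labels = len(label2id)  # derived from the map, so it cannot drift


def check_label_ids(label_ids, num_labels, ignore_index=-100):
    """Return the label ids outside [0, num_labels), skipping ignore_index."""
    return [
        i for i in label_ids
        if i != ignore_index and not 0 <= i < num_labels
    ]


example = [-100, 0, 1, 2, 0, -100]
print(check_label_ids(example, num_labels))  # nothing out of range with 3 labels
print(check_label_ids(example, 2))           # id 2 is out of range with 2 labels
```

If the second call returns anything, the label ids do not fit the number of classes the model head was built for.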

-100 is the default ignore_index for PyTorch's NLLLoss (and CrossEntropyLoss). Any target position with this value is excluded from the loss computation.
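In practice, -100 is usually assigned to special tokens and to the extra pieces of words that the tokenizer splits. Here is a minimal sketch of that alignment, assuming you have one label per word and a word_ids list like the one fast tokenizers return from word_ids(). The align_labels helper and the sample values are hypothetical.

```python
IGNORE_INDEX = -100  # matches the default ignore_index of the PyTorch loss


def align_labels(word_ids, word_labels, label_all_pieces=False):
    """Expand per-word labels to per-token labels.

    word_ids holds, for each token, the index of the word it came from,
    or None for special tokens such as [CLS] and [SEP].
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            # Special token: never contributes to the loss.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous or label_all_pieces:
            # First piece of a word (or every piece, if requested).
            aligned.append(word_labels[word_id])
        else:
            # Later piece of the same word: ignored by default.
            aligned.append(IGNORE_INDEX)
        previous = word_id
    return aligned


# Hypothetical 3-word sentence whose second word is split into two pieces.
word_ids = [None, 0, 1, 1, 2, None]
word_labels = [0, 1, 2]
print(align_labels(word_ids, word_labels))  # [-100, 0, 1, -100, 2, -100]
```

The resulting list has one entry per token, so it lines up with the model's logits, and only the non-ignored positions count toward the loss.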
