I’m working through the Token classification tutorial and have a question:
In the preprocessing part, the labels of special tokens, and of tokens that are not the first token of a word, are converted to -100.
The tutorial says this will prove useful in the loss function, but the code never passes something like ignore_index
to a loss function. The -100 seems to matter only in metric evaluation.
So I wonder: does the loss function used by Trainer automatically ignore -100?
Or does it work through the num_labels, id2label, and label2id parameters? That is, if I set these parameters, will labels outside of them be ignored?
I have the same question for GPT2LMHeadModel.
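For context, here is a minimal sketch of the label-alignment step I mean. The function name and the exact rule (continuation sub-tokens also set to -100, which is one of the variants the tutorial discusses) are my reconstruction, not copied from the tutorial:

```python
# Sketch of aligning word-level labels with sub-word tokens.
# Special tokens (word_id is None) and continuation sub-tokens
# both receive -100.
def align_labels_with_tokens(labels, word_ids):
    new_labels = []
    previous_word_id = None
    for word_id in word_ids:
        if word_id is None:
            new_labels.append(-100)             # special token, e.g. [CLS]/[SEP]
        elif word_id != previous_word_id:
            new_labels.append(labels[word_id])  # first sub-token of a word
        else:
            new_labels.append(-100)             # continuation sub-token
        previous_word_id = word_id
    return new_labels

# e.g. [CLS] w0 w1a w1b w2 [SEP]
word_ids = [None, 0, 1, 1, 2, None]
labels = [3, 0, 7]
print(align_labels_with_tokens(labels, word_ids))
# [-100, 3, 0, -100, 7, -100]
```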
Edit: -100 is the default ignore_index in PyTorch’s CrossEntropyLoss, so any token with a label of -100 will be ignored in the loss computation.