Will Trainer loss functions automatically ignore -100?

I’m working through the Token classification tutorial and have a question:
In the preprocessing part, the labels of special tokens, and of tokens that are not the first token of a word, are set to -100. The tutorial says this will prove useful in the loss function, but the code never passes anything like ignore_index to a loss function; the -100 only seems to matter during metric evaluation.
So, does the loss function in Trainer automatically ignore -100?
Or is it handled through the num_labels, id2label, and label2id parameters? That is, if I set those parameters, will all other labels be ignored?
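For context, the preprocessing step I mean looks roughly like this (a sketch; `align_labels` is my own helper name, and `word_ids` stands in for what a fast tokenizer's `word_ids()` returns):

```python
# None marks special tokens ([CLS], [SEP], padding); otherwise each entry
# is the index of the source word that the sub-token came from.
def align_labels(word_ids, word_labels):
    labels = []
    previous = None
    for wid in word_ids:
        if wid is None:            # special token -> masked with -100
            labels.append(-100)
        elif wid != previous:      # first sub-token of a word keeps its label
            labels.append(word_labels[wid])
        else:                      # later sub-tokens of the same word -> -100
            labels.append(-100)
        previous = wid
    return labels

# e.g. a 3-word sentence where word 1 is split into two sub-tokens:
word_ids = [None, 0, 1, 1, 2, None]
word_labels = [3, 0, 7]
print(align_labels(word_ids, word_labels))  # [-100, 3, 0, -100, 7, -100]
```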

I have the same question for GPT2LMHeadModel.

Edit: -100 is the default ignore_index in PyTorch’s CrossEntropyLoss. So, any token with a label of -100 will be ignored in loss computation.
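To convince myself, I wrote a tiny pure-Python re-implementation of that ignore_index behaviour (a sketch of the semantics, not PyTorch's actual code): masked positions contribute nothing to the sum and are excluded from the mean's denominator.

```python
import math

def cross_entropy(logits, targets, ignore_index=-100):
    """Mean cross-entropy over the non-ignored positions only."""
    total, count = 0.0, 0
    for row, t in zip(logits, targets):
        if t == ignore_index:
            continue                           # skipped entirely
        log_sum = math.log(sum(math.exp(x) for x in row))
        total += log_sum - row[t]              # -log softmax(row)[t]
        count += 1
    return total / count

logits = [[2.0, 0.5], [0.1, 1.5], [3.0, 3.0]]
full = cross_entropy(logits, [0, -100, 1])     # middle token ignored
kept = cross_entropy([logits[0], logits[2]], [0, 1])
print(full == kept)                            # masking a token gives the
                                               # same loss as dropping it
```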
