Does attention_mask refer to input_ids or to labels?

Philomath868 · June 18, 2025, 4:41pm

Thanks, that’s a clear and succinct explanation!

But I guess my question still stands regarding decoder_input_ids, in case it's based on labels (see my other question). That would mean, if I understand correctly, that labels (shifted right) are used during computation on the decoder side, no?
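To make "labels (shifted right)" concrete, here is a minimal sketch in plain Python of what I understand the shift to be. It assumes the BART/T5-style convention where, if decoder_input_ids is not passed, it is derived from labels. The function name `shift_right` and the token ids used below (0 for padding, 2 for the decoder start token) are made up for illustration and are not taken from any particular model config.

```python
def shift_right(labels, pad_token_id, decoder_start_token_id):
    """Build decoder inputs from labels by shifting each row one step right.

    Illustrative sketch only: plain lists instead of tensors, and the
    token ids passed in are hypothetical.
    """
    shifted = []
    for row in labels:
        # Prepend the start token and drop the last label so the length is unchanged.
        new_row = [decoder_start_token_id] + row[:-1]
        # -100 marks positions ignored by the loss; it is not a real token id,
        # so replace it with the pad token before feeding it to the decoder.
        new_row = [pad_token_id if tok == -100 else tok for tok in new_row]
        shifted.append(new_row)
    return shifted


labels = [
    [10, 11, 12, 1],      # full-length target ending in a hypothetical EOS id 1
    [20, 1, -100, -100],  # shorter target, padded with -100 for the loss
]
decoder_input_ids = shift_right(labels, pad_token_id=0, decoder_start_token_id=2)
print(decoder_input_ids)
```

If this reading is right, the decoder at position i sees the label from position i-1 and is trained to predict the label at position i, which is why the labels end up being used on the decoder side.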

