What is the correct form of decoder_input_ids for LEDForConditionalGeneration?

user123123 · July 3, 2021, 5:03am

I have searched the web but could not find a clear answer. I guess one of the following is the correct form of decoder_input_ids (and the labels should then be decoder_input_ids[1:]?):

1) <s>...</s><pad>...<pad>
2) </s><s>...</s><pad>...<pad>
3) <pad><pad>...</s><s>...</s>

Thanks in advance.

P.S. I am going to fine-tune this model for free-form QA.

user123123 · July 5, 2021, 2:32am

I guess decoder_input_ids should be </s><s>... (without the final </s>) when the labels are <s>...</s>, judging from the code below, which is used to generate decoder_input_ids from the labels:

import torch


def shift_tokens_right(input_ids: torch.Tensor, pad_token_id: int, decoder_start_token_id: int):
    """
    Shift input ids one token to the right.
    """
    if pad_token_id is None:
        raise ValueError("config.pad_token_id has to be defined.")

    # start from an all-zero tensor with the same shape, dtype and device
    shifted_input_ids = input_ids.new_zeros(input_ids.shape)
    # copy every token one position to the right, dropping the last one
    shifted_input_ids[:, 1:] = input_ids[:, :-1].clone()
    # the first position always holds the decoder start token
    shifted_input_ids[:, 0] = decoder_start_token_id

    # replace possible -100 values in labels by `pad_token_id`
    shifted_input_ids.masked_fill_(shifted_input_ids == -100, pad_token_id)

    return shifted_input_ids
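
To make the shift concrete, here is a minimal sketch that runs the same logic on one toy label sequence. The token ids are assumptions for illustration only (<s>=0, <pad>=1, </s>=2, with </s> as the decoder start token); check the tokenizer and config of your checkpoint for the real values. The ids 100 and 101 stand in for arbitrary answer tokens.

```python
import torch

# Assumed special-token ids, for illustration only.
BOS, PAD, EOS = 0, 1, 2
DECODER_START = EOS  # assumption: the decoder starts from </s>


def shift_tokens_right(input_ids, pad_token_id, decoder_start_token_id):
    """Same logic as the function quoted above, repeated so this runs standalone."""
    shifted = input_ids.new_zeros(input_ids.shape)
    shifted[:, 1:] = input_ids[:, :-1].clone()
    shifted[:, 0] = decoder_start_token_id
    shifted.masked_fill_(shifted == -100, pad_token_id)
    return shifted


# Labels: <s> tok tok </s>, with padding positions masked as -100
# so that they are ignored by the loss.
labels = torch.tensor([[BOS, 100, 101, EOS, -100, -100]])

decoder_input_ids = shift_tokens_right(labels, PAD, DECODER_START)
print(decoder_input_ids.tolist())  # [[2, 0, 100, 101, 2, 1]]
```

With these assumed ids, the result reads </s><s>...</s><pad>, which is closest to option 2 in the question. When the labels are padded, the </s> from the labels does stay in decoder_input_ids, but it lines up with a -100 label, so that position does not contribute to the loss. As far as I can tell from the model code, passing only labels to the forward call builds decoder_input_ids this way, so constructing them by hand should only be needed for custom setups.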
