Is there a way to return the `decoder_input_ids` from `tokenizer.prepare_seq2seq_batch`?
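For context, here's roughly what I'm doing now (a minimal sketch; the checkpoint and texts are just placeholders, and I'm assuming a transformers version where `shift_tokens_right` takes a `decoder_start_token_id` argument). `prepare_seq2seq_batch` only seems to give me `input_ids`, `attention_mask`, and `labels`, so I build the `decoder_input_ids` myself:

```python
from transformers import BartConfig, BartTokenizer
from transformers.models.bart.modeling_bart import shift_tokens_right

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
config = BartConfig.from_pretrained("facebook/bart-base")

batch = tokenizer.prepare_seq2seq_batch(
    src_texts=["UN Chief Says There Is No Military Solution in Syria"],
    tgt_texts=["UN Chief Says There Is No Military Solution"],
    return_tensors="pt",
)

# prepare_seq2seq_batch returns input_ids, attention_mask, and labels,
# but no decoder_input_ids, so shift the labels right by hand:
decoder_input_ids = shift_tokens_right(
    batch["labels"],
    tokenizer.pad_token_id,
    config.decoder_start_token_id,  # 2 == </s> for BART
)
```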

Are the padding tokens in the targets (the labels) replaced with the ignore index anywhere as well? It doesn't look like it from what I can see, so I'm assuming we need to do that ourselves and pass the label ids with the padding positions set to -100.
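In other words, something like this (continuing the sketch above, and assuming the loss uses PyTorch's default `ignore_index` of -100):

```python
# Mask padding positions in the labels so CrossEntropyLoss skips them;
# -100 is the default ignore_index in PyTorch's cross-entropy loss.
labels = batch["labels"].clone()
labels[labels == tokenizer.pad_token_id] = -100
```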

Also, the `decoder_input_ids` come back in the form `<eos> <bos> X ...`, but my understanding was always that the decoder input should start with `<bos>`, with the labels shifted one position so that `<bos>` predicts `X[0]`, and so on.
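To illustrate what I mean, here are the two layouts side by side (token strings only, continuing the sketch above; the second layout is just my expectation, not something the library produces):

```python
# What shift_tokens_right gives me (BART-style, eos as decoder_start_token):
#   decoder_input_ids: </s>  <s>  X0  X1 ... Xn-1
#   labels:            <s>   X0   X1  X2 ... </s>
#
# What I expected:
#   decoder_input_ids: <s>   X0   X1 ... Xn-1
#   labels:            X0    X1   X2 ... </s>
#
# In both cases position t of decoder_input_ids is trained to predict
# position t of labels; my question is whether the extra leading </s> is intended.
print(tokenizer.convert_ids_to_tokens(decoder_input_ids[0].tolist()))
```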