Or do we even have to pass decoder_input_ids
anymore???
Looking at this example for MT5, it seems like the answer is “no”:
from transformers import MT5ForConditionalGeneration, T5Tokenizer

model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")
tokenizer = T5Tokenizer.from_pretrained("google/mt5-small")

article = "UN Offizier sagt, dass weiter verhandelt werden muss in Syrien."
summary = "Weiter Verhandlung in Syrien."

# src_texts become input_ids/attention_mask, tgt_texts become labels;
# no decoder_input_ids are produced here.
batch = tokenizer.prepare_seq2seq_batch(src_texts=[article], tgt_texts=[summary], return_tensors="pt")

# Only labels are passed; the model derives decoder_input_ids internally
# by shifting the labels one position to the right.
outputs = model(**batch)
loss = outputs.loss
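For comparison, here is a minimal sketch of what passing decoder_input_ids explicitly would look like, assuming the usual T5/MT5 convention of shifting the labels one position to the right and starting with decoder_start_token_id (which is what the model does internally when decoder_input_ids is omitted). Names like decoder_start, start_column, and outputs_explicit are just for illustration.

import torch

# Build decoder_input_ids by shifting labels right and prepending
# decoder_start_token_id (the pad token, id 0, for T5/MT5).
labels = batch["labels"]
decoder_start = model.config.decoder_start_token_id
start_column = torch.full((labels.shape[0], 1), decoder_start, dtype=labels.dtype)
decoder_input_ids = torch.cat([start_column, labels[:, :-1]], dim=-1)
# If the labels use -100 for padding (common in training loops), those
# positions must be replaced with the pad token before feeding the decoder.
decoder_input_ids = decoder_input_ids.masked_fill(decoder_input_ids == -100, model.config.pad_token_id)

outputs_explicit = model(
    input_ids=batch["input_ids"],
    attention_mask=batch["attention_mask"],
    decoder_input_ids=decoder_input_ids,
    labels=labels,
)
# outputs_explicit.loss should match outputs.loss from the call above.

As far as I can tell, both calls give the same loss, so passing only labels looks sufficient for teacher-forced training.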
It sure would make things easier if all we had to pass in were the labels, without having to build the decoder_input_ids ourselves when working with the ConditionalGeneration models. Please let me know either way.
Thanks