Hello,
This is a question about how to use decoder_input_ids with DeepSpeed.
I have a language generation task that requires decoder_input_ids. For example, I need the generation model to start its output with ‘yes’.
My solution is to pass the decoder_input_ids to model.generate() via inputs['decoder_input_ids'] within Seq2SeqTrainer. Is this the right way?
Edit: It doesn’t seem to work that way. I suspect that trainer.py does not pass the decoder_input_ids as an argument in the model(*input) call. Any suggestions?
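For concreteness, here is a minimal sketch of the forced prefix in question: one row per example, made of the decoder start token followed by the token ids for ‘yes’. The helper name build_decoder_prefix and the ids 0 and 4273 are hypothetical placeholders, not values from any real tokenizer or model config.

```python
from typing import List


def build_decoder_prefix(start_token_id: int,
                         prefix_token_ids: List[int],
                         batch_size: int) -> List[List[int]]:
    """Return one decoder prefix per example: [start_token, *prefix_tokens]."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    row = [start_token_id] + list(prefix_token_ids)
    # Copy the row for each example so the rows are independent lists.
    return [list(row) for _ in range(batch_size)]


# Hypothetical ids: 0 stands in for the decoder start token, 4273 for 'yes'.
prefix = build_decoder_prefix(start_token_id=0, prefix_token_ids=[4273], batch_size=2)
print(prefix)  # [[0, 4273], [0, 4273]]
```

With a real encoder-decoder model, the start id would presumably come from model.config.decoder_start_token_id and the prefix ids from the tokenizer, and the rows would be converted to a tensor and passed as model.generate(input_ids, decoder_input_ids=...) when calling generate directly. Whether the same key survives the trip through Seq2SeqTrainer is exactly the open question here.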