Hi All,
I’m working on a translation model using EncoderDecoderModel.
Each batch contains samples only from one language.
How can I set decoder_start_token_id per batch?
Sometimes decoder_start_token_id is <ENG> and sometimes it is <FRE>.
Right now I assign model.module.config.decoder_start_token_id before each batch to change the start token, but this causes a lot of inconsistency when training on multiple GPUs with DeepSpeed.
Any suggestions?
It seems the forward function does not accept decoder_start_token_id as an argument; only the generate function does. However, I need to specify decoder_start_token_id per batch during training.
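One workaround I am considering is to build decoder_input_ids myself for each batch and pass them to forward together with labels, so the config is never mutated. Below is a minimal sketch of that idea, not a confirmed fix. The helper name shift_right_with_start and the token ids ENG_ID, FRE_ID and PAD_ID are hypothetical placeholders; real ids would come from the tokenizer. It assumes labels use -100 for ignored positions and that forward uses decoder_input_ids as given when they are supplied explicitly.

```python
import torch


def shift_right_with_start(labels: torch.Tensor,
                           start_token_id: int,
                           pad_token_id: int) -> torch.Tensor:
    """Shift labels one position to the right and put a caller-chosen
    start token in the first column (hypothetical helper)."""
    # Start from a tensor filled with the pad id, same shape/dtype/device as labels.
    decoder_input_ids = labels.new_full(labels.shape, pad_token_id)
    # Every position t sees the label from position t - 1.
    decoder_input_ids[:, 1:] = labels[:, :-1]
    # First position is the language-specific start token for this batch.
    decoder_input_ids[:, 0] = start_token_id
    # Assumption: -100 marks ignored label positions; it is not a valid input id.
    decoder_input_ids.masked_fill_(decoder_input_ids == -100, pad_token_id)
    return decoder_input_ids


# Toy batch with made-up ids (assumptions, not real vocabulary entries).
ENG_ID, FRE_ID, PAD_ID = 5, 6, 0
labels = torch.tensor([[11, 12, 13, -100],
                       [21, 22, -100, -100]])

# This batch is French, so <FRE> is used as the start token.
decoder_input_ids = shift_right_with_start(labels, FRE_ID, PAD_ID)

# In the training loop the call would then look roughly like:
# outputs = model(input_ids=input_ids,
#                 attention_mask=attention_mask,
#                 decoder_input_ids=decoder_input_ids,
#                 labels=labels)
```

Since the start token travels with the batch tensors instead of living in shared model config, each GPU rank would use the token that matches its own batch.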
Any suggestions?
Thank you