Generating decoder input ids during inference for opus-mt


Goal: I am trying to run inference on the TensorFlow Lite variant of the opus-mt (en-hi) model.

Question: I just wanted to confirm why the model is supplied decoder_input_ids during inference, and why they are, if I have understood their assignment correctly, being assigned shifted input ids. Given that during training I will be supplying the target sequences through this parameter, won't this lead to a kind of inferential inconsistency in the model's operation?

Context: During inference, the interpreter requires three inputs that I have to provide on the edge side - input_ids, decoder_input_ids and attention_mask. From my understanding, HF's generate() function already handles the generation of two of these otherwise-optional inputs, i.e. decoder_input_ids and the attention mask. In my case, I need to supply them explicitly. I looked through this to get an idea of how they are generated and inferred them to be -

  1. decoder_input_ids set as input_ids shifted right by one position with the pad_token_id prepended ( as specified in HF’s Marian MT documentation )
  2. attention_mask set as a default all-ones mask ( naively numpy.ones((batch_len, max_seq_length)) )
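For concreteness, this is roughly how I construct the two extra inputs - a minimal NumPy sketch; the pad_token_id value and the token ids here are placeholders, and in practice I read the real pad_token_id from the model config:

```python
import numpy as np

PAD_TOKEN_ID = 58100  # placeholder; the real value comes from the model config


def shift_right(ids, pad_id):
    """Shift each sequence right by one, prepending pad_id
    (Marian uses pad_token_id as the decoder start token)."""
    shifted = np.full_like(ids, pad_id)
    shifted[:, 1:] = ids[:, :-1]
    return shifted


input_ids = np.array([[117, 4056, 982, 0]])  # toy encoder token ids
decoder_input_ids = shift_right(input_ids, PAD_TOKEN_ID)
attention_mask = np.ones_like(input_ids)  # naive all-ones mask
```

These three arrays are then what I feed to the interpreter.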

Note: I have been able to successfully generate an output using these assignments for the tflite interpreter - so this isn’t a syntax issue per se.
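My understanding of why this still feels inconsistent: generate() does not feed shifted source ids to the decoder, it builds decoder_input_ids autoregressively, starting from a single decoder start token. A toy sketch of that loop, with a stub in place of the real decoder and made-up token ids:

```python
# Toy sketch of the autoregressive loop generate() performs internally.
# The "model" here is a stub; only the growth of decoder_input_ids matters.
PAD = 0  # stands in for pad_token_id (Marian's decoder_start_token_id)
EOS = 3  # stands in for eos_token_id


def stub_decoder_step(decoder_input_ids):
    """Fake decoder: always 'predict' last id + 1, capped at EOS."""
    return min(decoder_input_ids[-1] + 1, EOS)


decoder_input_ids = [PAD]  # generation starts from the start token alone
for _ in range(10):  # max_new_tokens
    next_id = stub_decoder_step(decoder_input_ids)
    decoder_input_ids.append(next_id)  # sequence grows one token per step
    if next_id == EOS:
        break

print(decoder_input_ids)  # → [0, 1, 2, 3]
```

So my single-shot shifted-input assignment is a departure from this loop, which is exactly what I would like clarified.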

Thanks in advance for any help!