Looking for some guidance on handling bitext alignment, similar to how Microsoft handles it in their Translator service. For reference, I'm using the Helsinki pre-trained models and have come across papers saying that alignment can be derived from the hidden states or the decoder attentions. I am returning them, but I can't make any sense of the returned tensors. Looking for documentation or examples on how to interpret the decoder_attentions.
translation_model = MarianMTModel.from_pretrained(
    'Helsinki-NLP/opus-mt-en-es',
    return_dict=True,
    output_attentions=True,
    output_scores=True,
    output_hidden_states=True,
)
# prepared_inputs = the tokenized batch (the original name was truncated in my post)
generated = translation_model.generate(return_dict_in_generate=True, **prepared_inputs)