Issues with save_pretrained (MarianMT)

Hi all,

I’m a beginner with Hugging Face and am currently trying to wrap my head around some behaviour in the transformers library. This is a duplicate of a post I made in the transformers forum last week that got no feedback - it may be so basic that no one was interested!

I’m hoping to set up an internal translation function at work, and I’m experimenting with MarianMT. I’m seeing what seems to me to be strange behaviour: everything works if I load the model directly, but if I save it and then load it back from the saved files, it no longer works and instead produces a TypeError from forward().

This code works and translates the example phrases:

from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-fr-en"
tokenizer = MarianTokenizer.from_pretrained(model_name)
tokenizer.save_pretrained("./opus_fr_en")
model = MarianMTModel.from_pretrained(model_name)
model.save_pretrained("./opus_fr_en")

# Test phrases from Wikipedia's French machine learning article
src_text = [
    "L'apprentissage automatique comporte généralement deux phases.",
    "Depuis l'antiquité, le sujet des machines pensantes préoccupe les esprits."
]
translated = model.generate(**tokenizer(src_text, return_tensors="pt", padding=True))
[tokenizer.decode(t, skip_special_tokens=True) for t in translated] 

This produces the expected output:

['Machine learning usually consists of two phases.',
 'Since ancient times, the subject of thinking machines has been a matter of concern to minds.']

However, if I try loading the model and tokenizer that I saved and running them the same way, it fails:

# Load the model and tokenizer saved in the previous example
model_fr_en = MarianMTModel.from_pretrained("./opus_fr_en")
tokenizer_fr_en=MarianMTModel.from_pretrained("./opus_fr_en")

translated2 = model_fr_en.generate(**tokenizer_fr_en(src_text, return_tensors="pt", padding=True))
[tokenizer_fr_en.decode(t, skip_special_tokens=True) for t in translated2]

This gives the error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[18], line 1
----> 1 translated2 = model_fr_en.generate(**tokenizer_fr_en(src_text, return_tensors="pt", padding=True))
      2 [tokenizer_fr_en.decode(t, skip_special_tokens=True) for t in translated2]

File /opt/anaconda3/envs/transformers/lib/python3.9/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

TypeError: forward() got an unexpected keyword argument 'return_tensors'

I’ve narrowed the issue down to the tokenizer. If I call the original tokenizer on a text sample, I get the following output:

tokenizer(src_text[0])

# Output:

{'input_ids': [87, 6, 5376, 8810, 6365, 2950, 203, 8335, 3, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}

However, if I call the tokenizer that’s been saved and then loaded back, I get an AttributeError:

tokenizer_fr_en(src_text[0])
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[6], line 1
----> 1 tokenizer_fr_en(src_text[0])

File /opt/anaconda3/envs/transformers/lib/python3.9/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []
...
File /opt/anaconda3/envs/transformers/lib/python3.9/site-packages/transformers/models/marian/modeling_marian.py:741, in MarianEncoder.forward(self, input_ids, attention_mask, head_mask, inputs_embeds, output_attentions, output_hidden_states, return_dict)
    739     raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time")
    740 elif input_ids is not None:
--> 741     input_shape = input_ids.size()
    742     input_ids = input_ids.view(-1, input_shape[-1])
    743 elif inputs_embeds is not None:

AttributeError: 'str' object has no attribute 'size'

I’m calling the functions the same way in both cases (the code was copied from an example in the MarianMT documentation), so I’m at a bit of a loss as to why it fails when loaded from a local save like this. I’d appreciate any help or insight!

Update for anyone who stumbles on this: it was just user error. I copy-pasted the model-loading line and neglected to change the class name, so my “tokenizer” was actually a second MarianMTModel. Calling it invoked the model’s forward() rather than the tokenizer’s __call__(), which explains both errors above:

tokenizer_fr_en=MarianMTModel.from_pretrained("./opus_fr_en")

Should be

tokenizer_fr_en=MarianTokenizer.from_pretrained("./opus_fr_en")
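
For completeness, here’s the corrected reload-and-translate snippet (reusing src_text from the first example). With the right class, it produces the expected translations:

from transformers import MarianMTModel, MarianTokenizer

# Load the model and tokenizer saved earlier -
# note MarianTokenizer (not MarianMTModel) for the tokenizer
model_fr_en = MarianMTModel.from_pretrained("./opus_fr_en")
tokenizer_fr_en = MarianTokenizer.from_pretrained("./opus_fr_en")

translated2 = model_fr_en.generate(**tokenizer_fr_en(src_text, return_tensors="pt", padding=True))
[tokenizer_fr_en.decode(t, skip_special_tokens=True) for t in translated2]

In hindsight, a quick type(tokenizer_fr_en) check would have flagged the problem straight away (it returns MarianMTModel rather than MarianTokenizer), and loading with AutoTokenizer.from_pretrained should make this kind of mix-up harder, since it infers the tokenizer class from the saved files.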