Cannot download translation models in Colab

I am trying to translate English text to German, so I run this:

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")

But it throws an error:

ValueError: This tokenizer cannot be instantiated. Please make sure you have sentencepiece installed in order to use this tokenizer.

Full error message
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-65-accbe9f8763e> in <module>()
----> 1 translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")

1 frames
/usr/local/lib/python3.7/dist-packages/transformers/pipelines/__init__.py in pipeline(task, model, config, tokenizer, feature_extractor, framework, revision, use_fast, use_auth_token, model_kwargs, **kwargs)
    441 
    442             tokenizer = AutoTokenizer.from_pretrained(
--> 443                 tokenizer_identifier, revision=revision, use_fast=use_fast, _from_pipeline=task, **tokenizer_kwargs
    444             )
    445 

/usr/local/lib/python3.7/dist-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
    449                 else:
    450                     raise ValueError(
--> 451                         "This tokenizer cannot be instantiated. Please make sure you have `sentencepiece` installed "
    452                         "in order to use this tokenizer."
    453                     )

ValueError: This tokenizer cannot be instantiated. Please make sure you have `sentencepiece` installed in order to use this tokenizer.

Since the message says sentencepiece is required, I installed it via pip, but that did not help. I also tried importing it explicitly so that its namespace is available, but the error persists.

Note: Besides the Helsinki-NLP/opus-mt-en-de model, I have also tried using the Helsinki-NLP/opus-mt-fr-en model as shown in the course video, but it does not work either.

What am I missing?

Okay, I tried to run this locally (not in Colab):

from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")

translation = translator("hello, my name is Bob")

print(translation)

and it printed out:

[{'translation_text': 'Hallo, mein Name ist Bob.'}]

I don’t know where you are running that code, but it works fine for me. Have you installed the latest version of transformers?

pip install transformers -U

Where do you run that code?

Can you share the full code to see if something else is going on there?

Did you restart the kernel after installing sentencepiece?

Yes, make sure to run the latest version of the notebooks (the first cell should be ! pip install datasets transformers[sentencepiece]).
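
If you are working in a fresh notebook and want to confirm the install took effect, a quick check like the one below may help. It is only a sketch: the helper name missing_packages is made up for illustration, and it uses nothing beyond the standard library's importlib.util.find_spec to ask whether the current interpreter can locate each package. As far as I understand, transformers decides whether sentencepiece is available when it is first imported, which would explain why installing it mid-session does not help until the runtime is restarted.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names the current interpreter cannot locate.

    Hypothetical helper for illustration; not part of transformers.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Check the two packages the translation pipeline depends on.
missing = missing_packages(["transformers", "sentencepiece"])

if missing:
    print("Not installed in this environment:", ", ".join(missing))
    print("Install with: pip install transformers[sentencepiece]")
else:
    print("Both packages are installed.")
    print("If the error persists, restart the kernel so transformers is re-imported.")
```

Run it in the same notebook that raises the ValueError, since a different notebook or a local shell may use a different environment.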

This works. Thanks.

I had created a new empty Notebook and was working on that. It wasn’t clear to me that I should only use Notebooks that appear if I click the “Open in Colab” button on the course pages.

Thanks for clarifying.