Hi, recently all my pre-trained models have been hitting this error while loading their tokenizers:
Couldn't instantiate the backend tokenizer from one of: (1) a tokenizers library serialization file, (2) a slow tokenizer instance to convert or (3) an equivalent slow tokenizer class to instantiate and convert. You need to have sentencepiece installed to convert a slow tokenizer to a fast one.
I tried `pip install sentencepiece`, but that did not solve the problem. Does anyone know a solution? (I am working on Google Colab.)
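For reference, this is a minimal sketch of the kind of code that fails for me; the checkpoint name is just an example, and any model whose slow tokenizer is sentencepiece-based should trigger the same error:

```python
from transformers import AutoTokenizer

# Example checkpoint with a sentencepiece-based slow tokenizer
# (e.g. XLM-RoBERTa, T5, mBART). Without sentencepiece installed,
# this raises the "Couldn't instantiate the backend tokenizer" error.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
```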
Note: In my humble opinion, changing such important things so quickly can cause serious problems. All my students (I teach DL courses) and clients are stuck on my notebooks. I can understand code becoming outdated after a year, but not after just two months. This creates a lot of maintenance work on my side!
Hi @denocris, I am facing the same problem. I am a student and I depend on this for my presentation tomorrow. I am also using Google Colab. Can you please explain step by step how to fix this sentencepiece problem? Thanks in advance.
Sorry @Ogayo, I have only just read this. I am happy you solved it. It is quite annoying to have these kinds of issues appear from one day to the next. I had this error during a live presentation; a couple of days earlier the notebook was working fine!
Hi, I've faced the same issue while integrating it into a calculator-based website. Installing transformers together with sentencepiece and then restarting the runtime worked for me. Try this:
`pip install transformers sentencepiece`
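After restarting the runtime, a quick sanity check like the following should confirm the fix; the checkpoint name here is just a placeholder, so substitute whichever model was failing for you:

```python
from transformers import AutoTokenizer

# Placeholder checkpoint; its slow tokenizer is sentencepiece-based,
# so loading it verifies that the conversion to a fast tokenizer works.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
print(tokenizer("Hello world").input_ids)
```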