Importing tokenizers version >0.10.3 fails due to openssl

So, the only tokenizers version I can install is 0.10.3 or lower. Any higher version (0.11 or above) runs into a libssl error (specifically, libssl.so.3 does not exist). As far as I understand, this is related to OpenSSL 1.1.1 being installed instead of OpenSSL 3. However, some packages installed by Hugging Face (e.g. TensorFlow) require OpenSSL 1.1.1 or lower, meaning that I cannot install OpenSSL 3.

Has anybody run into similar problems and found a solution?


For me, the fix was installing tokenizers with pip instead of conda.
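A sketch of that reinstall, assuming pip targets the same conda environment that currently holds the broken build (run `which pip` first to confirm):

```shell
# Remove the conda build, which links against conda's OpenSSL
conda remove tokenizers
# Reinstall from PyPI; the manylinux wheel bundles its own native dependencies
pip install tokenizers
```

The reason this can help is that the PyPI manylinux wheel is compiled to be self-contained, whereas the conda package expects the libssl shipped in the conda environment.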

I’m having this exact problem, though in my case it’s unable to locate libssl.so.10. If I create a symlink as described in this Ask Ubuntu page, I get another error:

File ~/anaconda3/lib/python3.9/site-packages/tokenizers/__init__.py:79, in <module>
     75     MERGED_WITH_NEXT = "merged_with_next"
     76     CONTIGUOUS = "contiguous"
---> 79 from .tokenizers import (
     80     Tokenizer,
     81     Encoding,
     82     AddedToken,
     83     Regex,
     84     NormalizedString,
     85     PreTokenizedString,
     86     Token,
     87 )
     88 from .tokenizers import decoders
     89 from .tokenizers import models

ImportError: /home/user/anaconda3/lib/python3.9/lib-dynload/../../libssl.so.1.1: version `libssl.so.10' not found (required by /home/user/anaconda3/lib/python3.9/site-packages/tokenizers/tokenizers.cpython-39-x86_64-linux-gnu.so)
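To see which shared libraries the compiled tokenizers extension actually fails to resolve, you can run `ldd` on the `.so` file. A minimal sketch (the anaconda path in the comment is illustrative; adjust it to your own environment):

```python
import subprocess

def missing_shared_libs(so_path: str) -> list[str]:
    """Return the shared-library dependencies of so_path that the
    dynamic linker cannot resolve (lines ldd marks as 'not found')."""
    out = subprocess.run(["ldd", so_path], capture_output=True, text=True)
    return [line.split()[0] for line in out.stdout.splitlines()
            if "not found" in line]

# Example (hypothetical path, matching the traceback above):
# print(missing_shared_libs(
#     "/home/user/anaconda3/lib/python3.9/site-packages/tokenizers/"
#     "tokenizers.cpython-39-x86_64-linux-gnu.so"))
```

If `libssl.so.10` (or `libssl.so.3`) shows up in the output, the wheel was built against an OpenSSL version your environment does not provide.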

For those who are looking for something that works: I just downgraded transformers to 4.16.2, which is the latest version that supports tokenizers<=0.10.3.

E.g. with conda:

conda install -c huggingface transformers==4.14.1 tokenizers==0.10.3
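After pinning, it is worth confirming that the downgrade actually took effect in the active environment. A small sketch using the standard library (the asserted version strings are just the ones from this thread):

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

# Hypothetical check that the pins from the conda command above are in place:
# assert installed_version("tokenizers") == "0.10.3"
# print(installed_version("transformers"))
```

This avoids importing the packages themselves, which matters here because the whole problem is that `import tokenizers` crashes before you can read `tokenizers.__version__`.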