Hello, I am getting the following error when trying to load the tokenizer for a CamemBERT model:
File "/gpfs7kw/linkhome/rech/genlig01/umg16uw/test/MODULE_FMC/scriptTraitements/classifying.py", line 323, in <module>
ori, pro, name, tokens, t = preprare_data(test_file, m)
File "/gpfs7kw/linkhome/rech/genlig01/umg16uw/test/MODULE_FMC/scriptTraitements/classifying.py", line 93, in preprare_data
tokenizer = CamembertTokenizer.from_pretrained(path_to_tokenizer_files)
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1719, in from_pretrained
return cls._from_pretrained(
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1792, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/transformers/models/camembert/tokenization_camembert.py", line 145, in __init__
self.sp_model.Load(str(vocab_file))
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/sentencepiece/__init__.py", line 367, in Load
return self.LoadFromFile(model_file)
File "/linkhome/rech/genlig01/umg16uw/.conda/envs/bert/lib/python3.9/site-packages/sentencepiece/__init__.py", line 171, in LoadFromFile
return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: /home/conda/feedstock_root/build_artifacts/sentencepiece_1612846325144/work/src/sentencepiece_processor.cc(848) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
I do not know how to solve this. Any help would be appreciated.