AutoTokenizer.from_pretrained() suddenly raises an error

OK since this was an EnvironmentError I checked everything and I think I have found the culprit.
In my bashrc, I had export HF_HUB_ENABLE_HF_TRANSFER=1 set which means the problem might have something to do with an inconsistency with the hf-transfer package. Reading Hugging Face’s Environment Variable documentation gave the clue about the possibility of such incidents and undefined behavior

HF_HUB_ENABLE_HF_TRANSFER

Set to True to download files from the Hub using hf_transfer. It’s a Rust-based package that enables faster download (up to x2 speed-up). Be aware that this is still experimental so it might cause issues in your workflow. In particular, it does not support features such as progress bars, resume download, proxies or error handling.

Note: hf_transfer has to be installed separately from Pypi.

I have forced a reinstall and upgrade through pip and apparently that resolved the issues with both supercomputer and data center machines which had problems calling the AutoTokenizer.from_pretrained().

1 Like