To avoid re-downloading the models every time my Docker container starts, I want to download them once while building the Docker image. I wrote a small script that runs the following during the build:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from sentence_transformers import SentenceTransformer

AutoTokenizer.from_pretrained(configs.get("models_names.tockenizer"))
SentenceTransformer(configs.get("models_names.sentence_embedding"))
AutoModelForSeq2SeqLM.from_pretrained(configs.get("models_names.paraphraser"))
```
While this works in theory, it breaks in practice: the models are not just downloaded, they are also loaded into memory, which raises an out-of-memory error on the build machine. I then read about
snapshot_download (ref) and replaced my script with
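The replacement presumably looked something like the sketch below. The repo ids here are tiny public placeholder models, standing in for the values returned by the configs.get(...) calls above:

```python
import os
from huggingface_hub import snapshot_download

# Placeholder repo ids (small public test models); the real script would pass
# configs.get("models_names.sentence_embedding") and
# configs.get("models_names.paraphraser") here instead.
emb_path = snapshot_download("sshleifer/tiny-gpt2")
par_path = snapshot_download("hf-internal-testing/tiny-random-t5")

# snapshot_download returns the local directory inside the Hub cache
# where the repository files were stored.
print(emb_path, par_path)
```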
While these two lines do download the same files, transformers fails to load the models from that cache and attempts to download them again whenever a from_pretrained call is made. This remains the case even if I explicitly set TRANSFORMERS_CACHE to point to the Hugging Face Hub cache directory.
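Setting the variable looks roughly like this. A minimal sketch: the path is an assumption (the default Hub cache location on Linux), and the actual directory may differ:

```python
import os

# Assumed default Hugging Face Hub cache location; replace with wherever
# snapshot_download actually stored the files.
os.environ["TRANSFORMERS_CACHE"] = os.path.expanduser("~/.cache/huggingface/hub")

# Note: this must run before `import transformers`, since transformers reads
# the variable once at import time.
```

Newer releases of transformers and huggingface_hub prefer HF_HOME or HF_HUB_CACHE; TRANSFORMERS_CACHE is deprecated there, which may also matter here.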
I noticed that the cache layout generated by snapshot_download differs from the one transformers produces via from_pretrained. Is that why it’s not working?
In general, what is the best way to pre-download the models during the build phase?