Manually Downloading Models in docker build with snapshot_download

mostafa-samir · June 26, 2022, 6:20pm

Hi,

To avoid re-downloading the models every time my Docker container starts, I want to download them while building the Docker image. I wrote a small script that runs the following during the build:

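    from sentence_transformers import SentenceTransformer
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    AutoTokenizer.from_pretrained(configs.get("models_names.tockenizer"))
    SentenceTransformer(configs.get("models_names.sentence_embedding"))
    AutoModelForSeq2SeqLM.from_pretrained(configs.get("models_names.paraphraser"))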

While this works in theory, it breaks in practice because the models are not just downloaded but also loaded into memory, which raises an out-of-memory error on the build machine. I then read about snapshot_download (ref) and replaced my script with:

    from huggingface_hub import snapshot_download

    snapshot_download(configs.get("models_names.tockenizer"))
    snapshot_download(configs.get("models_names.sentence_embedding"))

While these two lines do download the same files, transformers is not able to load the models from them and attempts to download them again whenever a from_pretrained call is made. This remains the case even if I explicitly set TRANSFORMERS_CACHE to point to the cache directory of the Hugging Face hub.

I noticed that the structure of the caches generated by snapshot_download is different from the cache structure of transformers using from_pretrained. Is that why it’s not working?
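
One quick way to confirm the difference is to list the files under each cache directory and compare the results. The sketch below uses only the standard library; the function names are hypothetical, and the two directories you pass in are whatever paths your own setup uses for the two caches.

```python
import os
from typing import Dict, List


def list_cache_files(root: str) -> List[str]:
    """Return every file under `root` as a sorted path relative to `root`."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return sorted(found)


def compare_caches(dir_a: str, dir_b: str) -> Dict[str, List[str]]:
    """Report which relative file paths exist in one cache, the other, or both."""
    a = set(list_cache_files(dir_a))
    b = set(list_cache_files(dir_b))
    return {
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        "both": sorted(a & b),
    }
```

If the same model ends up under different relative paths or file names in the two listings, a loader that expects one layout will not find files written in the other.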

In general, what is the best way to pre-download the models during the build phase?

Thanks.

vblagoje · July 27, 2022, 8:41am

@mostafa-samir I worked on caching models in docker images, and what I think you are missing in your approach is committing the cached models to a new image. See How to Commit Changes to a Docker Image (With Example) for an example. HTH, Vladimir

lefnire · December 5, 2022, 2:14am

@mostafa-samir Clone the model with git-lfs in the Dockerfile as a separate build stage. The COPY --from in the next stage discards the intermediate steps from the prior stage. Here’s mine for an AWS Lambda function:

    FROM public.ecr.aws/lambda/python:3.9 AS model
    RUN curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.rpm.sh | bash
    RUN yum install git-lfs -y
    RUN git lfs install
    RUN git clone https://huggingface.co/ccdv/lsg-bart-base-4096-wcep /tmp/model
    RUN rm -rf /tmp/model/.git

    FROM public.ecr.aws/lambda/python:3.9
    ARG FUNCTION_DIR="/var/task"
    RUN mkdir -p ${FUNCTION_DIR}
    COPY summarize.py ${FUNCTION_DIR}
    COPY --from=model /tmp/model ${FUNCTION_DIR}/model
    RUN pip install --no-cache-dir transformers[torch]==4.21.2
    CMD [ "summarize.main" ]
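
A Python-only variant of the same idea, for anyone who would rather not install git-lfs in the image: call snapshot_download during the build, keep the directory it returns, and load from that directory at run time instead of from a repo id. This is a minimal sketch, not a tested recipe. The helper name `predownload` and its `download_fn` parameter are hypothetical (the parameter exists only so the helper can be tried without network access), and the exact on-disk layout depends on the installed huggingface_hub version.

```python
from typing import Callable, Dict, Iterable, Optional


def predownload(
    repo_ids: Iterable[str],
    cache_dir: str,
    download_fn: Optional[Callable[..., str]] = None,
) -> Dict[str, str]:
    """Download each repo without loading it; return {repo_id: local folder}."""
    if download_fn is None:
        # Imported lazily so the helper can be exercised without the library.
        from huggingface_hub import snapshot_download

        download_fn = snapshot_download

    paths = {}
    for repo_id in repo_ids:
        # snapshot_download returns the folder that holds the downloaded files.
        paths[repo_id] = str(download_fn(repo_id, cache_dir=cache_dir))
    return paths
```

During the build this could be run as `predownload(["ccdv/lsg-bart-base-4096-wcep"], cache_dir="/opt/models")`, where `/opt/models` is an assumed location. At run time, passing the returned folder to from_pretrained makes transformers read from that directory rather than resolve a repo id, and nothing is loaded into memory during the build itself.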
