I’m using some A10G instances to run a 20GB model in a private Space, and I’ve set the Space to shut down after 15 minutes of inactivity to save money.
Each cold start, though, takes a few minutes because all the model weights are re-downloaded from the Hub, which is a pain to wait for and, I’m sure, an annoying amount of bandwidth for Hugging Face.
Is there a way to cache these model weights (or something similar), or a recommended way to load/store them so that the Space doesn’t have to re-download them each time?
A minimal example, using the transformers CLI:
FROM python:3.8-slim-buster
# Set up a new user named "user" with user ID 1000
RUN useradd -m -u 1000 user
# Switch to the "user" user
USER user
# Set home to the user's home directory
ENV HOME=/home/user \
    PATH=/home/user/.local/bin:$PATH
# Create the app directory as "user" so that it is owned by that user
RUN mkdir -p $HOME/app
# Set the working directory to the app directory under the user's home
WORKDIR $HOME/app
# Install transformers, which provides the transformers-cli command
RUN pip install --no-cache-dir transformers
# Download model weights
RUN ["transformers-cli", "download", "distilbert-base-cased-distilled-squad"]
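One possible variation on the example above, as a sketch only: pin the cache location explicitly so the build-time download and the runtime load resolve to the same directory, then ask the libraries to stay offline at startup. `HF_HOME` and `HF_HUB_OFFLINE` are environment variables read by the Hugging Face libraries; the path shown and the placement of these lines are assumptions, and whether this actually avoids the re-download on a cold start is exactly the open question here.

```dockerfile
# Assumption: these lines go after the ENV HOME/PATH block and before the
# download step. The path below is the usual default, made explicit so the
# build-time download and the runtime load agree on where the cache lives.
ENV HF_HOME=/home/user/.cache/huggingface

# ... the existing "transformers-cli download" step goes here ...

# Assumption: the weights are now part of the image layers, so the app can be
# told to use only the local cache instead of contacting the Hub at startup.
ENV HF_HUB_OFFLINE=1
```

If the app still fetches weights at startup with these set, that would suggest it is looking in a different cache directory than the one populated at build time.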