My server’s proxy does not allow me to reach Hugging Face, so I downloaded the Mistral 7B weights from GitHub on another computer, sftp’d the tarball over to the server, and untarred the contents,
$ tar -tvf mistral-7B-Instruct-v0.3.tar
-rw-rw---- nobody/nogroup 14496078512 2024-05-09 10:47 consolidated.safetensors
-rwxrwxrwx nobody/nogroup 202 2024-05-20 07:09 params.json
-rwxrwxrwx nobody/nogroup 587404 2024-05-20 07:09 tokenizer.model.v3
into $HF_HOME/hub/models--mistralai-Mistral-7B-Instruct-v0.3/. However, when I run
docker run -d \
  --name=tgi-mistral-7b \
  --env HF_HUB_OFFLINE=1 \
  --env HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN \
  --env http_proxy=$http_proxy \
  --env https_proxy=$https_proxy \
  --env MAX_BATCH_TOTAL_TOKENS=32000 \
  --env MAX_BATCH_PREFILL_TOKENS=16000 \
  --env MAX_TOTAL_TOKENS=32000 \
  --gpus all \
  --shm-size 1g \
  -p 8080:80 \
  -v $volume:/data \
  artifactory.my_company.com/ghcr.io/huggingface/text-generation-inference:1.4.5 \
  --model-id mistralai/Mistral-7B-Instruct-v0.3
I get:
huggingface_hub.utils._errors.EntryNotFoundError: No .bin weights found for model mistralai/Mistral-7B-Instruct-v0.3 and revision None.
How do I place/structure these weights/files so that TGI can reach them?
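For reference, my current guess at the layout huggingface_hub (and therefore TGI) expects inside the cache is the refs/snapshots convention sketched below. This is from memory, not from a working setup, and <commit-hash> is a placeholder, not a real revision:

```shell
# My guess at the hub cache layout (directory names follow the
# huggingface_hub convention as I understand it; <commit-hash> is a
# placeholder, not an actual revision hash):
HF_HOME=${HF_HOME:-$HOME/.cache/huggingface}
model_dir=$HF_HOME/hub/models--mistralai--Mistral-7B-Instruct-v0.3

mkdir -p "$model_dir/snapshots/<commit-hash>" "$model_dir/refs"
printf '%s' '<commit-hash>' > "$model_dir/refs/main"

# The untarred files would then live under the snapshot directory:
#   $model_dir/snapshots/<commit-hash>/consolidated.safetensors
#   $model_dir/snapshots/<commit-hash>/params.json
#   $model_dir/snapshots/<commit-hash>/tokenizer.model.v3
```

Is that the right structure, or does TGI expect the files somewhere else (e.g. directly under the model directory)?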