How can I load a locally stored model with TGI and Docker?

Hello!

I’m working on a project that requires me to deploy a locally stored model on an air-gapped server. I’m using the TGI Docker container (with podman). I’m used to Ollama, so I initially attempted to load the local model from a GGUF file, but it looks like that’s not how TGI works.

The TGI documentation simply says that the --model-id flag should point to the directory where a model was saved using the save_pretrained(...) method. But what does that format actually look like?

I just want to use an off-the-shelf pretrained model, in this case Mixtral 8x7B.

Thanks!

Hi. I was looking for the same thing and came across your post.

According to this article, you can load locally saved models by mounting the host directory holding the model into the container (using the -v argument).
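If I understand it correctly, the trick is that --model-id should be the path as seen from *inside* the container, not the host path. Here is the rough command I’m planning to try (untested; the host path, port, and image tag are my own placeholders, not from the docs):

```bash
# Mount the host model directory into the container at /data/mixtral, then
# point --model-id at the in-container path. /srv/models/mixtral and the
# ":latest" tag are placeholders. TGI's docs recommend --shm-size 1g for NCCL.
# With podman + NVIDIA CDI, GPUs are passed with --device nvidia.com/gpu=all;
# plain docker would use --gpus all instead.
podman run --rm --shm-size 1g -p 8080:80 \
  --device nvidia.com/gpu=all \
  -v /srv/models/mixtral:/data/mixtral \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id /data/mixtral

# Quick smoke test against TGI's /generate endpoint once the server is up:
curl 127.0.0.1:8080/generate -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "Hello", "parameters": {"max_new_tokens": 20}}'
```

One more thing for your air-gapped case: the TGI image itself also has to be available offline, e.g. saved with podman save on a connected machine and imported with podman load on the server.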

I will test the solution and update you if it works.

Yes, that’s explained here (at the bottom): Non-core Model Serving
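To answer the format question: save_pretrained() just writes an ordinary Transformers model directory — config.json, the weights (usually sharded *.safetensors files plus an index file), and the tokenizer files. A snapshot of the Hub repo has essentially the same layout, so for an off-the-shelf model you can download it directly on a connected machine and carry the directory over. A rough sketch (the repo id assumes the Instruct variant of Mixtral 8x7B, and the destination path is made up, so adjust both):

```bash
# On an internet-connected machine: fetch the full model snapshot into a
# plain local directory.
pip install -U "huggingface_hub[cli]"
huggingface-cli download mistralai/Mixtral-8x7B-Instruct-v0.1 \
  --local-dir /srv/models/mixtral

# The directory now has the same layout save_pretrained() produces:
# config.json, model-*.safetensors (+ index), tokenizer files, etc.
# Copy it to the air-gapped server and point --model-id at that path.
ls /srv/models/mixtral
```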