I’m working on a project that requires me to deploy a locally stored model on an air-gapped server. I’m using the TGI Docker container (with podman). I’m used to Ollama, so I initially attempted to load the local model from a GGUF file, but it looks like TGI doesn’t work that way.
The TGI documentation simply says the --model-id flag should point to the directory where a model was saved using the save_pretrained(...) method. But what exactly is that format?
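For context, here is roughly how I understand the docs, sketched in Python. The repo ID and output path are just placeholders I picked, and I haven’t been able to verify this against TGI on the air-gapped box yet:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# On a machine with internet access; the directory would then be
# copied over to the air-gapped server.
repo_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # placeholder repo ID
out_dir = "/models/mixtral-8x7b"                  # placeholder local path

# Pull the model and tokenizer from the Hub...
model = AutoModelForCausalLM.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# ...and write them out in the save_pretrained layout that
# (as I read it) TGI's --model-id is supposed to accept.
model.save_pretrained(out_dir)
tokenizer.save_pretrained(out_dir)
```

Is that the intended workflow, or is loading the whole model just to re-save it unnecessary?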
I just want to use an off-the-shelf pretrained model, in this case Mixtral 8x7B.
Thank you for the guidance! My remaining question is: what should the local data directory contain, in terms of weights format? Should it just be a copy of the HF repo for a given model distribution?
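In case it clarifies what I’m asking: my current plan is to mirror the repo on a connected machine and copy it across, roughly like the sketch below (repo ID and destination path are placeholders, and I don’t know yet whether the result is exactly what TGI expects):

```python
from huggingface_hub import snapshot_download

# On an internet-connected machine: download the full model repo
# as-is, then transfer the directory to the air-gapped server.
snapshot_download(
    repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",  # placeholder repo ID
    local_dir="/models/mixtral-8x7b",                # placeholder destination
)
```

If a plain copy of the repo like this is sufficient, that answers my question about the weights format.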