How can I load a locally stored model with TGI and Docker?

Hello!

I’m working on a project that requires me to deploy a locally stored model on an air-gapped server. I’m using the TGI Docker container (with podman). I’m used to Ollama, so I initially attempted to load the local model from a GGUF file, but it looks like that’s not how TGI works.

The TGI documentation simply says that the --model-id flag should point to the directory where a model was saved using the save_pretrained(...) method. But what does that format actually look like?

I just want to use an off-the-shelf pretrained model, in this case Mixtral 8x7B.

Thanks!

Hi. I was looking for the same thing and came across your post.

According to this article, you can load locally saved models by mounting the host directory holding the model into the container (using the -v argument).
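If I understand it correctly, the trick is that --model-id should be the path as seen from *inside* the container, not the host path. Here is the rough command I’m planning to try (untested; the host path, port, and image tag are my own placeholders, not from the docs):

```bash
# Mount the host model directory into the container at /data/mixtral, then
# point --model-id at the in-container path. /srv/models/mixtral and the
# ":latest" tag are placeholders. TGI's docs recommend --shm-size 1g for NCCL.
# With podman + NVIDIA CDI, GPUs are passed with --device nvidia.com/gpu=all;
# plain docker would use --gpus all instead.
podman run --rm --shm-size 1g -p 8080:80 \
  --device nvidia.com/gpu=all \
  -v /srv/models/mixtral:/data/mixtral \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id /data/mixtral

# Quick smoke test against TGI's /generate endpoint once the server is up:
curl 127.0.0.1:8080/generate -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "Hello", "parameters": {"max_new_tokens": 20}}'
```

One more thing for your air-gapped case: the TGI image itself also has to be available offline, e.g. saved with podman save on a connected machine and imported with podman load on the server.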

I will test the solution and update you if it works.

Yes, that’s explained here (at the bottom): Non-core Model Serving
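To answer the format question: save_pretrained() just writes an ordinary Transformers model directory — config.json, the weights (usually sharded *.safetensors files plus an index file), and the tokenizer files. A snapshot of the Hub repo has essentially the same layout, so for an off-the-shelf model you can download it directly on a connected machine and carry the directory over. A rough sketch (the repo id assumes the Instruct variant of Mixtral 8x7B, and the destination path is made up, so adjust both):

```bash
# On an internet-connected machine: fetch the full model snapshot into a
# plain local directory.
pip install -U "huggingface_hub[cli]"
huggingface-cli download mistralai/Mixtral-8x7B-Instruct-v0.1 \
  --local-dir /srv/models/mixtral

# The directory now has the same layout save_pretrained() produces:
# config.json, model-*.safetensors (+ index), tokenizer files, etc.
# Copy it to the air-gapped server and point --model-id at that path.
ls /srv/models/mixtral
```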