Hello!
I’m working on a project that requires me to deploy a locally stored model on an air-gapped server. I’m using the TGI docker container (with podman). I’m used to Ollama, so I initially attempted to load the local model from a GGUF file, but it looks like TGI doesn’t work that way.
The TGI documentation simply says the --model-id flag should point to a directory containing a model saved with the save_pretrained(...) method. But what does that format actually look like on disk?
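For context, here’s my rough mental model of what such a directory contains — a little sanity-check script I sketched, assuming the usual Hugging Face layout of a config.json plus safetensors (or .bin) weight shards. I may well be missing required files (tokenizer, etc.), so treat this as my guess, not a spec:

```python
import glob
import os


def looks_like_pretrained_dir(path: str) -> bool:
    """Rough check: does this directory resemble save_pretrained() output?
    Assumption: it needs at least a config.json plus one or more weight
    files (*.safetensors or *.bin). This is my guess at the minimum."""
    has_config = os.path.isfile(os.path.join(path, "config.json"))
    has_weights = bool(
        glob.glob(os.path.join(path, "*.safetensors"))
        or glob.glob(os.path.join(path, "*.bin"))
    )
    return has_config and has_weights
```

My plan was to point --model-id at a directory that passes a check like this (mounted into the container) — is that roughly right, or does TGI expect more than that?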
I just want to use an off-the-shelf pretrained model, in this case Mixtral 8x7B.
Thanks!