I’m working on a project that requires me to deploy a locally stored model on an air-gapped server. I’m using the TGI Docker container (with podman). I’m used to Ollama, so I initially attempted to load the local model from a GGUF file, but it looks like TGI doesn’t work that way.
The TGI documentation simply says the --model-id flag should point to the directory where a model was saved using the save_pretrained(...) method. But what exactly is that format?
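For context, here is roughly how I understand the docs, sketched in Python. The repo ID and output path are just placeholders I picked, and I haven’t been able to verify this against TGI on the air-gapped box yet:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# On a machine with internet access; the directory would then be
# copied over to the air-gapped server.
repo_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # placeholder repo ID
out_dir = "/models/mixtral-8x7b"                  # placeholder local path

# Pull the model and tokenizer from the Hub...
model = AutoModelForCausalLM.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# ...and write them out in the save_pretrained layout that
# (as I read it) TGI's --model-id is supposed to accept.
model.save_pretrained(out_dir)
tokenizer.save_pretrained(out_dir)
```

Is that the intended workflow, or is loading the whole model just to re-save it unnecessary?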
I just want to use an off-the-shelf pretrained model, in this case Mixtral 8x7B.
Thank you for the guidance! My remaining question is: what should the local data directory contain, in terms of weights format? Should it just be a copy of the HF repo for a given model distribution?
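In case it clarifies what I’m asking: my current plan is to mirror the repo on a connected machine and copy it across, roughly like the sketch below (repo ID and destination path are placeholders, and I don’t know yet whether the result is exactly what TGI expects):

```python
from huggingface_hub import snapshot_download

# On an internet-connected machine: download the full model repo
# as-is, then transfer the directory to the air-gapped server.
snapshot_download(
    repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",  # placeholder repo ID
    local_dir="/models/mixtral-8x7b",                # placeholder destination
)
```

If a plain copy of the repo like this is sufficient, that answers my question about the weights format.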