Need help: trying to deploy, but no luck

Hello Hugging Face Support,

We are deploying vLLM in Kubernetes and trying to access the gated model mistralai/Mistral-7B-Instruct-v0.1.

  • Our account is granted access (the model page confirms this).
  • We created a fine-grained token with “Read access to contents of all public gated repos you can access”.
  • The token works perfectly with the Python client and CLI on a VM.
  • The token is correctly injected into our pod as HUGGINGFACE_HUB_TOKEN (we verified this).
  • But inside the pod, the model download fails with a 401 Unauthorized error.

We have restarted pods, updated secrets, and confirmed the environment variable is correct.

This appears to be a backend issue with token authentication for gated models in Kubernetes.

Could you please investigate or advise?

Thank you,
Ashutosh Kumar (wipro-gcp-ashu)


HUGGINGFACE_HUB_TOKEN

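There are several possible causes, but I think the variable name is the most likely one. The name currently in common use is HF_TOKEN. Note that the name in your post, HUGGINGFACE_HUB_TOKEN, also differs from the deprecated HUGGING_FACE_HUB_TOKEN listed below.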

Deprecated Variable         Replacement
HUGGINGFACE_HUB_CACHE       HF_HUB_CACHE
HUGGINGFACE_ASSETS_CACHE    HF_ASSETS_CACHE
HUGGING_FACE_HUB_TOKEN      HF_TOKEN
HUGGINGFACE_HUB_VERBOSITY   HF_HUB_VERBOSITY
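
To see which of these names is actually set inside the pod, a small check like the one below may help. It is only a sketch: the function name diagnose_token_env is made up for illustration, and it assumes that HF_TOKEN takes precedence over the deprecated HUGGING_FACE_HUB_TOKEN and that HUGGINGFACE_HUB_TOKEN (the name from the question) is not read at all. It also flags stray whitespace, which can creep in when a secret is created from a value with a trailing newline.

```python
import os
from typing import Mapping, Optional, Tuple

CURRENT_NAME = "HF_TOKEN"
DEPRECATED_NAME = "HUGGING_FACE_HUB_TOKEN"  # deprecated name from the table
LOOKALIKE_NAME = "HUGGINGFACE_HUB_TOKEN"    # name from the question; assumed not to be read


def diagnose_token_env(env: Mapping[str, str]) -> Tuple[Optional[str], str]:
    """Return (name of the variable expected to be picked up, or None) and a hint."""
    # Assumption: the current name wins over the deprecated one.
    for name in (CURRENT_NAME, DEPRECATED_NAME):
        value = env.get(name)
        if value:
            if value != value.strip():
                return name, f"{name} is set but has leading/trailing whitespace"
            if name == DEPRECATED_NAME:
                return name, f"{name} is set but deprecated; prefer {CURRENT_NAME}"
            return name, f"{name} is set"
    if env.get(LOOKALIKE_NAME):
        return None, f"only {LOOKALIKE_NAME} is set; try renaming it to {CURRENT_NAME}"
    return None, f"no token variable found; set {CURRENT_NAME}"


if __name__ == "__main__":
    _, hint = diagnose_token_env(os.environ)
    print(hint)  # prints only the hint, never the token value
```

Running this inside the container (for example through kubectl exec) prints only a hint. If it reports that only HUGGINGFACE_HUB_TOKEN is set, renaming the variable to HF_TOKEN in the pod spec is the first thing to try.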