Inference endpoint, gated repo 401 error

I’m using the Inference Endpoints website to host a fine-tuned model. Initialization fails with a message saying I don’t have authorization to access gated repos, yet I should have access.

I’ve tried Mistral and Llama fine-tuned models, and both the AWS and Google server options.

I do have an “access token” associated with my Hugging Face account so that I can pay for services like CPUs/GPUs, Inference Endpoints servers, etc.


Hi @RedFoxPanda In Inference Endpoints, you now have the ability to add an environment variable to your endpoint, which is needed if you’re deploying a fine-tuned gated model like Meta-Llama-3-8B-Instruct.

We have some additional documentation on environment variables, but the one you’d likely need is HF_TOKEN. Add HF_TOKEN as the key and your user access token as the value. User access tokens can be generated in the settings of your account.
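As a minimal sketch of what that key/value pair does: once HF_TOKEN is set in the endpoint’s environment, the model-download step inside the container can read it and present it as a Bearer credential when fetching the gated weights. The token value below is a hypothetical placeholder; in practice you set it in the endpoint UI, not in code.

```python
import os

# Hypothetical placeholder -- in Inference Endpoints you would set this
# key/value pair in the endpoint's environment-variable settings.
os.environ["HF_TOKEN"] = "hf_xxxxxxxxxxxxxxxx"

# Inside the container, tooling reads the token from the environment:
hf_token = os.environ.get("HF_TOKEN")

# ...and sends it as a Bearer credential when requesting gated repo files,
# which is what resolves the 401 on initialization.
auth_header = {"Authorization": f"Bearer {hf_token}"}
```

This is also why a token tied to billing alone isn’t enough: the endpoint needs the token at download time to prove your account has been granted access to the gated repo.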

Please let me know if you have additional questions!

This did work. I then hit an out-of-memory error, so I upgraded to a higher-tier GPU/CPU instance for the cloud machine. “Running” status is now present. Thanks.

@RedFoxPanda I’m glad to hear it! Thanks for letting me know. :hugs:

This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.