I’m using the Inference Endpoints service to host a fine-tuned model. Initialization fails with an error saying I don’t have authorization to access gated repos, even though I should have access.
Hi @RedFoxPanda! In Inference Endpoints you can now add an environment variable to your endpoint, which is needed when deploying a fine-tuned gated model such as Meta-Llama-3-8B-Instruct.
We have additional documentation on environment variables, but the one you likely need is HF_TOKEN: add HF_TOKEN as the key and your user access token as the value. User access tokens can be generated in your account settings.
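For context, here's a minimal sketch of the pattern at work: the endpoint injects HF_TOKEN into the container's environment, and code inside the container falls back to that variable when no token is passed explicitly. The `resolve_token` helper and the placeholder token value below are illustrative, not part of any Hugging Face library:

```python
import os


def resolve_token(explicit_token=None):
    """Hypothetical helper: an explicitly passed token wins,
    otherwise fall back to the HF_TOKEN environment variable."""
    return explicit_token or os.environ.get("HF_TOKEN")


# Simulate what the endpoint does when you add HF_TOKEN as a key
# with your user access token as the value (placeholder shown here).
os.environ["HF_TOKEN"] = "hf_xxx_example"

print(resolve_token())  # falls back to the env variable
```

In practice the `huggingface_hub` tooling inside the container reads HF_TOKEN automatically, which is why setting the variable on the endpoint is enough to authenticate downloads of the gated model weights.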
Please let me know if you have additional questions!
This worked. I then hit an out-of-memory error, so I upgraded to a larger GPU/CPU instance for the endpoint, and it now shows “Running” status. Thanks.