API access no longer working despite Pro subscription

It seems as if I no longer have access to the Llama 2-70b model through the API.

I have an HF Pro subscription and had been running code fine up until about 20 minutes ago. I stopped my script from sending any more queries so I could amend a parameter, then sent it out again, only to receive the following error as a response:
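For context, the kind of call that was working is a standard Inference API text-generation request. This is a minimal sketch, not my exact script; the token value is a placeholder, and `max_new_tokens` is just an illustrative parameter:

```python
# Minimal sketch of a Hugging Face Inference API text-generation request.
# The token here is a placeholder; substitute your own HF access token.

API_URL = "https://api-inference.huggingface.co/models/meta-llama/Llama-2-70b-chat-hf"

def build_request(token: str, prompt: str, max_new_tokens: int = 100):
    """Return the URL, headers, and JSON payload for an Inference API call."""
    headers = {"Authorization": f"Bearer {token}"}
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return API_URL, headers, payload

# To actually send it (requires the `requests` library and a valid token):
# import requests
# url, headers, payload = build_request("hf_xxx", "Hello, Llama!")
# print(requests.post(url, headers=headers, json=payload).json())
```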

{'error': 'The model meta-llama/Llama-2-70b-chat-hf is too large to be loaded automatically (137GB > 10GB). Please use Spaces (Spaces - Hugging Face) or Inference Endpoints (Inference Endpoints - Hugging Face).'}

I created a new API token, but the issue persists.

To reiterate, I have an active and valid Pro subscription. If I can’t use the API, what am I paying for?


I am having the exact same issue.

Edited to add:
They may have taken down Llama 2 70b from the Inference API, based on other forum answers I've read. Notably, c4ai-command-r-plus, which is larger than Llama 2 70b, is still working.

I’m having the same problem.
I contacted them and they answered me that they have “temporarily removed meta-llama/Llama-2-70b-chat-hf but it will be back to use with the Inference API soon, though no ETA just yet.”
So we just need to wait.

(Also, I think that this holds for any other Llama2 version since they are all unavailable at the moment)

Hi Karl,

Thanks for the response; it does seem that Llama 2 is no longer supported. Would you mind sharing your code for the API query to the Cohere model?

For some reason my code is failing to find the correct tokenizer…
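Not Karl's actual code, but one way to query c4ai-command-r-plus without resolving a tokenizer locally is to post raw text to the Inference API, which tokenizes server-side. A hedged sketch, assuming the standard Inference API endpoint and a placeholder token:

```python
# Hypothetical sketch: querying CohereForAI/c4ai-command-r-plus via the raw
# Inference API. Tokenization happens server-side, so no local tokenizer
# lookup is needed.

MODEL_ID = "CohereForAI/c4ai-command-r-plus"
API_URL = f"https://api-inference.huggingface.co/models/{MODEL_ID}"

def build_query(token: str, prompt: str, max_new_tokens: int = 200):
    """Return the headers and payload for a server-side-tokenized generation call."""
    headers = {"Authorization": f"Bearer {token}"}
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return headers, payload

# To send (needs the `requests` library and a real token):
# import requests
# headers, payload = build_query("hf_xxx", "Write a haiku about GPUs.")
# print(requests.post(API_URL, headers=headers, json=payload).json())
```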


I guess it's working now. I had the same issue yesterday, but now it seems to work fine.


Sorry, folks, meta-llama/Llama-2-70b-chat-hf is back online after temporarily going down.

This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.