meta-llama/Meta-Llama-3-70B-Instruct is not available as a serverless API

Hi,
The serverless API documentation states that llama3-70B is available to Pro users via serverless inference; however, I get an error whenever I try to use it in my code. My credentials are fine, since other models work. The Inference API widget on the model's web page also fails with the following error:
The model meta-llama/Meta-Llama-3-70B-Instruct is too large to be loaded automatically (141GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints).

What can be the problem?
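
For reference, here's a minimal sketch of the kind of call that fails for me (assuming the @huggingface/inference JS client; HF_TOKEN is just an illustrative name for the environment variable holding my Pro token):

  import { HfInference } from "@huggingface/inference";

  // Client authenticated with a Pro account token (illustrative env var name)
  const hf = new HfInference(process.env.HF_TOKEN);

  // This call errors out for this model, while other models work fine
  const response = await hf.chatCompletion({
    model: "meta-llama/Meta-Llama-3-70B-Instruct",
    messages: [{ role: "user", content: "Hello" }],
    max_tokens: 15,
  });

  console.log(response.choices[0].message.content);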

If you try some other Pro-only model, whether it works will tell us if this is a repo-specific problem. Non-Pro models aren't a useful test, because they work even if the token is sent incorrectly. Are there any other good Pro-only models you could try?

The problem is only with this specific model. Other Pro-only models work just fine (I tried llama3-8B and llama3.1-70B).

Thank you. In that case, it's definitely a repo configuration or content issue, or an HF bug, and opening a Discussion on the repo is the fastest way to get the developers and repo admins to notice. Everyone in the organization is notified, and the notification icon on their home screen turns yellow, so everyone will see it immediately, except for those who don't read notifications in the first place.

And if the problem turns out not to be on their side, they can take it up with HF themselves to get it resolved.

Thanks! I’ll open a discussion.

No, it hasn’t been working since yesterday evening, not even Llama 3.1 70B:

  // `hf` is an HfInference client from @huggingface/inference,
  // and `fullConversation` is the accumulated chat history.
  const response = await hf.chatCompletion({
    model: "meta-llama/Llama-3.1-70B", // a valid LLaMA model ID
    // model: "mistralai/Mixtral-8x7B-Instruct-v0.1", // alternative model
    messages: fullConversation,
    max_tokens: 15, // chatCompletion expects max_tokens, not max_new_tokens
    temperature: 0.9,
  });

  const llamaMessage = {
    role: "assistant", // the model's reply should carry the assistant role
    content: response.choices[0].message.content.trim(),
  };

"it hasn’t been working since yesterday evening"

That timing… maybe they made some changes to make this new HF feature work?

Any update?

They released this just in time, and it includes Llama-3.1-70B. Maybe something went wrong in the coordination for it, or maybe it's unrelated.

Still, I haven't seen any response from the developers on the Discussion so far.

Thanks for reporting; I've pinged the team.

Hi all! This should be fixed now. Sorry for the inconvenience.
