Model does not exist, Inference API doesn't work

Hello!

I have started developing LLM-style models, and honestly, things were going well: I had this one working a couple of weeks ago, and my friends tried it successfully.

For some reason, I can now use neither my Space nor the inference provider; I get the following error: “Server amusktweewt/tiny-model-500M-chat-v2 does not seem to support chat completion. Error: Model amusktweewt/tiny-model-500M-chat-v2 does not exist”.
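That message actually bundles two different failure modes: the repo id not resolving at all versus the repo existing but lacking chat-completion support. A quick way to tell them apart from the error text alone (a hypothetical helper, not part of huggingface_hub; the substrings match the exact wording reported above):

```python
# Hypothetical helper: classify the inference error text from the thread.
# The matched substrings are taken verbatim from the reported message.
def classify_inference_error(msg: str) -> str:
    if "does not exist" in msg:
        # The Hub (or the provider) cannot resolve the repo id at all.
        return "model-not-found"
    if "does not seem to support chat completion" in msg:
        # The repo resolves, but is not registered for the chat task.
        return "task-not-supported"
    return "unknown"

err = ("Server amusktweewt/tiny-model-500M-chat-v2 does not seem to support "
       "chat completion. Error: Model amusktweewt/tiny-model-500M-chat-v2 "
       "does not exist")
print(classify_inference_error(err))  # both substrings appear; "does not exist" wins
```

In this thread's case both substrings appear, and the trailing "does not exist" is the root cause, which is why the fix had to happen on the Hub side rather than in the client code.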

I don’t know what is happening, because I changed nothing: the repo has literally been frozen for about a month, and during that time it worked well. The model also works fine locally with a pipeline.

Thank you all for your time!


It seems like a token issue, or the service is under maintenance. For example:

from huggingface_hub import InferenceClient

HF_TOKEN = "hf_my_valid_pro_token"
# HF_TOKEN = False  # using this instead fails with a 503 error

client = InferenceClient(
    provider="hf-inference",
    api_key=HF_TOKEN,
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

completion = client.chat.completions.create(
    model="amusktweewt/tiny-model-500M-chat-v2",
    messages=messages,
    max_tokens=500,
)

print(completion.choices[0].message)
# ChatCompletionOutputMessage(role='assistant', content='OUP for France - reduced price comparison board (BUFF) is the payoff for carbon emissions.', tool_calls=None)

Hi! We’re taking a closer look into this and I’ll update you soon. Thanks for reporting!


Hi @amusktweewt thanks again for reporting. This is now fixed! Let us know if you continue running into issues.


Thanks! It works perfectly now, both the Space and the Inference API.


This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.