What is 'Model is currently loading'?

Hi.
I’m a beginner at NLP.
I’m going to summarize sentences using the T5 model through the Inference API.
The message 'Model is currently loading' keeps popping up and the request does not proceed.
Can you tell me what this error is?

Also,
I want to summarize more than 5,000 characters into 1,000 to 2,000 characters.
How should I write the parameters?

@Doogie
Hello :hugs:
The Inference API loads models on demand, so if it’s your first time using the model in a while, the API will load it first; you can then send the request again in a couple of seconds.
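
For example, a minimal retry loop (just a sketch, assuming the plain requests library; the model ID and token below are placeholders):

```python
import time
import requests

API_URL = "https://api-inference.huggingface.co/models/t5-base"  # placeholder model
HEADERS = {"Authorization": "Bearer YOUR_HF_TOKEN"}  # placeholder token

def summarize(text, retries=5, delay=10):
    for _ in range(retries):
        response = requests.post(API_URL, headers=HEADERS, json={"inputs": text})
        if response.status_code == 503:
            # Model is still loading on the server; wait and retry.
            time.sleep(delay)
            continue
        response.raise_for_status()
        return response.json()
    raise RuntimeError("Model did not load within the retry window")
```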

As per the Detailed parameters — API Inference documentation, you can use the wait_for_model option to wait for the response instead of having to make multiple requests.

Thank you for answering.

I’d like this model to always be ready so that it responds immediately.
What should I do to achieve that?

Can I pin the model?

Thank you for answering!

I’d like this model to always be ready.
Does the wait_for_model parameter keep the model ready?

wait_for_model is documented in the link shared above.

If false, you will get a 503 error while the model is loading. If true, your request will wait for the response, which might take a while as the model loads. You can also pin models for instant loading (see Hugging Face – Pricing).
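
To illustrate (again a sketch, assuming the requests library; the model ID and token are placeholders), wait_for_model makes the call block until the model is loaded:

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/t5-base"  # placeholder
HEADERS = {"Authorization": "Bearer YOUR_HF_TOKEN"}  # placeholder

payload = {
    "inputs": "Long text to summarize ...",
    # Without this, a loading model returns a 503 error immediately.
    "options": {"wait_for_model": True},
}

response = requests.post(API_URL, headers=HEADERS, json=payload)
print(response.status_code)  # 200 once the model has loaded
print(response.json())
```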

I get a message that wait_for_model is no longer valid:

{'inputs': {'past_user_inputs': [], 'generated_responses': [], 'text': 'yo'}, 'parameters': {'min_length': 1, 'max_length': 500, 'repetition_penalty': 50.0, 'temperature': 50.0, 'use_cache': True, 'wait_for_model': True}}
{"error": "The following model_kwargs are not used by the model: ['wait_for_model'] (note: typos in the generate arguments will also show up in this list)"}

Hi. With "togethercomputer/GPT-NeoXT-Chat-Base-20B" I’m using the "wait_for_model" parameter set to true, but I still get the "Model is currently loading" message. Is it because the model is too big?

I’ve set the wait_for_model parameter to True in the payload in the same way as @deseipel, and it doesn’t work for me either. I don’t get a specific error about the request; I just get the usual 503 error in response: "Model is currently loading".

I have finally managed to use the flag: according to the documentation (https://huggingface.co/docs/api-inference/detailed_parameters), the "parameters" dictionary actually needs to be called the "options" dictionary.

However, after overcoming that error, I duly waited for a response and got a 504 error instead: Gateway Timeout. Is HF down today?

It’s not that the 'parameters' dictionary needs to be called 'options'; rather, there is a separate key alongside 'parameters' called 'options', and it is that dict which takes the 'use_cache' and 'wait_for_model' keys.
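
So the payload from earlier in the thread would be rewritten like this (same values as in that post, only the keys rearranged):

```python
payload = {
    "inputs": {
        "past_user_inputs": [],
        "generated_responses": [],
        "text": "yo",
    },
    # Generation arguments belong under "parameters" ...
    "parameters": {
        "min_length": 1,
        "max_length": 500,
        "repetition_penalty": 50.0,
        "temperature": 50.0,
    },
    # ... while API flags belong under the separate "options" key.
    "options": {
        "use_cache": True,
        "wait_for_model": True,
    },
}
```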