Error executing pinned inference model

Hello @julien-c!

Last week, I uploaded a private Text Generation model to my Hugging Face account…

And then I enabled pinning on that model in our account here:

But when I try to execute an API call on this model, I always get an error message.

The API call looks like this…

curl -X POST \
     -H "Authorization: Bearer <<REDACTED>>" \
     -H "Content-Type: application/json" \
     -d '{
           "inputs": "Once upon a time, there was a grumpy old toad who",
           "parameters": {"max_length": 500}
         }' \
     https://api-inference.huggingface.co/models/shaxpir/prosecraft_linear_43195

And the error is:

{"error":"We waited for too long for model shaxpir/prosecraft_linear_43195 to load. Please retry later or contact us. For very large models, this might be expected."}

I’ve been trying repeatedly, and waiting long intervals, but I still get this error every time.

It is quite a large model, but there are other larger models on public model cards that don’t seem to suffer from this problem. And I don’t see any documentation about model-size limitations for pinned private models (on CPU or GPU). Is there any guidance on that topic? Or is there anything that the support team can do to help me get un-stuck?

(Also, the “Pricing” page says that paid “Lab” plans come with email support, but the email address doesn’t seem to be published anywhere… I tried emailing but got no response for 9 days, and the obvious address bounced back to me… Can you let me know where to send support emails?)

Thank you so much!!

@Narsil for guidance



Thanks for this report. These are large models and are not deployed automatically.
There was indeed a bug in the pinning system that prevented you from seeing a nice error message.

I tried to load it manually for you, but it seems to be missing its tokenizer, so the API cannot work out of the box. Do you think you could add the missing files so it can work?

Also, for such a large model, and given the max_length you’re expecting, a CPU is unlikely to reply in a timely fashion; a GPU will be required (and most likely FP16 too). Can you confirm whether your model is FP16-enabled? By default we don’t make that assumption, to avoid returning incorrect results.

Thank you very much!


@Narsil I’ve added the tokenizer files, could you try manually loading it again please? Re: FP16, yep, the model can be used in FP16.

(I’m helping @benjismith on this 🙂)

Okay @Narsil, thank you for your help! We updated the model with the tokenizer files, but I still can’t get the inference API to return a result. CPU pinning still results in the same error message about waiting too long. And GPU pinning seems to work in the web UI (e.g., clicking the “PIN” button and choosing “GPU”), but when I try to invoke the inference API like this:

curl -X POST \
     -H "Authorization: Bearer <<REDACTED>>" \
     -H "Content-Type: application/json" \
     -d '{
           "inputs": "Once upon a time, there was a grumpy old toad who",
           "options": {
             "wait_for_model": true,
             "use_gpu": true
           },
           "parameters": {"max_length": 10}
         }' \
     https://api-inference.huggingface.co/models/shaxpir/prosecraft_linear_43195

…I get this error:

{"error":"You are not allowed to run GPU requests, please check your subscription plan on or contact"}
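For reference, here is a minimal sketch of the same request in Python using only the standard library. The endpoint URL is taken from the model name in the error message above; the token is a placeholder, and the `query` helper is just an illustration, not an official client:

```python
import json
import urllib.request

# Placeholders -- substitute your own token before running.
API_URL = "https://api-inference.huggingface.co/models/shaxpir/prosecraft_linear_43195"
TOKEN = "<<REDACTED>>"

# The JSON body: "inputs", "options", and "parameters" are sibling keys.
payload = {
    "inputs": "Once upon a time, there was a grumpy old toad who",
    "options": {"wait_for_model": True, "use_gpu": True},
    "parameters": {"max_length": 10},
}

def query(url: str, token: str, body: dict) -> dict:
    """POST the JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# result = query(API_URL, TOKEN, payload)  # requires a valid token
```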

But I have the $200 “Lab” plan, and your pricing page says:

Pin models on CPU or GPU

Instant availability for inference - $50/mo on CPU, $200/mo on GPU

Can you help me resolve this?

(Also, I tried to cancel-and-restart my plan, thinking that might help, but I can’t restart it until the billing period ends… So if you can un-cancel my plan while you’re fixing the other problem, that’d be great!)

Thank you!!

@Narsil Any chance you can help with this today?