Inference API stopped working for my model

I have a pinned model (shaxpir/prosecraft_resumed_ft2) on the Inference API that had been working well for over a year, but it recently stopped responding. When I make a request like this:

curl -i -X POST https://api-inference.huggingface.co/models/shaxpir/prosecraft_resumed_ft2 \
     -H "Authorization: Bearer <REDACTED>" \
     -H "Content-Type: application/json" \
     -d \
     '{
          "inputs":"Once upon a time,",
          "options":{
            "use_gpu": true,
            "use_cache": false
          },
          "parameters": {
            "return_full_text": false,
            "num_return_sequences": 1,
            "temperature": 1.0,
            "top_p" : 0.9,
            "max_new_tokens": 250
          }
     }'

I get a 503 error telling me that the model is currently loading…

HTTP/2 503
date: Mon, 24 Apr 2023 19:17:42 GMT
content-type: application/json
content-length: 91
x-request-id: JcowSHvjgHSgCiowln3Zm
access-control-allow-credentials: true
vary: Origin, Access-Control-Request-Method, Access-Control-Request-Headers

{
  "error" : "Model shaxpir/prosecraft_resumed_ft2 is currently loading",
  "estimated_time" : 20.0
}

But the model never seems to fully load, and the “estimated_time” never changes.
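For reference, here's the kind of retry loop I've been using on my end, which sleeps for the `estimated_time` the 503 body suggests and tries again (a minimal sketch; the function names and `max_wait` cutoff are my own, and I haven't tried the `wait_for_model` option the API docs also mention):

```python
import json
import time
import urllib.error
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/shaxpir/prosecraft_resumed_ft2"

def loading_wait_seconds(body, default=20.0):
    """Pull the suggested retry delay out of a 503 'model is loading' body."""
    return float(body.get("estimated_time", default))

def query_with_retry(payload, token, max_wait=300.0):
    """POST to the Inference API, sleeping and retrying while the model loads."""
    data = json.dumps(payload).encode("utf-8")
    deadline = time.monotonic() + max_wait
    while True:
        req = urllib.request.Request(
            API_URL,
            data=data,
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json",
            },
        )
        try:
            with urllib.request.urlopen(req) as resp:
                return json.loads(resp.read())
        except urllib.error.HTTPError as err:
            if err.code != 503:
                raise  # some other failure; surface it
            body = json.loads(err.read())
            wait = loading_wait_seconds(body)
            if time.monotonic() + wait > deadline:
                raise TimeoutError(f"model still loading after {max_wait}s: {body}")
            time.sleep(wait)
```

With this loop the request just keeps getting the same 503 body shown above, which is why I suspect the model is never actually being scheduled.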

Can you help?