Inference Endpoints Issues

Since yesterday, we’ve been having issues with dedicated inference endpoints. We have two models continuously deployed on dedicated endpoints on an organization account.

Yesterday at 10:00 AM UTC the endpoints were suddenly paused, with the latest update attributed to “admin”. At the same time, the Endpoints WebUI displayed the error: “There was an error while fetching your endpoints.”

I tried logging out and back in, at which point I was presented with an authorization screen asking me to permit Inference Endpoints to access my (and the organization’s) Hugging Face account.

After granting access (~10:30 AM UTC) I was able to see the endpoints and manually restart them. Both the WebUI and the Endpoints API continued to work fine for the rest of yesterday.

The WebUI became inaccessible again today (seen at ~8:20 AM UTC), displaying the “There was an error while fetching your endpoints.” error for the organization’s dedicated endpoints. The issue seems to have resolved itself at ~8:50 AM UTC. This time, the endpoints were not paused and continued to work throughout the WebUI outage.

Additionally, every time I log out and log in again, I’m asked to authorize Inference Endpoints’ access to my Hugging Face account data. I’m not sure how relevant the authorization issue is to the WebUI errors; it could be just a coincidence.

Hi @nikos-ir, thanks for reporting this! We recently applied a change to our infrastructure that unfortunately impacted Endpoints. We’re very sorry for the disruption. We’ve since applied a fix and are looking at further solutions to help mitigate impacts like these in the future. Let us know if you’re still running into issues!

@meganariley You are still far from running stably; see “Inference API down? - #11 by nielsr”.

Please provide feedback to users when you are fixing things and when you think things are fixed. It is really hard to give you reproducible bug reports when the production environment changes within minutes. Do the bug fixing in a separate staging environment before pushing it to production.