Hello,
I want to deploy my model, but I always get this error after roughly 20 minutes of “deployment”:
Endpoint encountered an error.
You can try restarting it using the “retry” button above. Check logs for more details.
[Server message] Endpoint failed to start
Scheduling failure: unable to schedule
And in the logs I get this error:
Error 502 while fetching logs for "mon-modele-bricks-hiv":
Has this ever happened to anyone?
Hi @Albaninho10, thank you for reporting! We’re investigating now.
Hi @Albaninho10, thank you for waiting! This error message is related to the availability of the GPU instance at the time of deployment - it can be resolved by selecting a different instance type or region, if possible.
We’ve added a clearer version of this error message to the roadmap, though there’s no ETA just yet. Please let us know if you have any feedback about Inference Endpoints - we’re all ears!
I also wanted to mention our Model Catalog, which has ready-to-deploy models that require no additional customization, with deployment verified by Hugging Face.
Let us know if you have other questions.
I’ve seen similar issues with deployment failures related to GPU availability. From what you’re describing, it seems like the GPU instance may not be available when the model tries to deploy, which leads to the scheduling failure (and the 502 when fetching logs). One possible solution is to select a different instance type or region during deployment, so that GPU resources are actually available at scheduling time. Also, double-check whether a region-specific resource limitation might be causing the issue.
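If you’re deploying programmatically rather than through the UI, the “try another instance/region” advice can be automated. Below is a minimal sketch of a fallback loop; `deploy_with_fallback`, `fake_deploy`, and the candidate list are hypothetical names for illustration - in practice the deploy callable would wrap something like `huggingface_hub.create_inference_endpoint` with your own parameters.

```python
def deploy_with_fallback(deploy, candidates):
    """Try each (region, instance_type) pair until one schedules.

    `deploy` is any callable taking region/instance_type keyword
    arguments that raises RuntimeError on a scheduling failure and
    returns an endpoint handle on success.
    """
    errors = []
    for region, instance_type in candidates:
        try:
            return deploy(region=region, instance_type=instance_type)
        except RuntimeError as err:  # e.g. "Scheduling failure: unable to schedule"
            errors.append((region, instance_type, str(err)))
    raise RuntimeError(f"All candidates failed: {errors}")


# Example with a stub deploy function standing in for the real API call:
def fake_deploy(region, instance_type):
    if region == "us-east-1":
        raise RuntimeError("Scheduling failure: unable to schedule")
    return f"endpoint-on-{region}"


endpoint = deploy_with_fallback(
    fake_deploy,
    [("us-east-1", "nvidia-a10g"), ("eu-west-1", "nvidia-a10g")],
)
print(endpoint)  # falls back to the second region
```

This just encodes the manual workaround from this thread; whether a given region/instance pair has capacity still depends on availability at the moment you deploy.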
Thanks for the reply - indeed, after changing the region and GPU, the model deploys correctly!