I wanted to ask whether anyone has tried the Hugging Face “Transformers in production” service (the link to this section would be Inference Endpoints - Hugging Face).
In my case I have been trying to deploy a model in order to test how reliable and fast the platform is. According to the documentation, there are 4 phases when deploying a model:
“Building” > “Initialising” > “Running” (and “Failed” if the model did not deploy successfully)
However, none of the models that I have tried to deploy get past the “Building” phase. I ran these trials yesterday, so they have been in this state for almost 12 hours.
I find it strange that a model has been building for almost 12 hours without failing (I am not able to see any logs in the console).
Has anyone come across this problem? If so, how did you solve it?
Thank you everyone for taking the time to read this question.
Can you please share more information about the endpoint you have created, including instance type, region, and model repository?
Have you tried recreating your endpoint? The build should only take a few minutes, depending on the size of the model you are trying to deploy.
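Since a build is expected to finish within minutes, one way to avoid waiting 12 hours on a stuck endpoint is to poll its status with a deadline. The following is only a minimal sketch: `wait_until_running` and its `get_status` callable are hypothetical names, not part of any Hugging Face API, and the status strings simply mirror the phases quoted above. You would plug in whatever call returns the current state of your endpoint.

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    timeout_s: float = 900.0,
    poll_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll a status source until it reports a final state or the deadline passes.

    get_status is a hypothetical callable returning the current phase as a
    string (assumed here to be one of the phases named in the thread).
    Returns "running", "failed", or "timeout".
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status().lower()
        # "running" and "failed" are treated as the only final states.
        if status in ("running", "failed"):
            return status
        # Any other phase (for example "building") means keep waiting,
        # unless the deadline has already passed.
        if clock() >= deadline:
            return "timeout"
        sleep(poll_s)
```

A "timeout" result is the signal to delete and recreate the endpoint, or to report it, instead of leaving it in “Building” indefinitely. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.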
The endpoint properties are:
- Endpoint Type: public
- Instance Type: CPU • medium
- Provider: AWS • eu-west-1
- Task: text-generation
- Repository: gpt2
This endpoint was created on 3 November at 13:35 (CEST). Previously we created other Endpoints to test this new feature (the ones I mentioned in my first post), which stayed in the “Building” state for almost 1 week. We deleted them to avoid any possible charges.
We will leave this Endpoint in this state, as we did previously, so you can check whatever you want.
@DanielCano could you please share the user you used to deploy? We are not able to reproduce your error.
The user that we are using to deploy is M47 Labs (I attach a screenshot with the name in case of doubt)
Hello @philschmid, we have observed that the Endpoint has deployed successfully, so this problem is solved. Thank you for your help with this matter.
We also wanted to ask whether there is any pricing option that charges per call to the Endpoint rather than for the time it is deployed. Most of the time our endpoints are not going to receive requests, so paying to keep them running would be quite inefficient.
Happy to hear that it worked!
We are working on improving the solution to be able to provide such features, but currently there is none. The only thing you could do is delete and recreate the endpoint on a schedule.
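The delete-and-recreate workaround could be driven by a small job that runs periodically (for example from cron). The sketch below is an assumption about how one might structure it: `should_be_up`, `reconcile`, and the `create` / `delete` callables are hypothetical names, and the weekday 08:00 to 18:00 window is only an example schedule, not something suggested in this thread.

```python
from datetime import datetime, time as dtime
from typing import Callable, Iterable


def should_be_up(
    now: datetime,
    start: dtime = dtime(8, 0),
    end: dtime = dtime(18, 0),
    workdays: Iterable[int] = range(0, 5),  # Monday=0 .. Friday=4
) -> bool:
    """Return True if the endpoint should exist at the given moment."""
    return now.weekday() in workdays and start <= now.time() < end


def reconcile(
    now: datetime,
    exists: bool,
    create: Callable[[], None],
    delete: Callable[[], None],
) -> str:
    """Create or delete the endpoint so that it matches the schedule.

    create and delete are hypothetical callables wrapping whatever
    mechanism you use to manage the endpoint (UI automation, API, etc.).
    """
    wanted = should_be_up(now)
    if wanted and not exists:
        create()
        return "created"
    if not wanted and exists:
        delete()
        return "deleted"
    return "unchanged"
```

Running `reconcile` every few minutes keeps the logic idempotent: it only acts when the actual state differs from the scheduled one, so a missed or repeated run does no harm. Keep in mind that a recreated endpoint has to go through the build phases again before it can serve requests.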