Problems with "Transformers in production" service

DanielCano · October 20, 2022, 8:17am

Hi everyone,

I wanted to ask whether anyone has tried Hugging Face's “Transformers in production” service (Inference Endpoints - Hugging Face).

In my case, I have been trying to deploy a model to test how reliable and fast the platform is. According to the documentation, a deployment can be in one of 4 states:

“Building” > “Initialising” > “Running” (and “Failed” if the model did not deploy successfully)

However, none of the models I have tried to deploy get past the “Building” phase. I ran these trials yesterday, so they have been in this state for 12 hours.

I therefore find it strange that a model has been building for almost 12 hours without failing (I am not able to see any logs in the console).

Has anyone come across this problem? If so, how did you solve it?

Thank you everyone for taking the time to read this question.

philschmid · October 20, 2022, 11:32am

Hello @DanielCano,

Can you please share more information about the endpoint you have created, including instance type, region, and model repository?

Have you tried recreating your endpoint? The build should only take a few minutes, depending on the size of the model you are trying to deploy.
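
Since a build is expected to finish within minutes, one way to catch a stuck deployment early is to poll the endpoint state with a deadline. The following is only a minimal sketch: `get_status` is a hypothetical callable that you would implement yourself against whatever client you use, and the state names are the ones quoted in this thread, which may be spelled differently in the actual API.

```python
import time
from typing import Callable

# State names as quoted in this thread; the real API may spell them differently.
TERMINAL_STATES = {"Running", "Failed"}


def wait_for_endpoint(
    get_status: Callable[[], str],  # hypothetical: returns the current endpoint state
    timeout_s: float = 900.0,
    poll_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the endpoint is Running or Failed, or raise after timeout_s."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"endpoint still in state {status!r} after {timeout_s} s"
            )
        sleep(poll_s)
```

With a timeout of, say, 15 minutes, an endpoint that never leaves “Building” raises a `TimeoutError` instead of waiting unnoticed for hours.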

DanielCano · November 3, 2022, 12:56pm

Hello @philschmid,

The endpoint properties are:

Endpoint Type: public
Instance Type: CPU • medium
Provider: AWS • eu-west-1
Task: text-generation
Repository: gpt2

This endpoint was created on November 3 at 13:35 (CEST). Earlier, we created other Endpoints to test this new feature (the ones I mentioned in my first post), which stayed in the “Building” state for almost 1 week; we deleted them to avoid any possible charges.

We will leave this Endpoint in this state, as we did previously, so you can check whatever you need.

philschmid · November 3, 2022, 1:59pm

@DanielCano could you please share the user you used to deploy? We are not able to reproduce your error.

DanielCano · November 3, 2022, 2:15pm

The user we are using to deploy is M47 Labs (I attach a screenshot with the name in case of doubt).

DanielCano · November 3, 2022, 3:21pm

Hello @philschmid, we have observed that the Endpoint has now deployed successfully, so this problem is solved. Thank you for your help with this matter.

We also wanted to ask whether there is a payment option that charges per call to the Endpoint rather than for the time it is deployed. Most of the time our endpoints will not be receiving requests, so paying to keep them up would be quite inefficient.

philschmid · November 7, 2022, 8:53am

Hello @DanielCano,

Happy to hear that it worked!

We are working on improving the solution to be able to provide such features, but currently there is none. The only thing you could do is delete and re-create the endpoint on a schedule.
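
A minimal sketch of that schedule-based workaround, run periodically (for example from cron), could look like the following. The `create` and `delete` callables are hypothetical placeholders for whatever calls you use to create and delete the endpoint, and the 08:00 to 18:00 window is only an assumed example.

```python
from datetime import datetime, time as dtime
from typing import Callable


def should_be_running(now: datetime, start: dtime, end: dtime) -> bool:
    """Return True when `now` falls inside the daily [start, end) window."""
    t = now.time()
    if start <= end:
        return start <= t < end
    # The window wraps past midnight, e.g. 22:00 to 06:00.
    return t >= start or t < end


def reconcile(
    now: datetime,
    exists: bool,
    create: Callable[[], None],  # hypothetical: creates the endpoint
    delete: Callable[[], None],  # hypothetical: deletes the endpoint
    start: dtime = dtime(8, 0),  # assumed example window
    end: dtime = dtime(18, 0),
) -> str:
    """Create or delete the endpoint so that it only exists inside the window."""
    wanted = should_be_running(now, start, end)
    if wanted and not exists:
        create()
        return "created"
    if not wanted and exists:
        delete()
        return "deleted"
    return "unchanged"
```

Because `reconcile` only acts when the desired and actual states differ, it is safe to run as often as you like; the first call of each deployment still pays the build time.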
