Guidelines for using a Custom Docker Image

Hi!

I am trying to use Inference Endpoints for model deployment in production. I have already succeeded in deploying a model using the Default Container type. This means that I already have all the necessary setup, including a custom handler.py, working.

The issue is that the necessary final step involves installing a given package that is private, which means it cannot be installed from requirements.txt. I saw that one can specify a custom Docker image, which would easily solve this issue. However, I have been trying and have never been able to initialize the Endpoint. It does not even show any logs.

What should be the content of the Dockerfile? I was assuming something like this:

FROM <my_docker_image>

# WORKDIR
WORKDIR /repository
ADD . /repository

# EXPOSE PORT XXX
EXPOSE XXX

CMD ["python", "handler.py"]

If I should provide any additional information, I will be happy to share it.

Thanks for your help.

I’m not sure if HF has support for private packages. What kind of model are you trying to deploy?

It could be an error due to Hugging Face not being able to access the Docker image. Can you verify that it did?

@grim-metal It will only support them if you provide a custom Docker image. I am deploying a YOLO model, but I will need a set of private utils that I would like to bake into the Docker image. Apart from making that possible, running the endpoint from a custom Docker image provides another powerful advantage: you don't have to install custom dependencies from requirements.txt every time a new replica comes up. That installation consumes a significant amount of time, which hurts autoscaling speed. I need to upgrade PyTorch to v2, which means that installing it and the other requirements takes almost 3-5 min to complete. A sketch of the kind of Dockerfile I have in mind is below.

@David394 I am almost sure that the Docker Hub credentials are properly set up, but I will double-check that as well.
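For illustration, a minimal sketch of that idea, assuming the private package can be shipped as a pre-built wheel (the base image tag, wheel filename, and paths here are hypothetical):

# Hypothetical base image; pick whatever Python/CUDA combination you need
FROM python:3.10-slim

WORKDIR /repository

# Install the heavy public dependencies at build time so new replicas
# do not pay the 3-5 min installation cost on start-up
RUN pip install --no-cache-dir "torch>=2.0"

# Copy the private package in as a pre-built wheel and install it locally,
# so no private index or credentials are needed at runtime
COPY wheels/my_private_utils-0.1.0-py3-none-any.whl /tmp/
RUN pip install --no-cache-dir /tmp/my_private_utils-0.1.0-py3-none-any.whl

# Copy the rest of the repository (handler.py, model assets, ...)
COPY . /repository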

If it’s a YOLO model, and you want it in a container behind a REST API, you could try using Modelbit instead. Private packages | Modelbit Documentation

@alex-bronze Were you able to figure out the contents of the custom Dockerfile?

Hi @amosyou!

Unfortunately I was not able to figure out the details… I was having issues with the autoscaling feature, so I decided to give up and move to AWS SageMaker Endpoints…

Anyway, if you find the details, would you mind sharing them?

Thanks!

Hi @alex-bronze, not sure if this is helpful to you, but I managed to deploy my own custom Docker image. Essentially, your Docker image needs to start a server with a REST API that has at least a /health endpoint and one endpoint for serving your model/logic output. I wrote a short post about how to do this with a simple FastAPI server: https://www.linkedin.com/pulse/how-build-deploy-custom-docker-image-huggingface-sebastian-schramm-guoqe.
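To make that concrete, here is a minimal sketch of such a server; the prediction logic is a placeholder, and the route name /predict and the request fields are hypothetical choices, not a fixed API:

# app.py - minimal FastAPI server for a custom Inference Endpoints image
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PredictionRequest(BaseModel):
    inputs: str

@app.get("/health")
def health() -> dict:
    # The endpoint infrastructure polls this route to decide the replica is healthy
    return {"status": "ok"}

@app.post("/predict")
def predict(request: PredictionRequest) -> dict:
    # Placeholder logic; replace with a real forward pass through your model
    return {"outputs": f"echo: {request.inputs}"}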

You can also take a look at my GitHub repo with a minimal working example: GitHub - sebastianschramm/fastapi_hf_endpoints: Custom fastapi server packaged as docker image for Huggingface inference endpoints deployment
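And a matching Dockerfile sketch that packages such a server; port 80 is an assumption here and has to match whatever container port you configure for the endpoint:

FROM python:3.10-slim

WORKDIR /app

# Install the server dependencies at build time
RUN pip install --no-cache-dir fastapi uvicorn

COPY app.py /app/

# Assumed port; must match the endpoint's container port setting
EXPOSE 80

# Start the REST API server so the endpoint can route requests to it
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "80"]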