Inference endpoint deployment with a custom Dockerfile

Hello everyone,

I want to create an inference endpoint with a custom dockerfile.
The last two lines of the Dockerfile are:

EXPOSE 7860

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]

The deployment failed.
Could you please guide me on how to modify the CMD line correctly, considering I have a single handler.py in the repo? What adjustments should be made to ensure a successful deployment of the inference endpoint with the Dockerfile?

Hi @nurcognizen! Did you find a solution to this problem of creating a custom Dockerfile?

Hi @nurcognizen, I managed to deploy my own custom Docker image. Essentially, your Docker image needs to start a server with a REST API that has at least a /health endpoint and one endpoint for serving your model/logic output. I wrote a short post about how to do this with a simple FastAPI server: https://www.linkedin.com/pulse/how-build-deploy-custom-docker-image-huggingface-sebastian-schramm-guoqe.
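For anyone landing here, a minimal sketch of what such a server could look like (the /predict route name, the request schema, and the placeholder logic are my own assumptions, not the exact code from the post above):

```python
# main.py - minimal FastAPI server for a custom Inference Endpoints image (sketch).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class InferenceRequest(BaseModel):
    # Hypothetical request schema; replace with whatever your model expects.
    inputs: str


@app.get("/health")
def health():
    # Inference Endpoints can probe this route to check that the container is up.
    return {"status": "ok"}


@app.post("/predict")
def predict(request: InferenceRequest):
    # Placeholder logic: load and run your model here and return its output.
    return {"output": request.inputs[::-1]}
```

With this saved as main.py, `uvicorn main:app --host 0.0.0.0 --port 7860` starts the server that the endpoint will talk to.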

You can also take a look at my GitHub repo with a minimal working example: https://github.com/sebastianschramm/fastapi_hf_endpoints (a custom FastAPI server packaged as a Docker image for Hugging Face Inference Endpoints deployment).
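And regarding the original CMD question: a Dockerfile along these lines should work, as long as the quotes and dashes in the CMD are plain ASCII (curly quotes and en-dashes, which the forum inserts outside code blocks, will break the exec-form CMD). The base image and file names here are assumptions; adjust them to your repo:

```dockerfile
# Sketch of a Dockerfile for the FastAPI server above.
FROM python:3.11-slim

WORKDIR /app

# requirements.txt is assumed to list at least fastapi and uvicorn.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY main.py .

EXPOSE 7860

# Plain double quotes and double hyphens, matching the port exposed above.
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
```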
