I am trying to use Inference Endpoints to deploy a model in production. I have already succeeded in deploying a model using the Default Container type, which means I already have all the necessary setup in place, including a working custom handler.py.
The issue is that the necessary final step involves installing a private package, which means it cannot be installed from requirements.txt. I saw that one can specify a custom Docker image, which would easily solve this issue. However, I have been trying and have never been able to initialize the Endpoint; it does not even show any logs.
What should the content of the Dockerfile be? I was assuming something like this:
FROM python:3.10
WORKDIR /repository
COPY . /repository
# EXPOSE PORT XXX
CMD ["python", "handler.py"]
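For the private package itself, I was assuming I could pass a token at build time and install it straight from our private Git host. This is just my guess; the repository URL, package name, and build argument here are placeholders, not anything that exists:

# Build with: docker build --build-arg GIT_TOKEN=<token> .
ARG GIT_TOKEN
# Install the private package directly from its Git repository
# (placeholder URL and package name)
RUN pip install "git+https://${GIT_TOKEN}@github.com/my-org/private-utils.git"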
If I should provide any additional information, I will be happy to share it.
Thanks for your help.
I’m not sure if HF has support for private packages. What kind of model are you trying to deploy?
It could be an error due to Hugging Face not being able to access the Docker image. Can you verify that it did?
@grim-metal It will only support them if you provide a custom Docker image. I am deploying a YOLO model, but I need a set of private utils that I would like to bake into the Docker image. Apart from enabling that, running the endpoint with a custom Docker image provides another powerful advantage: you don't have to install custom dependencies from requirements.txt every time a new replica comes up. That consumes a significant amount of time and harms autoscaling speed. I need to upgrade PyTorch to v2, which means that installing it, along with the other requirements, takes almost 3-5 minutes.
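To sketch what I mean by baking the dependencies in at build time (the base image, port, and package list are assumptions on my side, not the official Inference Endpoints spec):

FROM python:3.10-slim
WORKDIR /repository
# Heavy dependencies are installed once when the image is built,
# not on every replica start, so autoscaling is not slowed down
RUN pip install "torch>=2.0" ultralytics
COPY . /repository
# Inference Endpoints need an HTTP server; handler.py would have to
# start one on the exposed port (port number is an assumption)
EXPOSE 80
CMD ["python", "handler.py"]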
@David394 I am almost sure that the Docker Hub credentials are set up properly, but I will double-check that as well.
If it’s a YOLO model and you want it in a container behind a REST API, you could try using Modelbit instead: Private packages | Modelbit Documentation