This category is for questions about deploying Hugging Face Hub models for real-time inference in Azure Machine Learning using the new Hugging Face model catalog.
On May 23, 2023, we released a new experience for easily deploying Hugging Face models in Azure Machine Learning, introducing a model catalog natively integrated into Azure Machine Learning Studio.
You can browse compatible models directly in Azure Machine Learning Studio, or use the “Deploy in Azure Machine Learning” menu on model pages on the Hugging Face Hub. Initially, this experience focuses on Transformers models for NLP tasks with available PyTorch checkpoints.
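For those who prefer the SDK over the Studio UI, here is a minimal sketch of deploying a catalog model with the Azure ML Python SDK v2 (`azure-ai-ml`). It assumes the catalog is backed by a registry named `HuggingFace` and that models are addressed with an `azureml://registries/...` asset ID. The endpoint name, deployment name, and instance type are placeholders, and the model name must be whatever name the catalog shows for the model, so treat this as a starting point rather than a verified recipe.

```python
REGISTRY_NAME = "HuggingFace"  # assumption: registry backing the Hugging Face catalog


def registry_model_id(model_name: str, registry: str = REGISTRY_NAME) -> str:
    """Build the asset ID for the latest version of a catalog model."""
    return f"azureml://registries/{registry}/models/{model_name}/labels/latest"


def deploy(subscription_id, resource_group, workspace, endpoint_name, model_name,
           instance_type="Standard_DS3_v2"):  # placeholder SKU, pick one the model supports
    # Imported lazily so the helper above works without the Azure SDK installed.
    from azure.ai.ml import MLClient
    from azure.ai.ml.entities import ManagedOnlineDeployment, ManagedOnlineEndpoint
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(
        DefaultAzureCredential(), subscription_id, resource_group, workspace
    )

    # Create the endpoint first, then attach a deployment that points at the catalog model.
    endpoint = ManagedOnlineEndpoint(name=endpoint_name, auth_mode="key")
    ml_client.online_endpoints.begin_create_or_update(endpoint).result()

    deployment = ManagedOnlineDeployment(
        name="default",  # hypothetical deployment name
        endpoint_name=endpoint_name,
        model=registry_model_id(model_name),
        instance_type=instance_type,
        instance_count=1,
    )
    return ml_client.online_deployments.begin_create_or_update(deployment).result()
```

Calling `deploy(...)` with your own subscription, resource group, and workspace creates the endpoint and the deployment. It does not route traffic to the deployment, so that step still has to be done in Studio or through the SDK.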
Super excited about this! I’m busy trying to add an endpoint for a private wav2vec2 model on the Hub.
I create the Managed Application and it’s all fine.
I am able to get through the forms for adding the endpoint, including the validation, but then I receive this error:
“The resource provider ‘public’ received a non-success response ‘InternalServerError’ from the downstream endpoint for the request ‘PUT’ on ‘HuggingFace.Endpoint/516a0116-6c12-4bd5-8de5-939c798d32c8’. Please refer to additional info for details.”
I did the same with a public wav2vec2 model (xjonatasgrosman/wav2vec2-large-xlsr-53-english) on the same infrastructure, and it succeeded.
I don’t suppose there is a way to use private models yet?
Update: Deploying a public wav2vec2 model did not fully succeed. It resulted in this status: “Deployment not found”.
Hi @alienelf! Thanks for your question. Indeed, the lack of support for private models was a limitation of the Managed Application (it would have needed HF authentication to pull private models). We have since deprecated the Managed App and unlisted it from the Azure Marketplace. The new integration for deploying HF models in Azure ML Studio is the recently introduced Model Catalog (Wav2Vec2 is not yet supported).
Thank you! Any ETA on when it might be supported?
I’m trying to deploy a model that is NOT listed in the Azure model catalog, specifically deepseek-ai/deepseek-coder-6.7b-instruct.
Given that the model isn’t part of the model catalog, I want my deployment to download the model from the Hugging Face Hub, but it isn’t working. Is there a guide for this use case? Microsoft’s docs only cover the scenario where the model is part of the catalog: “Deploy models from HuggingFace hub to Azure Machine Learning online endpoints for real-time inference” (Azure Machine Learning | Microsoft Learn).