I am working on deploying a speech-recognition app using HuggingFace, following the instructions here. My understanding is that the inference toolkit uses pipelines, but the speech-recognition pipeline is only available in the > 4.9.0 releases, whereas the current AWS images point to 4.6.x.

Is there any way around this? What do you suggest I do to make the deployment work? My hunch is that I need to supply a new image_uri.

Hello @dzorlu,

Great to hear that you are working on a speech task! Yes, the inference toolkit uses the pipelines from transformers. The code is open source if you want to take a deeper look: GitHub - aws/sagemaker-huggingface-inference-toolkit.

I am happy to share that we are working on new releases for the DLC, which include 4.9 and higher. Sadly, I think it will take around 2 more weeks for them to be available.

In the meantime, you could use the official DLC and provide as model_data a model.tar.gz that contains a custom module, as documented here: Deploy models to Amazon SageMaker
With a custom module, you can provide a requirements.txt to upgrade the dependencies and a custom model_fn to load the ASR pipeline.
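
A minimal sketch of what such a custom module could look like, assuming it is saved as code/inference.py inside the model.tar.gz, next to a code/requirements.txt that pins transformers to a release that ships the ASR pipeline. The input_fn and predict_fn overrides are my assumptions here (that the client sends raw audio bytes and that the pipeline can decode them, which typically needs ffmpeg in the container); only model_fn is strictly what is described above.

```python
# code/inference.py: sketch of a custom module for the inference toolkit.
# Assumes the model files sit at the root of the unpacked model.tar.gz.


def model_fn(model_dir):
    """Load the ASR pipeline from the directory the archive was unpacked to."""
    # Imported lazily, at model load time inside the container.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_dir)


def input_fn(input_data, content_type):
    """Pass the request body through as raw audio bytes.

    Assumption: the client sends the audio file content directly, so the
    content_type is not inspected here.
    """
    return bytes(input_data)


def predict_fn(data, model):
    """Run the loaded pipeline on the audio bytes and return its result."""
    return model(data)
```

The toolkit should then call model_fn once when the endpoint starts and the other two functions per request, so the transcription comes back as whatever the pipeline returns.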

Thank you Philipp for the quick response and the direction. I will try the model_data route. It’s also great to hear that you are working on updating the DLCs. The team’s work is always much appreciated!