SageMaker Serverless Inference for LayoutLMv2 model

Hi Elman, thanks for opening this thread! This is a super interesting topic :slight_smile:

No matter which model you deploy to a SageMaker (SM) endpoint, the input always requires preprocessing before it can be passed to the model. The reason you can just pass some text in the case of DistilBERT without having to do the processing yourself is that the SageMaker Hugging Face Inference Toolkit does all that work for you. This toolkit builds on top of the Pipeline API, which is what makes it so easy to call.
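To make that concrete, here is roughly what the "it just works" path looks like for a plain text model on a serverless endpoint. This is only a minimal sketch: the model ID, DLC versions, role placeholder and memory size are assumptions you would adapt to your own setup.

```python
from sagemaker.huggingface import HuggingFaceModel
from sagemaker.serverless import ServerlessInferenceConfig

# Minimal sketch: deploy a text-classification model from the Hub to a
# serverless endpoint. Model ID, versions and memory size are placeholder
# assumptions -- adjust them to your setup.
huggingface_model = HuggingFaceModel(
    env={
        "HF_MODEL_ID": "distilbert-base-uncased-finetuned-sst-2-english",
        "HF_TASK": "text-classification",
    },
    role="<your-sagemaker-execution-role>",
    transformers_version="4.17",
    pytorch_version="1.10",
    py_version="py38",
)

predictor = huggingface_model.deploy(
    serverless_inference_config=ServerlessInferenceConfig(
        memory_size_in_mb=6144,
        max_concurrency=1,
    ),
)

# Because the toolkit wraps the Pipeline API, plain text "just works":
print(predictor.predict({"inputs": "I love this!"}))
```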

What does that mean for you when you want to use a LayoutLMv2 model? I see two possibilities:

  1. The Pipeline API offers a class for object detection: Pipelines. I’m not familiar with it, but I would imagine it is quite straightforward to use. Again, because the Inference Toolkit is based on Pipelines, once you figure out how to use the Pipeline API for object detection, you can use the same call against the SM endpoint (see the first sketch after this list).

  2. The Inference Toolkit also allows you to provide your own preprocessing script; see more details here: Deploy models to Amazon SageMaker. That means you can process the inputs yourself before passing them to the model. What I would do (because I’m lazy) is look at an already existing demo to see how the preprocessing for a LayoutLMv2 model works, for example this one: app.py · nielsr/LayoutLMv2-FUNSD at main, and use that (see the second sketch below).
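For option 1, the idea is to get the pipeline call working locally first, since the endpoint payload mirrors the pipeline input. A minimal sketch of what that looks like; the checkpoint here is a generic object-detection model used as a placeholder, so do verify that your LayoutLMv2 checkpoint is actually supported by this pipeline:

```python
from transformers import pipeline

# Minimal sketch of the object-detection pipeline; the checkpoint is a
# placeholder -- verify that your LayoutLMv2 checkpoint is supported.
detector = pipeline("object-detection", model="facebook/detr-resnet-50")

# The pipeline accepts an image URL (or a PIL image / local path):
results = detector("http://images.cocodataset.org/val2017/000000039769.jpg")
print(results)  # list of dicts with "label", "score" and "box"
```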

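For option 2, a custom script usually means overriding `model_fn` and `predict_fn` in an `inference.py`. The sketch below is loosely adapted from the preprocessing in the FUNSD demo linked above; the request format (raw image bytes under "inputs"), the base processor checkpoint and the response shape are assumptions, not tested code, and you may also need an `input_fn` depending on how your client serializes the image:

```python
# inference.py -- custom handler for the SageMaker HF Inference Toolkit.
# Loosely adapted from the LayoutLMv2-FUNSD demo; the request/response
# formats here are assumptions you would adapt to your own client.
import io

import torch
from PIL import Image
from transformers import LayoutLMv2ForTokenClassification, LayoutLMv2Processor


def model_fn(model_dir):
    # The processor runs OCR (via Tesseract) and prepares all model inputs:
    # token ids, bounding boxes and the resized image.
    processor = LayoutLMv2Processor.from_pretrained("microsoft/layoutlmv2-base-uncased")
    model = LayoutLMv2ForTokenClassification.from_pretrained(model_dir)
    return model, processor


def predict_fn(data, model_and_processor):
    model, processor = model_and_processor

    # Assumption: the request body carries the raw image bytes under "inputs".
    image = Image.open(io.BytesIO(data["inputs"])).convert("RGB")

    encoding = processor(image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**encoding)

    # One predicted label id per token; map ids back via model.config.id2label.
    predictions = outputs.logits.argmax(-1).squeeze().tolist()
    labels = [model.config.id2label[p] for p in predictions]
    return {"labels": labels}
```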
Hope this helps! Please let me know how it goes and/or reach out if you have any questions.

Cheers
Heiko
