About the Amazon SageMaker category

This category is for any questions related to using Hugging Face Transformers with Amazon SageMaker. Don’t forget to check the announcement blogpost for more resources.

Thanks for this amazing project. Hugging Face and SageMaker are both leaders in their respective domains, and integrating them will definitely enhance their effectiveness.

Is it currently possible to deploy real-time endpoints with SageMaker using Hugging Face?


Hey @m-ali-awan,

thank you for the feedback :hugs:

We are currently working on a nice way to deploy all of the Hugging Face models to SageMaker, but this will still take a little time. In the meantime, you could use either the TensorFlow or the PyTorch inference toolkit.
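For anyone wondering what the PyTorch inference toolkit route could look like, here is a minimal sketch of an entry-point script. The handler names (`model_fn`, `input_fn`, `predict_fn`, `output_fn`) are the hooks the toolkit looks for; the file name `inference.py`, the `{"inputs": ...}` payload shape, and the choice of a `text-classification` pipeline are assumptions made for illustration, not something confirmed in this thread.

```python
# inference.py: hypothetical entry-point script for the SageMaker PyTorch inference toolkit.
import json


def model_fn(model_dir):
    """Load the model once when the serving container starts."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    # Assumes model_dir contains a fine-tuned model saved with save_pretrained().
    return pipeline("text-classification", model=model_dir, tokenizer=model_dir)


def input_fn(request_body, request_content_type):
    """Deserialize the request; this sketch only accepts JSON."""
    if request_content_type != "application/json":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    payload = json.loads(request_body)
    # The {"inputs": ...} payload shape is an assumption of this sketch.
    return payload["inputs"]


def predict_fn(input_data, model):
    """Run the loaded pipeline on the deserialized input."""
    return model(input_data)


def output_fn(prediction, accept):
    """Serialize the prediction back to JSON."""
    return json.dumps(prediction)
```

The script would be passed as the entry point when creating the model with the SageMaker Python SDK; the exact packaging depends on the PyTorch version and serving backend you use.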

Thanks for responding.
Are there any example notebooks available for deploying?

And I would be very grateful if you could guide me on this:
I am trying to build a custom document classifier with Hugging Face, but my client is currently using Amazon Comprehend. Is it possible to come up with a better classifier than Comprehend, given that we have little data, i.e. 50 examples per class and 20 classes in total?


Hi @m-ali-awan, thanks a lot for reaching out! You can find HF deployment examples here:

Note that the SageMaker hosting experience varies depending on your version of PyTorch (MMS backend or TorchServe backend). See: Use PyTorch with the SageMaker Python SDK (sagemaker 2.39.1 documentation)

It’s not possible to tell you whether HF or Comprehend will give you better results, because half the answer is in the developer's hands :slight_smile: : it depends on the data, the model, and its training (epochs, optimizers…). Using your own Hugging Face code in SageMaker, you will indeed have more freedom from a science and system architecture standpoint (free to inspect the model, export it out of AWS, play with its weights, test various backends and tasks, etc.), but be aware that with more freedom comes more responsibility: model science and infrastructure become yours to own. In Comprehend, more things are managed, at the cost of some development freedom.

Thanks a lot.

Another option would be to upload your fine-tuned model to the Hugging Face Hub as either a private or a public model and then use it with the :hugs: Accelerated Inference API. You can test the API for free or go with a plan that fits you and your customer. It is comparable to Comprehend in that it is managed, but it is easier to provide a custom model and benefit from the accelerations and optimizations the :hugs: Accelerated Inference API provides.

Thanks for this option.
Currently we are using Textract for OCR, and we want to feed the .txt files from this pipeline into document classification. Can we integrate this API into AWS Lambda, or what would be the way to go?


Yes, you could create a Python AWS Lambda function to read the .txt files. Then, depending on how big the documents are, split them into passages and send them to the :hugs: Inference API with an HTTP POST request.
You can find the documentation here.
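To make the split-and-POST flow concrete, here is a rough sketch using only the Python standard library. The endpoint pattern in `API_URL_TEMPLATE`, the `{"inputs": ...}` payload, the 1000-character default, and all function names are assumptions for illustration; please check the Inference API documentation for the current details. Reading the Textract output inside the Lambda handler is left out.

```python
import json
import urllib.request

# Assumed endpoint pattern for the hosted Inference API; verify against the documentation.
API_URL_TEMPLATE = "https://api-inference.huggingface.co/models/{model_id}"


def split_into_passages(text, max_chars=1000):
    """Group blank-line-separated paragraphs into passages of at most max_chars characters."""
    passages, current = [], ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            passages.append(current)
        # Hard-split a single paragraph that is longer than max_chars.
        while len(paragraph) > max_chars:
            passages.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        current = paragraph
    if current:
        passages.append(current)
    return passages


def build_request(passage, model_id, api_token):
    """Build (but do not send) the POST request for one passage."""
    body = json.dumps({"inputs": passage}).encode("utf-8")
    return urllib.request.Request(
        API_URL_TEMPLATE.format(model_id=model_id),
        data=body,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def classify_document(text, model_id, api_token):
    """Send every passage of a document to the API and collect the JSON responses."""
    results = []
    for passage in split_into_passages(text):
        request = build_request(passage, model_id, api_token)
        with urllib.request.urlopen(request) as response:
            results.append(json.loads(response.read()))
    return results
```

A Lambda handler would then load the .txt content and call `classify_document(text, model_id, api_token)`, ideally with the token read from an environment variable or a secrets store rather than hard-coded.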

Thanks. Is there any limit on the number of characters in a document, and if so, how can we work around (increase) it?


The token limit depends on the model you use. For example, bert-base-cased has a max_length of 512, the same as bert-large-uncased.
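As a rough illustration of working within a 512-token limit, the sketch below cuts an already tokenized document into overlapping windows and averages per-window class scores. It assumes two positions are reserved for BERT's [CLS] and [SEP] tokens; the overlap of 128 tokens, the function names, and plain averaging as the aggregation rule are illustrative choices, not a recommendation from this thread.

```python
def token_windows(token_ids, max_length=512, num_special_tokens=2, overlap=128):
    """Split token_ids into overlapping windows that fit within max_length."""
    window = max_length - num_special_tokens  # room left after [CLS] and [SEP]
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need 0 <= overlap < max_length - num_special_tokens")
    step = window - overlap
    windows = []
    for start in range(0, len(token_ids), step):
        windows.append(token_ids[start:start + window])
        # Stop once the current window already reaches the end of the document.
        if start + window >= len(token_ids):
            break
    return windows


def average_scores(per_window_scores):
    """Average a list of {label: score} dicts into one document-level dict."""
    totals = {}
    for scores in per_window_scores:
        for label, score in scores.items():
            totals[label] = totals.get(label, 0.0) + score
    return {label: total / len(per_window_scores) for label, total in totals.items()}
```

Each window would be classified separately, and `average_scores` then combines the results into a single prediction for the whole document.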

OK, thanks. When we are training for custom classes, can we simply increase this limit?
And there will definitely be some memory limitations, so how do we handle relatively large documents at inference?


It is not that simple to increase those. But there are Transformer models like Longformer, which has a maximum token length of 4096 and is also available on the Hub: allenai/longformer-base-4096 · Hugging Face

Thanks a lot, @philschmid,

I have already bothered you a lot, and I am sorry, but I am not experienced enough with Hugging Face.

So, can we train this Longformer for a custom use case?


Hi philschmid, hope you are fine.
Do we have support for fine-tuning relation extraction models with SageMaker?
If yes, kindly share any relevant notebook link.

Thanks a lot…

Please share any resources (Colab notebooks) related to relationship extraction.

@m-ali-awan, you can find all the examples we currently have in these 4 different resources

Thanks a lot, but do they contain examples for relationship extraction?

Or any new state-of-the-art NER?

@m-ali-awan, yes, the community notebooks as well as the example scripts include examples for named-entity recognition.
On the Hub, we have ~800 models trained for token classification. Take a look and see if one of these fits your use case: Hugging Face – The AI community building the future.