Error deploying BERT on SageMaker

I fine-tuned BERT for text classification on a custom dataset using HuggingFace and TensorFlow, and now I’m trying to deploy the model for inference through SageMaker. I followed this HuggingFace tutorial, but I get the following error. I spent a while looking through the SageMaker HuggingFace documentation, to no avail. The error says that model_uri is set to None, but model_uri is not a parameter that I can pass, and I just want it to pull my model from the HuggingFace Hub.

I also tried downloading the model from the Hub, zipping it, uploading it to S3, and passing model_data="model.tar.gz", but that didn’t work either.

Any help would be greatly appreciated!

Resolved: I just needed to add an image_uri!

Hey @wsunadawong, I moved your post into the Amazon SageMaker category in the forum. I didn’t catch it when you posted it.

Quick question regarding your issue: did you need to add image_uri to the HuggingFaceModel to run it? This should be the case when using the latest version of sagemaker. Could you please share how you deployed it?

Thanks for your help!

At first, I added image_uri to the HuggingFaceModel, which worked. I initially tried image_uris.retrieve(framework='huggingface', region='us-east-1', instance_type='ml.t2.medium', image_scope='inference', base_framework_version='tensorflow2.4.1'), but it gave an error that the image_uri could not be found, so I went to this list of images and chose the image URI with the following properties: TensorFlow 2.4.1 with HuggingFace transformers, inference, CPU, py37.

Then I noticed that MultiDataModel requires launching from S3, so I switched from HuggingFaceModel to Model in order to test pulling the model from the S3 bucket. Now that I look at it, it’s quite possible the model is not using the S3 model because I’m still passing in env=hub. (Could that explain why the Model works but MultiDataModel doesn’t?)

Update: using Model gives the same error as MultiDataModel when I remove env=hub as a parameter. I’m confused about this error, because my counterargument.tar.gz does contain a config.json file.

I saw you used from sagemaker.model import Model. We created a model class, HuggingFaceModel, which you can use without providing an image_uri.
The documentation for this is here: Hugging Face — sagemaker 2.49.2 documentation and Deploy models to Amazon SageMaker
and here is a small code snippet:

from sagemaker.huggingface import HuggingFaceModel

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
# deploy model to SageMaker Inference

We currently have an image for tensorflow 2.4.1.
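For readers following along, here is a minimal sketch of what a Hub-based deployment with HuggingFaceModel might look like. It is an illustration under stated assumptions, not the snippet from the original reply: the helper names build_hub_env and deploy_from_hub are hypothetical, the transformers version and instance type are assumed values, and HF_MODEL_ID / HF_TASK are the environment variables the Hugging Face inference container reads for Hub models.

```python
def build_hub_env(model_id, task):
    """Environment variables that tell the container which Hub model and task to load."""
    return {"HF_MODEL_ID": model_id, "HF_TASK": task}


def deploy_from_hub(role, model_id, task="text-classification",
                    instance_type="ml.m5.xlarge"):
    """Create a HuggingFaceModel from a Hub model id and deploy it to an endpoint."""
    # Imported lazily so build_hub_env can be used without the SageMaker SDK installed.
    from sagemaker.huggingface import HuggingFaceModel

    # create Hugging Face Model Class (no image_uri needed; the SDK resolves it
    # from the version arguments)
    huggingface_model = HuggingFaceModel(
        env=build_hub_env(model_id, task),
        role=role,
        transformers_version="4.6.1",  # assumed; use a version with a published image
        tensorflow_version="2.4.1",    # TensorFlow version mentioned in this thread
        py_version="py37",
    )
    # deploy model to SageMaker Inference
    return huggingface_model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
    )
```

Calling deploy_from_hub requires AWS credentials and an execution role; to load from S3 instead, one would pass model_data with the S3 URI of the archive and keep HF_TASK in env.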

I got the model to work with the following snippet.

What does your model.tar.gz look like?

Yes, so I used HuggingFaceModel before, and it did work successfully! The problem is that I want to run multiple models on the same endpoint. Unfortunately, I don’t know how to add a HuggingFaceModel to a MultiDataModel, which is why I was using a plain old Model instead. The only way I know how to add a model is by adding a path to the S3 bucket with add_model(model_data_source, model_data_path). (Here, I cannot specify that I want the model to be a HuggingFaceModel, so I assume it defaults to a regular Model, which is why it fails?) Are you able to get a MultiDataModel running with multiple HuggingFaceModels?

My counterargument.tar.gz is just a zipped version of my HuggingFace git repo:

Let’s move everything related to MultiDataModel to this thread, "Model works but MultiDataModel doesn't - #10 by dan21c", so we can discuss it there.
If normal deployment works we can “close” this.
Or does normal deployment with your model.tar.gz not work?

Could you share how you created the counterargument.tar.gz archive? Sometimes the structure is

   - model
        - config.json

That way there is an extra folder and SageMaker might not recognize it.
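One way to check for this is to list where config.json sits inside the archive. The following is a small sketch using only Python's standard tarfile module; the function names are hypothetical and chosen for illustration.

```python
import tarfile


def find_config_paths(archive_path):
    """Return every path in the archive whose file name is config.json."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return [
            member.name
            for member in tar.getmembers()
            if member.isfile() and member.name.split("/")[-1] == "config.json"
        ]


def config_at_root(archive_path):
    """True if config.json sits at the top level of the archive."""
    # "./config.json" shows up when the archive was created with `tar ... .`
    return any(
        path in ("config.json", "./config.json")
        for path in find_config_paths(archive_path)
    )
```

If config_at_root returns False while find_config_paths returns something like model/config.json, the archive has the extra folder described above.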

I don’t think normal deployment is working for me. I tried running the code snippet you sent, and it gave me an error about needing to define a TASK.

When I added the HF_TASK in the env parameter, I got the same error as with the MultiDataModel: not being able to find the config.json.

I created the archive by cloning my HuggingFace repo, and running tar -czvf counterargument.tar.gz counterargument_hugging. Here’s the structure:

    - config.json
    - tf_model.h5
    - tokenizer.json
    - special_tokens_map.json
    - tokenizer_config.json
    - vocab.txt

I tried rerunning with the model subfolder structure that you suggested, but it just gave the same “config.json” error unfortunately.

Hey @wsunadawong,

The first image you shared is using your model_data and not the hub configuration.
Could you try it using the hub configuration? If this works, then the counterargument.tar.gz must be wrong.

Could you try creating the archive with the following steps?

  1. Download the model
git lfs install
git clone
  2. Create a tar file
cd counterargument_hugging
tar zcvf model.tar.gz *
  3. Upload model.tar.gz to S3
aws s3 cp model.tar.gz <s3://mymodel>
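The archive step above can also be sketched in Python with the standard tarfile module. This is an illustrative equivalent of running tar zcvf model.tar.gz * from inside the model folder, not part of the original instructions; make_model_archive is a hypothetical helper name.

```python
import os
import tarfile


def make_model_archive(model_dir, output_path="model.tar.gz"):
    """Pack the contents of model_dir at the archive root, without the folder itself."""
    # List entries first; like the shell glob `*`, skip dotfiles such as .git
    names = sorted(n for n in os.listdir(model_dir) if not n.startswith("."))
    out_abs = os.path.abspath(output_path)
    with tarfile.open(output_path, "w:gz") as tar:
        for name in names:
            path = os.path.join(model_dir, name)
            if os.path.abspath(path) == out_abs:
                continue  # never pack the output archive into itself
            # arcname drops the parent folder so config.json lands at the root
            tar.add(path, arcname=name)
    return output_path
```

For example, make_model_archive("counterargument_hugging") writes model.tar.gz with config.json at the top level, and the file can then be uploaded with aws s3 cp as in step 3.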

That worked! The MultiDataModel works now! When I zipped the first time, I included the folder with tar zcvf model.tar.gz counterargument_hugging when I should’ve just included the folder contents. :man_facepalming: Thank you so much for your help. :blush:
