InternalServerException when running a model loaded on S3

vicente-m47 · August 5, 2021, 4:16pm

Hi there,

I am trying to deploy a model stored on S3, mainly following the steps in this video: [Deploy a Hugging Face Transformers Model from S3 to Amazon SageMaker](https://www.youtube.com/watch?v=pfBGgSGnYLs).

For that, I downloaded a model into an S3 bucket and used this image URI for the DLC: image_uri = "763104351884.dkr.ecr.eu-west-1.amazonaws.com/huggingface-pytorch-inference:1.7.1-transformers4.6.1-cpu-py36-ubuntu18.04"

When I run the predictor.predict(data) command, I get this error:

The model I use for these tests is dccuchile/bert-base-spanish-wwm-uncased, and I could not find a way to tell the model which action it should perform.

I am pretty new to Hugging Face technology, and I am probably missing something needed to fix this.

Could you please let me know what I should do to tell the model what to do?

Thank you!

philschmid · August 5, 2021, 4:32pm

Hey @vicente-m47 ,

Could you please provide the whole code you executed?

If you want to deploy a model from the Hugging Face model hub, you can use the "deploy" button on each of the model pages.

This will generate a code snippet for you.
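As a rough idea of what such a snippet looks like, here is a minimal sketch. It assumes the `HF_MODEL_ID` and `HF_TASK` environment variables read by the Hugging Face inference container; the model id is only an example of a question-answering model, and `deploy_from_hub` is a hypothetical helper name, not part of the SDK. The version strings are the ones used elsewhere in this thread.

```python
# Hub configuration passed to the container as environment variables.
# The model id below is only an example; swap in the model you want.
hub = {
    "HF_MODEL_ID": "distilbert-base-uncased-distilled-squad",
    "HF_TASK": "question-answering",  # tells the container which pipeline to run
}


def deploy_from_hub(role, instance_type="ml.m5.xlarge"):
    """Hypothetical helper: create the model from the hub config and deploy it."""
    # Imported lazily so the config above can be inspected without the SDK installed.
    from sagemaker.huggingface import HuggingFaceModel

    huggingface_model = HuggingFaceModel(
        env=hub,
        role=role,
        transformers_version="4.6.1",
        pytorch_version="1.7.1",
        py_version="py36",
    )
    return huggingface_model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
    )
```

In a SageMaker notebook this would be called as `predictor = deploy_from_hub(sagemaker.get_execution_role())`.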

From reading the code you attached, you are trying to send a question-answering input to a model (dccuchile/bert-base-spanish-wwm-uncased) that is not fine-tuned for question-answering. You are also sending an English input to a Spanish model.
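One quick way to check this before deploying is to look at the `architectures` field in the model's `config.json` (the file inside your `model.tar.gz`). The sketch below is only an illustration: `supports_question_answering` is a hypothetical helper, and the two config dictionaries are made-up examples rather than the actual contents of any particular checkpoint.

```python
import json


def supports_question_answering(config: dict) -> bool:
    """Return True if any listed architecture has a question-answering head."""
    architectures = config.get("architectures", [])
    return any(name.endswith("ForQuestionAnswering") for name in architectures)


# Illustrative configs (assumptions, not read from a real checkpoint):
# a base masked-language model versus a checkpoint fine-tuned for QA.
base_config = json.loads('{"architectures": ["BertForMaskedLM"]}')
qa_config = json.loads('{"architectures": ["BertForQuestionAnswering"]}')

print(supports_question_answering(base_config))  # False
print(supports_question_answering(qa_config))    # True
```

If the check returns False for your own `config.json`, the checkpoint most likely has no question-answering head, and sending it a question/context payload will not work as expected.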

A good starting point for new Hugging Face users is our course: Introduction - Hugging Face Course.
You can find more information about deploying to SageMaker in the documentation: Deploy models to Amazon SageMaker.

vicente-m47 · August 6, 2021, 7:04am

Hi, @philschmid,

Sorry, you are right, my question was poorly written. Here is the code I used:

!pip install sagemaker --upgrade

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

# This model was supposedly trained from the dccuchile Bert model for QA.
model = "s3://xxxxxxx/hg_model.tar.gz"
image_uri = "763104351884.dkr.ecr.eu-west-1.amazonaws.com/huggingface-pytorch-inference:1.7.1-transformers4.6.1-cpu-py36-ubuntu18.04"

huggingface_model = HuggingFaceModel(
    model_data=model,
    transformers_version='4.6.1',
    pytorch_version='1.7.1',
    py_version='py36',
    image_uri=image_uri,
    role=role, 
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge"
)

data = {
    "inputs": {
        "question": "¿Cómo me llamo?",
        "context": "Mi nombre es Juan y vivo en Francia"
    }
}

predictor.predict(data)

Reading your comment, I understand that the problem must be in the trained model. I will review the course you recommended and check the training I did.

Thank you!

philschmid · August 6, 2021, 7:53am

You don't need to provide an image_uri. The HuggingFaceModel will select the right one based on pytorch_version and transformers_version.
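For illustration, the same model definition without `image_uri` could look like the sketch below. `build_model_kwargs` and `deploy_from_s3` are hypothetical helper names used only to keep the example self-contained; the version strings are the ones from the code above.

```python
def build_model_kwargs(model_data, role):
    """Hypothetical helper: arguments for HuggingFaceModel, with no image_uri."""
    return {
        "model_data": model_data,         # S3 path to the model.tar.gz archive
        "role": role,                     # SageMaker execution role
        "transformers_version": "4.6.1",  # the SDK picks the container image
        "pytorch_version": "1.7.1",       # from these three version settings
        "py_version": "py36",
    }


def deploy_from_s3(model_data, role, instance_type="ml.m5.xlarge"):
    """Hypothetical helper: create the model and deploy it to an endpoint."""
    # Imported lazily so the kwargs can be inspected without the SDK installed.
    from sagemaker.huggingface import HuggingFaceModel

    huggingface_model = HuggingFaceModel(**build_model_kwargs(model_data, role))
    return huggingface_model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
    )
```

With the values from the code above, this would be called as `deploy_from_s3("s3://xxxxxxx/hg_model.tar.gz", role)`.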

vicente-m47 · August 6, 2021, 9:39am

Yes, you’re right. Thank you, @philschmid.
