Deploying Huggingface Sagemaker Models with Elastic Inference

schopra · July 27, 2021, 5:04pm

When I try to deploy a HuggingFace Sagemaker model with elastic inference (denoted by the accelerator_type parameter) I get an error.

Deploy Snippet:

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.t2.medium",
    accelerator_type='ml.eia2.medium'
)

Error Msg:

~/miniconda3/envs/ner/lib/python3.8/site-packages/sagemaker/image_uris.py in _validate_arg(arg, available_options, arg_name)
    305     """Checks if the arg is in the available options, and raises a ``ValueError`` if not."""
    306     if arg not in available_options:
--> 307         raise ValueError(
    308             "Unsupported {arg_name}: {arg}. You may need to upgrade your SDK version "
    309             "(pip install -U sagemaker) for newer {arg_name}s. Supported {arg_name}(s): "

ValueError: Unsupported image scope: eia. You may need to upgrade your SDK version (pip install -U sagemaker) for newer image scopes. Supported image scope(s): training, inference.

The model deploys successfully if I do not provide an accelerator (i.e., no Elastic Inference).

Do the HuggingFace Sagemaker models support EI? If yes, how might I deploy the model successfully with EI? And if not, is EI support on the roadmap?

Much thanks in advance!

philschmid · July 28, 2021, 6:46am

Hey @schopra,

Unfortunately, we don’t have EI DLCs yet. We are working on it, and it is one of the highest priorities on our roadmap.
I will update this thread when I have any news.
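
To illustrate why the missing EI DLC surfaces as the `ValueError` above, here is a minimal standalone sketch that mirrors the check shown in the traceback. It does not import the SageMaker SDK. The tail of the message template is reconstructed from the printed error, and both the `HUGGINGFACE_SCOPES` list and the `check_scope` helper are assumptions made for illustration, not SDK code.

```python
def validate_arg(arg, available_options, arg_name):
    """Raise ValueError if arg is not one of available_options.

    Standalone re-creation of the check visible in the traceback; the end of
    the message template is inferred from the error text in this thread.
    """
    if arg not in available_options:
        raise ValueError(
            "Unsupported {arg_name}: {arg}. You may need to upgrade your SDK version "
            "(pip install -U sagemaker) for newer {arg_name}s. Supported {arg_name}(s): "
            "{options}.".format(
                arg_name=arg_name, arg=arg, options=", ".join(available_options)
            )
        )


# Assumption: the scopes listed in the error message are the only ones
# registered for the HuggingFace images at the time of this thread.
HUGGINGFACE_SCOPES = ["training", "inference"]


def check_scope(accelerator_type=None):
    """Hypothetical helper: pick the image scope the way the error suggests."""
    # Assumption: attaching an accelerator switches the requested scope to "eia".
    scope = "eia" if accelerator_type else "inference"
    validate_arg(scope, HUGGINGFACE_SCOPES, "image scope")
    return scope
```

Under these assumptions, `check_scope()` returns `"inference"`, while `check_scope("ml.eia2.medium")` raises the same message reported in the first post, because no image is registered under the `eia` scope.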

ujjirox · September 3, 2021, 4:35pm

Is there by any chance a list of supported instances at this time? Thanks!

philschmid · September 5, 2021, 9:32am

Hey @ujjirox,

Supported instances for what? Training, inference, or both? You can find an overview of the supported instance types for SageMaker here: Amazon SageMaker Pricing – Amazon Web Services (AWS)

ujjirox · September 5, 2021, 4:14pm

Sorry! I should have been clearer. I meant for inference. I had actually tried running inference with ml.inf1.xlarge, but it didn’t seem to work, hence the question.

Thanks.

philschmid · September 6, 2021, 6:19am

Hey @ujjirox,

Inferentia is also not yet supported, since we need to create a separate DLC for the Inferentia instances, but we are on it.
Other than that, every CPU/GPU instance should be supported.
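
As a rough sketch of what this means in practice, a small pre-flight check can reject the unsupported options before calling `deploy()`. The helper name `deploy_kwargs` and its rules are hypothetical; they only encode the support status described in this reply (no EI accelerators, no `ml.inf*` instances) and are not part of the SageMaker SDK.

```python
def deploy_kwargs(instance_type, accelerator_type=None, initial_instance_count=1):
    """Hypothetical helper: build arguments for huggingface_model.deploy().

    Encodes only the support status described in this reply: CPU/GPU
    instances are fine, Elastic Inference and Inferentia are not.
    """
    if accelerator_type is not None:
        # No EI DLC exists, so any accelerator_type leads to the "eia" scope error.
        raise ValueError(
            "Elastic Inference is not supported for HuggingFace DLCs: "
            + accelerator_type
        )
    if instance_type.startswith("ml.inf"):
        # Inferentia needs a separate DLC.
        raise ValueError(
            "Inferentia instances are not supported for HuggingFace DLCs: "
            + instance_type
        )
    return {
        "initial_instance_count": initial_instance_count,
        "instance_type": instance_type,
    }
```

It could be used as `predictor = huggingface_model.deploy(**deploy_kwargs("ml.m5.xlarge"))`, where `huggingface_model` is the model object from the first post.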

asafab · December 28, 2021, 1:22pm

Hey, any news regarding the EI DLC/ INF DLC?

philschmid · December 29, 2021, 10:21am

Hey @asafab,

Yes, I have already opened a PR for the INF DLCs.
You can follow it here: [huggingface_pytorch][NEURON][build] Huggingface Neuron inference DLC by philschmid · Pull Request #1578 · aws/deep-learning-containers · GitHub

When it is merged and available, we will also share it on social media and provide an example.

asafab · December 29, 2021, 10:35am

Very cool!
Thanks for the response. Do you have any news about the EI DLC too?

YannAgora · January 27, 2022, 10:38am

Hey there !!

The PR regarding the INF DLCs seems to have been merged. Does that mean the ml.inf* instance family can now be used with HuggingFace models?

philschmid · January 27, 2022, 10:56am

They are merged but not yet released. I hope they will be available in the next 2 weeks. We’ll let you know on social media.

YannAgora · March 18, 2022, 3:50pm

Hello @philschmid,

I read your article https://www.philschmid.de/huggingface-bert-aws-inferentia about Hugging Face model deployment on Inferentia instances (very good and clear, btw).

Can this method be used for all model types and all tasks? I am thinking particularly of Seq2Seq models (BART, Pegasus, T5) for summarization.

philschmid · March 21, 2022, 12:53pm

Hello @YannAgora,

Yes, it can also be used for T5 or Pegasus. You can find more documentation here: Transformers MarianMT Tutorial — AWS Neuron documentation.
You can then use the NeuronGeneration code inside inference.py.

Vinayaks117 · May 18, 2022, 11:42am

Hi @philschmid

Any updates on EI (Elastic Inference) DLCs for inference?
Can we start using EI accelerators with HuggingFace models for inference?

Thanks

philschmid · May 18, 2022, 1:10pm

What is the error you are seeing when you are trying to deploy an EI backed endpoint?

Vinayaks117 · May 24, 2022, 4:17am

@philschmid As requested, please find the error details below.

ValueError: Unsupported image scope: eia. You may need to upgrade your SDK version (pip install -U sagemaker) for newer image scopes. Supported image scope(s): training, inference.

Sagemaker SDK version = 2.87.0

from sagemaker import get_execution_role
from sagemaker.huggingface import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    model_data=model_data,  # path to your trained SageMaker model
    role=get_execution_role(),  # IAM role with permissions to create an endpoint
    entry_point="deploy_ei.py",  # custom inference script
    transformers_version="4.12.3",  # Transformers version used
    pytorch_version="1.9.1",  # PyTorch version used
    py_version="py38",  # Python version used
    sagemaker_session=sagemaker_session,
)

Vinayaks117 · June 21, 2022, 12:50pm

Hello @philschmid

Any updates on this issue and EI (Elastic Inference) DLCs for inference? Thanks

philschmid · June 22, 2022, 12:28pm

Let me reach out to the AWS team again. I’ll report back here as soon as I hear something.

Out of curiosity, why are you interested in EIA rather than Inferentia?

Vinayaks117 · June 22, 2022, 3:17pm

We are interested in a cost-effective solution and also in hosting multiple models in one container.
But I think we cannot host multiple models in one container behind one endpoint with either Elastic Inference or Inferentia; it is only possible with CPU-based instances. Thanks

Vinayaks117 · October 15, 2022, 2:20pm

Hey @philschmid

Any updates on this issue and EI (Elastic Inference) DLCs for inference?

Thanks
