Databricks model deployments to SageMaker are not working

For the Databricks models, none of the examples for deploying on Amazon SageMaker work. The suggested parameters for deploying as a SageMaker endpoint do not work: the endpoint deployment itself succeeds, but calling the endpoint fails. The suggested parameters for instantiating sagemaker.HuggingFaceModel seem off, and the suggested instance type is definitely too weak.
After adjusting the suggested deployment config to 'transformers_version': '4.26.0', 'pytorch_version': '1.13.1', and a more powerful instance type, I am now told that I need to set the option trust_remote_code=True when the pipeline is called from inside the Docker image deployed on the endpoint. How can I get that parameter passed as True? If that is not possible, then none of the Dolly models are usable as out-of-the-box SageMaker deployments.


Can you please share the code you used to deploy the model you are talking about? It's hard to understand your issue without any details.

The code is already there; it is the snippet suggested on the Hugging Face site. Go to any of the Databricks models (e.g. databricks/dolly-v2-12b · Hugging Face), open the Deploy dropdown in the upper right, select Amazon SageMaker (the only option, actually), then choose pretty much any task (I tried: text generation on AWS). Try to run that code as is. The model deploys, but it then fails on the predict call.
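For reference, this is roughly what the suggested snippet looks like; the exact transformers/pytorch versions and instance type may differ from what the site currently shows, so treat the values below as illustrative:

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()

# Hub configuration taken from the model card's Deploy dropdown
hub = {
    'HF_MODEL_ID': 'databricks/dolly-v2-12b',
    'HF_TASK': 'text-generation',
}

# the versions below are placeholders for whatever the snippet suggests
huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',
    env=hub,
    role=role,
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge',  # the suggested (too weak) instance type
)

# this is the call that fails with the KeyError shown in the logs below
predictor.predict({'inputs': 'Explain the difference between nuclear fission and fusion.'})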

cloudwatch logs:

com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/configuration_auto.py", line 372, in __getitem__
com.amazonaws.ml.mms.wlm.WorkerLifeCycle - raise KeyError(key)
[INFO ] W-databricks__dolly-v2-3b-4-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - KeyError: 'gpt_neox'
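
If I understand the error correctly, the transformers version pinned in the suggested container simply predates gpt_neox support, so the config lookup fails. A quick local sanity check (outside SageMaker) that illustrates this, assuming a recent transformers install:

from transformers import AutoConfig

# resolves fine on a recent transformers release; on the older version
# bundled with the suggested container the same lookup raises KeyError: 'gpt_neox'
cfg = AutoConfig.from_pretrained('databricks/dolly-v2-3b')
print(cfg.model_type)  # gpt_neox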

Fiddling with the suggested parameters does seem to fix the gpt_neox KeyError. These are the adjustments I made:

# only transformers_version, pytorch_version, and py_version changed;
# hub and role are the same as in the suggested snippet
huggingface_model = HuggingFaceModel(
    transformers_version='4.26.0',
    pytorch_version='1.13.1',
    py_version='py39',
    env=hub,
    role=role,
)

But that brings us to the following error (I also tried with an ml.g5.8xlarge instance type - same error):

com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.9/site-packages/transformers/pipelines/__init__.py", line 704, in pipeline
com.amazonaws.ml.mms.wlm.WorkerLifeCycle - ValueError: Loading this pipeline requires you to execute the code in the pipeline file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option trust_remote_code=True to remove this error.

This has to do with the way the pipeline is called from within the SageMaker Docker image. For some reason it is considered a "custom" pipeline, and one has to explicitly set trust_remote_code=True when calling it. How can I get the already built and published Docker image to pass trust_remote_code=True when calling the internal prediction pipeline? Or maybe I am doing something else wrong and that is not necessary at all.
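
The only workaround I can think of is to override the toolkit's model loading with a custom inference script. This is just a sketch of that idea, not something I have verified end to end: it assumes you repackage the model into a model.tar.gz that contains a code/inference.py and point HuggingFaceModel at that artifact via model_data instead of the hub env variables:

# code/inference.py (hypothetical) - override model_fn so the pipeline is
# created with trust_remote_code=True inside the container
from transformers import pipeline

def model_fn(model_dir):
    # model_dir contains the unpacked weights from model.tar.gz;
    # trust_remote_code=True allows the repo's custom pipeline code to load
    return pipeline('text-generation', model=model_dir, trust_remote_code=True)

def predict_fn(data, pipe):
    # minimal request handling: expects {"inputs": "...", "parameters": {...}}
    inputs = data.pop('inputs', data)
    params = data.pop('parameters', {})
    return pipe(inputs, **params)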


Anybody figure this one out?

As expected, it looks like it gets a little hacky. But here is a solution.

@mcapizzi the link you shared is the current best way. Since we don't yet have a way to tell the toolkit that you are okay with using "remote_code", let me add that to our backlog.

Thanks @philschmid ! I’m sure it’s impossible to accommodate every model and type so I’d understand if the subset of models that require “remote code” is so small that it’s not a priority. But I love the current functionality as it is! Got me up and running on some deployment tests in less than an hour.
