Can't deploy conversational HF model on AWS - Logs say model-path not a valid directory

rachelcorey · January 12, 2022, 3:45pm

I am using the given template code to make a SageMaker endpoint from my conversational HF model:

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()
# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'rachelcorey/DialoGPT-medium-kramer',
    'HF_TASK': 'conversational'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.12',
    pytorch_version='1.9',
    py_version='py38',
    env={ 'HF_TASK':'conversational' },
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1, # number of instances
    instance_type='ml.t2.medium' # ec2 instance type
)

I revised the code based on the answers in this thread: Deploying a conversational pipeline on AWS - #3 by philschmid, which does exactly what I want to do. I’ve installed the proper versions of pytorch and transformers.

When I run the code, I get this error:

UnexpectedStatusException: Error hosting endpoint huggingface-pytorch-inference-2022-01-12-15-00-34-671: Failed. Reason:  The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint..

When I go into the logs, I see this message about 100000 times:

ERROR - Given model-path /opt/ml/model is not a valid directory. Point to a valid model-path directory.

And also this message a couple times, but I assume it’s related to the previous error:

subprocess.CalledProcessError: Command '['model-archiver', '--model-name', 'model', '--handler', 'sagemaker_huggingface_inference_toolkit.handler_service', '--model-path', '/opt/ml/model', '--export-path', '/.sagemaker/mms/models', '--archive-format', 'no-archive', '--f']' returned non-zero exit status 1.

I tried creating this directory in /opt/ml/ before running the code, but it seems to have no effect on the issue.

According to some research, this is the directory where SageMaker puts trained models… However, my model is on HF, not trained by SageMaker… Should I put the model files in that directory? Not sure what to do here. Should I ask AWS support for help with this instead of posting here? Any help is much appreciated, thank you in advance!

philschmid · January 12, 2022, 5:35pm

Hey @rachelcorey,

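You are not passing your hub configuration, which contains your model ID (HF_MODEL_ID), into the HuggingFaceModel, so you are trying to create an endpoint without a model at all. With neither HF_MODEL_ID nor a model artifact, the container has nothing to load, which is why the logs complain about /opt/ml/model.

You just need to change: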
env={ 'HF_TASK':'conversational' } => env=hub
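
Put together, the fix might look like the sketch below. It reuses the values from the original post; missing_hub_keys and deploy_from_hub are hypothetical helper names for illustration, not part of the SageMaker SDK, and the check covers only the two keys this thread relies on. The sagemaker import sits inside the function so the check can run without the SDK or AWS credentials.

```python
# Hub model configuration, taken from the original post.
hub = {
    'HF_MODEL_ID': 'rachelcorey/DialoGPT-medium-kramer',
    'HF_TASK': 'conversational',
}


def missing_hub_keys(env):
    """Hypothetical helper (not an SDK function): return the keys absent from env."""
    return [key for key in ('HF_MODEL_ID', 'HF_TASK') if not env.get(key)]


def deploy_from_hub(env, role):
    """Hypothetical wrapper: needs the sagemaker package and AWS credentials to run."""
    # Imported lazily so missing_hub_keys can be used without the SDK installed.
    from sagemaker.huggingface import HuggingFaceModel

    missing = missing_hub_keys(env)
    if missing:
        # Fail early instead of waiting for the endpoint health check to fail.
        raise ValueError(f"env is missing {missing}; the endpoint would have no model")

    huggingface_model = HuggingFaceModel(
        transformers_version='4.12',
        pytorch_version='1.9',
        py_version='py38',
        env=env,  # the full hub dict, not only HF_TASK
        role=role,
    )
    return huggingface_model.deploy(
        initial_instance_count=1,      # number of instances
        instance_type='ml.t2.medium',  # instance type from the original post
    )
```

In a SageMaker notebook this would be called as deploy_from_hub(hub, sagemaker.get_execution_role()). Passing the original env={ 'HF_TASK':'conversational' } instead would raise the ValueError before any endpoint is created.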

rachelcorey · January 13, 2022, 12:59am

OMG, thank you so much! I feel so silly but thank you for helping me out! <3

rachelcorey · January 13, 2022, 2:30pm

I can extra confirm this solved my issue, @philschmid ! I’m up and running now, no one can stop me! Thanks again!

philschmid · January 13, 2022, 3:22pm

Great to hear! Let us know if you have any other questions.
