I am deploying a Hugging Face model with SageMaker. This is the code I use to deploy:
```python
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel

sess = sagemaker.Session()
role = sagemaker.get_execution_role()
client = boto3.client('sagemaker')

def new_endpoint(name, model_path, instance):
    # Create the HuggingFaceModel
    model = HuggingFaceModel(
        model_data=model_path + "output/model.tar.gz",
        role=role,
        transformers_version='4.28',
        pytorch_version='2.0',
        py_version='py310',
    )
    # Deploy the model to a real-time endpoint
    deployed = model.deploy(
        initial_instance_count=1,
        instance_type=instance,
        endpoint_name=name,
    )
    return deployed
```
This has previously worked without issue, but in the last few days the model no longer deploys successfully, even though I have not modified the code. Looking in the CloudWatch logs, I can see the line that causes the error:
```
2024-06-11T13:30:54,406 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - from sagemaker_huggingface_inference_toolkit.transformers_utils import (
…
2024-06-11T13:30:54,406 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - from transformers.pipelines import Conversation, Pipeline
```
I have looked at the source code of the `transformers_utils.py` file in the inference toolkit, and the failing import is in this block:
```python
from huggingface_hub import HfApi, login, snapshot_download
from transformers import AutoTokenizer, pipeline
from transformers.file_utils import is_tf_available, is_torch_available
from transformers.pipelines import Conversation, Pipeline
```
I have been unable to resolve this issue. Testing locally, this import fails for me as well. What surprises me most is that the error appeared spontaneously, without the sagemaker-huggingface-inference-toolkit or transformers libraries having been updated.
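For reference, this is roughly how I probe the import locally (the `has_symbol` helper is just my own sketch using the standard library; `transformers.pipelines` / `Conversation` are the names from the failing line above):

```python
import importlib

def has_symbol(module_name, symbol):
    """Return True if `symbol` is importable from `module_name`, else False."""
    try:
        mod = importlib.import_module(module_name)
    except ImportError:
        # Module itself is missing or fails to import
        return False
    return hasattr(mod, symbol)

# Probe the specific import that fails in the container logs;
# on my machine this now comes back False:
conversation_ok = has_symbol("transformers.pipelines", "Conversation")
```

Running the same probe against `Pipeline` still succeeds for me, which is why I suspect only the `Conversation` symbol is affected.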
Any help would be appreciated, thanks.