Emotion Model: Additional inference parameter not processed on SageMaker Inferentia instance

I am using "bhadresh-savani/bert-base-go-emotion" via AWS SageMaker on an Inferentia instance.

The model card specifies an additional parameter, return_all_scores, which should return the scores for all emotion labels. This works out of the box with the Hugging Face transformers library and with the Inference API, but it does not work when the model is hosted on an AWS SageMaker Inferentia instance. Inferentia instances use older versions, PyTorch 1.9 and transformers 4.12, and updating to the latest versions of PyTorch and transformers is not an option.
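
For reference, a minimal local check with the transformers pipeline, following the call pattern on the model card (this is how I verified the parameter works outside SageMaker):

from transformers import pipeline

# load the text-classification pipeline for the go_emotion checkpoint
classifier = pipeline(
    "text-classification",
    model="bhadresh-savani/bert-base-go-emotion",
    return_all_scores=True,   # ask for every label, not just the top one
)

# prints a list with one {label, score} entry per emotion label
print(classifier("the mesmerizing performances of the leads keep the film grounded"))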

Do you know how to fix this issue?

from sagemaker.huggingface.model import HuggingFaceModel

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   model_data=s3_model_uri,          # S3 URI of the compiled model.tar.gz
   role=role,                        # IAM role with SageMaker permissions
   transformers_version="4.12",      # transformers version of the inference container
   pytorch_version="1.9",            # PyTorch version of the inference container
   py_version='py37',                # Python version of the inference container
)

# Let SageMaker know that we've already compiled the model via neuron-cc
huggingface_model._is_compiled_model = True

# deploy the endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,          # number of instances behind the endpoint
    instance_type="ml.inf1.xlarge"     # Inferentia instance type
)
data = {
    "inputs": "the mesmerizing performances of the leads keep the film grounded and keep the audience riveted",
    "parameters": {"return_all_scores": True}
}

res = predictor.predict(data=data)
res

Output: [{'label': 'neutral', 'score': 0.7149022817611694}]

thank you

Your inference.py does not handle that parameter; you need to update your inference.py so that it reads return_all_scores from the request and returns the scores for all labels.
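
As a starting point, here is a minimal sketch of such a code/inference.py. It assumes the model was traced with torch.neuron, saved as neuron_model.pt inside the model archive, and padded to 128 tokens at tracing time; the file name, the sequence length, and the input ordering are assumptions and must match how you actually compiled the model.

import os
import torch
import torch.neuron
from transformers import AutoTokenizer, AutoConfig

# name of the Neuron-traced TorchScript file inside model.tar.gz (assumption)
NEURON_MODEL_FILE = "neuron_model.pt"
# must match the sequence length the model was traced with (assumption)
MAX_LENGTH = 128

def model_fn(model_dir):
    # load tokenizer, traced model, and config from the unpacked model archive
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = torch.jit.load(os.path.join(model_dir, NEURON_MODEL_FILE))
    config = AutoConfig.from_pretrained(model_dir)
    return model, tokenizer, config

def predict_fn(data, model_tokenizer_config):
    model, tokenizer, config = model_tokenizer_config
    inputs = data.pop("inputs", data)
    parameters = data.pop("parameters", {}) or {}
    return_all_scores = parameters.get("return_all_scores", False)

    # tokenize with fixed-length padding so the shapes match the traced graph
    encoded = tokenizer(
        inputs,
        return_tensors="pt",
        max_length=MAX_LENGTH,
        padding="max_length",
        truncation=True,
    )
    neuron_inputs = tuple(encoded.values())

    with torch.no_grad():
        logits = model(*neuron_inputs)[0]
    scores = torch.nn.functional.softmax(logits, dim=-1)[0]

    if return_all_scores:
        # one {label, score} entry per class
        return [
            {"label": config.id2label[i], "score": s.item()}
            for i, s in enumerate(scores)
        ]
    # default behaviour: return only the highest-scoring label
    top = int(scores.argmax())
    return [{"label": config.id2label[top], "score": scores[top].item()}]

After repacking model.tar.gz with this file under code/inference.py and redeploying, the same predictor.predict call with "parameters": {"return_all_scores": True} should return one entry per label.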
