Hey @JB2022,
we added support for the conversational pipeline in a later release. Can you use transformers_version="4.12" instead of "4.6", and pytorch_version="1.9" instead of "1.7"?
You can find the whole list of available containers here: Reference
Then your first code snippet should work:
from sagemaker.huggingface import HuggingFaceModel
import sagemaker
role = sagemaker.get_execution_role()
# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'microsoft/DialoGPT-medium',
    'HF_TASK': 'conversational'
}
# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.12',
    pytorch_version='1.9',
    py_version='py38',  # the 4.12 / 1.9 containers are built for Python 3.8, not py36
    env=hub,
    role=role,
)
# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,  # number of instances
    instance_type='ml.m5.xlarge'  # ec2 instance type
)
# send a conversational request: earlier turns plus the new user message
predictor.predict({
    'inputs': {
        "past_user_inputs": ["Which movie is the best ?"],
        "generated_responses": ["It's Die Hard for sure."],
        "text": "Can you explain why ?",
    }
})
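
If you want to keep the conversation going, you can feed the previous response back into the next request. Here is a minimal sketch of that. It assumes the endpoint answers with a dict containing `generated_text` and a `conversation` entry holding the updated `past_user_inputs` and `generated_responses`; please check the actual response of your endpoint first. `next_turn_payload` and `example_response` are hypothetical names I made up for illustration, they are not part of the SDK.

```python
def next_turn_payload(response, user_text):
    """Build the request body for the next turn from the previous response.

    Assumes `response` looks like:
    {"generated_text": ..., "conversation": {"past_user_inputs": [...],
                                             "generated_responses": [...]}}
    """
    conversation = response.get("conversation", {})
    return {
        "inputs": {
            # copy the history so the original response is not mutated
            "past_user_inputs": list(conversation.get("past_user_inputs", [])),
            "generated_responses": list(conversation.get("generated_responses", [])),
            "text": user_text,
        }
    }


# Hypothetical response, only used to illustrate the assumed shape
example_response = {
    "generated_text": "It's the best Christmas movie.",
    "conversation": {
        "past_user_inputs": ["Which movie is the best ?", "Can you explain why ?"],
        "generated_responses": ["It's Die Hard for sure.", "It's the best Christmas movie."],
    },
}

payload = next_turn_payload(example_response, "Any other recommendation ?")
# With a deployed endpoint you would then call: predictor.predict(payload)
```

When you are done testing, `predictor.delete_endpoint()` removes the endpoint so it stops incurring costs.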