I’ve successfully compiled my own fine-tuned model (I had to use an Inf1 SageMaker notebook, so I couldn’t run the validation): csabakecskemeti/bert-base-case-yelp5-tuned-experiment · Hugging Face, and deployed it on an inf2.xlarge with optimum-neuron. The model has a 5-label classification head and was fine-tuned on the Yelp-5 dataset.
When I tried to predict, I received the following error:
"message": "Could not load model /.sagemaker/mms/models/csabakecskemeti__bert-base-case-yelp5-tuned-experiment with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForSequenceClassification'>, <class 'transformers.models.bert.modeling_bert.BertForSequenceClassification'>)."
Deployment code:
from sagemaker.huggingface.model import HuggingFaceModel

# HF_TASK list: https://huggingface.co/docs/transformers/main_classes/pipelines
config = {
    "HF_MODEL_ID": "csabakecskemeti/bert-base-case-yelp5-tuned-experiment",  # model_id from hf.co/models
    "HF_TASK": "text-classification",  # NLP task you want to use for predictions
    "HF_BATCH_SIZE": "1",              # batch size used to compile the model
    "MAX_BATCH_SIZE": "1",             # max batch size for the model
    "HF_SEQUENCE_LENGTH": "128",       # sequence length used to compile the model
}

# create Hugging Face Model class
huggingface_model = HuggingFaceModel(
    env=config,
    model_data=s3_model_uri,        # path to your model and script
    role=role,                      # IAM role with permissions to create an endpoint
    transformers_version="4.28.1",  # transformers version used
    pytorch_version="1.13.0",       # pytorch version used
    py_version="py38",              # python version used
    model_server_workers=2,         # number of workers for the model server
)

# let SageMaker know that we've already compiled the model
huggingface_model._is_compiled_model = True

# deploy the endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,       # number of instances
    instance_type="ml.inf2.xlarge"  # AWS Inferentia2 instance
)
If I change the HF_MODEL_ID to the original BERT checkpoint, "google-bert/bert-base-cased", text classification starts working, although the results are, as expected, not correct.
Does anyone have a hint about what’s going on?
Side note: I also found that the Inference Examples widget has stopped working on my model card, and I’ve seen the same for other "text-classification" models. I don’t know whether it’s related, i.e. a general "text-classification" pipeline problem, or something specific to my model.
All suggestions are welcome and appreciated.