Deployment issue on AWS SageMaker and GCP

Hi Team,

I tried deploying the StarCoder2-15B model on both AWS SageMaker and GCP.

On both platforms, the deployment fails with the error below:

raise ValueError(
ValueError: The checkpoint you are trying to load has model type starcoder2 but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

For SageMaker, I followed the same steps given in the Deployment tab:


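from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

llm_model_image = get_huggingface_llm_image_uri(
    'huggingface', version='1.4.2', session=sess
)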

config = {
    'model_id': 'bigcode/starcoder2-15b',
    'instance_type': 'ml.g5.2xlarge',
    'num_gpus': '1',
    # other keys, such as 'endpoint_name', are not shown here
}

hp = hyperparameters(config)

# Create the HuggingFaceModel with the image URI

llm_model = HuggingFaceModel(
    role=role,
    image_uri=llm_model_image,
    env=hp,
)

estimator = llm_model.deploy(
    initial_instance_count=1,
    instance_type=config['instance_type'],
    endpoint_name=config['endpoint_name'],
    container_startup_health_check_timeout=600,  # 10 minutes to be able to load the model
)


In a standalone notebook, I am able to download the model using Transformers version 4.39.2.
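
Since the error message points at an outdated Transformers version, one way to narrow this down is to compare the Transformers version inside the serving environment with the 4.39.2 that worked in the notebook. Below is a minimal sketch of such a check. The helper names (`version_tuple`, `is_at_least`, `installed_transformers_version`) are hypothetical, and it assumes you can run Python inside the environment you want to inspect. It only uses the standard library and treats 4.39.2 as a known-good reference, not as a confirmed minimum.

```python
import re
from importlib import metadata

# Version that loaded the model in the standalone notebook (taken from the post).
# This is a known-good reference, not a confirmed minimum requirement.
KNOWN_GOOD = "4.39.2"


def version_tuple(version: str) -> tuple:
    """Return up to three leading numeric components, e.g. '4.39.2' -> (4, 39, 2)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component (e.g. 'dev0')
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(installed: str, required: str = KNOWN_GOOD) -> bool:
    """True if the installed version is the same as or newer than the required one."""
    return version_tuple(installed) >= version_tuple(required)


def installed_transformers_version():
    """Return the installed transformers version string, or None if it is missing."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed in this environment")
    else:
        print(f"transformers {found}: at least {KNOWN_GOOD}? {is_at_least(found)}")
```

If the check reports an older version in the environment that fails, that would be consistent with the error above, and trying a container image that bundles a newer Transformers release would be the natural next step.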

Could you please help with deploying this model on SageMaker/GCP?