The expanded size of the tensor (22528) must match the existing size (1024) at non-singleton dimension 0

I get the error below when trying to deploy a Hugging Face model to Amazon SageMaker.

Error:

RuntimeError: The expanded size of the tensor (22528) must match the existing size (1024) at non-singleton dimension 0.  Target sizes: [22528, 8192].  Tensor sizes: [1024, 8192]
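
For context, as far as I understand, this kind of message is raised when a tensor is expanded to a target shape and one of its dimensions is neither 1 nor equal to the matching target dimension. Below is a minimal pure-Python sketch of that rule, applied to the shapes in the error. The helper name `can_expand` is hypothetical and not part of PyTorch, and the sketch ignores special cases such as `-1` in the target shape.

```python
def can_expand(tensor_shape, target_shape):
    """Simplified version of the shape rule behind the error message.

    A dimension can be expanded only if it is 1 (a "singleton") or
    already equals the target size. Hypothetical helper, not a PyTorch API.
    """
    if len(tensor_shape) > len(target_shape):
        return False
    # Shapes are compared from the trailing dimension backwards.
    offset = len(target_shape) - len(tensor_shape)
    for i, existing in enumerate(tensor_shape):
        target = target_shape[offset + i]
        if existing != 1 and existing != target:
            return False
    return True


# Shapes taken from the error message: dimension 0 is 1024, not 1 or 22528.
print(can_expand((1024, 8192), (22528, 8192)))  # False
# A singleton dimension 0 would be expandable.
print(can_expand((1, 8192), (22528, 8192)))     # True
```

So the tensor being loaded has 1024 rows where 22528 are expected, and 1024 is not a singleton dimension, which is why the check fails at dimension 0.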

SageMaker Instance: ml.g4dn.2xlarge

Code to deploy HF model to Amazon SageMaker:

import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

# Use the notebook's execution role, or look it up by name when running outside SageMaker
try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Environment variables passed to the LLM inference container
hub = {
    'HF_MODEL_ID': 'meta-llama/Llama-2-7b-chat-hf',
    'SM_NUM_GPUS': json.dumps(1),
    'HUGGING_FACE_HUB_TOKEN': '<HF_TOKEN>'  # placeholder: replace with your Hugging Face access token
}

huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="0.8.2"),
    env=hub,
    role=role
)

# Deploy to a single-GPU instance; the long timeout allows time for the model download
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.2xlarge",
    container_startup_health_check_timeout=1800
)