Mistral AI SageMaker deployment failing

Hi :wave:
I'm trying to deploy the mistralai/Mistral-7B-Instruct-v0.1 model on AWS SageMaker using the Hugging Face LLM DLC. Here is my code:

import json
from sagemaker.huggingface import HuggingFaceModel
from sagemaker.huggingface import get_huggingface_llm_image_uri

# retrieve the llm image uri
llm_image = get_huggingface_llm_image_uri(
  "huggingface",
  version="1.0.3"
)

# sagemaker config
instance_type = "ml.g5.2xlarge"
number_of_gpu = 1
health_check_timeout = 600

# Define Model and Endpoint configuration parameter
config = {
  'HF_MODEL_ID': "mistralai/Mistral-7B-Instruct-v0.1",
  'SM_NUM_GPUS': json.dumps(number_of_gpu),
  'MAX_INPUT_LENGTH': json.dumps(2048),
  'MAX_TOTAL_TOKENS': json.dumps(4096),
  'MAX_BATCH_TOTAL_TOKENS': json.dumps(8192),
  'HUGGING_FACE_HUB_TOKEN': "hf_****"
}

# create HuggingFaceModel with the image uri
llm_model = HuggingFaceModel(
  role=role,
  image_uri=llm_image,
  env=config,
  code_location=f"s3://{S3_BUCKET}/"
)

llm = llm_model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    endpoint_name="mistral-7b-instruct",
    container_startup_health_check_timeout=health_check_timeout,
    tags=[{"Key": "ENV", "Value": "dev"}]
)

The deployment fails with the following errors:

Download encountered an error: Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 195, in download_weights
    utils.convert_files(local_pt_files, local_st_files, discard_names)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 106, in convert_files
    convert_file(pt_file, sf_file, discard_names)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 68, in convert_file
    to_removes = _remove_duplicate_names(loaded, discard_names=discard_names)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 32, in _remove_duplicate_names
    raise RuntimeError(
RuntimeError: Error while trying to find names to remove to save state dict, but found no suitable name to keep for saving amongst: {'model.norm.weight'}. None is covering the entire storage.Refusing to save/load the model since you could be storing much more memory than needed. Please refer to https://huggingface.co/docs/safetensors/torch_shared_tensors for more information. Or open an issue.

Any idea how to solve this issue and make the deployment work? Thanks in advance for your help :pray:

Mistral is not supported in 1.0.3, since the model was released after that version; please try 1.1.0.
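
For reference, the only change needed should be the version argument when retrieving the image URI (a minimal sketch; everything else stays as in your code):

# retrieve a TGI image version that includes Mistral support
llm_image = get_huggingface_llm_image_uri(
  "huggingface",
  version="1.1.0"
)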


Thanks a lot for the quick reply :pray: Indeed, with version 1.1.0 it works like a charm!
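
In case it helps anyone else, this is roughly how I query the endpoint afterwards (the prompt text and generation parameters are just example values; llm is the predictor returned by deploy() above):

# example request against the deployed endpoint
prompt = "<s>[INST] What is Amazon SageMaker? [/INST]"
response = llm.predict({
  "inputs": prompt,
  "parameters": {
    "max_new_tokens": 256,
    "temperature": 0.7
  }
})
print(response[0]["generated_text"])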

Hey, I have fine-tuned zephyr-7b with QLoRA and am now trying to deploy it to an endpoint, but I got an error saying unsupported model type mistral. Can you share what I can do to solve this?