Mistral AI SageMaker deployment failing

Hi :wave:
I'm trying to deploy the mistralai/Mistral-7B-Instruct-v0.1 model on AWS SageMaker using the Hugging Face LLM DLC. Here is my code:

import json
from sagemaker.huggingface import HuggingFaceModel
from sagemaker.huggingface import get_huggingface_llm_image_uri

# retrieve the llm image uri
llm_image = get_huggingface_llm_image_uri(
  "huggingface",
  version="1.0.3"
)

# sagemaker config
instance_type = "ml.g5.2xlarge"
number_of_gpu = 1
health_check_timeout = 600

# Define Model and Endpoint configuration parameter
config = {
  'HF_MODEL_ID': "mistralai/Mistral-7B-Instruct-v0.1",
  'SM_NUM_GPUS': json.dumps(number_of_gpu),
  'MAX_INPUT_LENGTH': json.dumps(2048),
  'MAX_TOTAL_TOKENS': json.dumps(4096),
  'MAX_BATCH_TOTAL_TOKENS': json.dumps(8192),
}

# create HuggingFaceModel with the image uri
llm_model = HuggingFaceModel(
  role=role,  # IAM execution role with SageMaker permissions (defined elsewhere)
  image_uri=llm_image,
  env=config
)

llm = llm_model.deploy(
  initial_instance_count=1,
  instance_type=instance_type,
  container_startup_health_check_timeout=health_check_timeout,
  tags=[{"Key": "ENV", "Value": "dev"}]
)

The deployment fails with the following errors:

Download encountered an error: Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 195, in download_weights
    utils.convert_files(local_pt_files, local_st_files, discard_names)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 106, in convert_files
    convert_file(pt_file, sf_file, discard_names)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 68, in convert_file
    to_removes = _remove_duplicate_names(loaded, discard_names=discard_names)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 32, in _remove_duplicate_names
    raise RuntimeError(
RuntimeError: Error while trying to find names to remove to save state dict, but found no suitable name to keep for saving amongst: {'model.norm.weight'}. None is covering the entire storage.Refusing to save/load the model since you could be storing much more memory than needed. Please refer to https://huggingface.co/docs/safetensors/torch_shared_tensors for more information. Or open an issue.

Any idea how to solve this issue and make the deployment work? Thanks in advance for your help :pray:

Mistral is not supported in 1.0.3, since the model was released after that version; please try 1.1.0.
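Assuming the same `get_huggingface_llm_image_uri` call as in the question, the fix would just be bumping the container version:

# request the TGI container version that supports Mistral
llm_image = get_huggingface_llm_image_uri(
  "huggingface",
  version="1.1.0"
)

The rest of the deployment code can stay unchanged.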


Thanks a lot for the quick reply :pray: Indeed, with version 1.1.0 it works like a charm!

Hey, I have fine-tuned zephyr-7b with QLoRA. Now I am trying to deploy it to an endpoint, but I got this error:

It says unsupported model type mistral. Can you share what I can do to solve this?