ValueError: Unsupported model type mllama

I am looking to deploy "meta-llama/Llama-3.2-11B-Vision-Instruct" to AWS SageMaker, but I am getting the error below while deploying.

Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 118, in serve
    server.serve(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 297, in serve
    asyncio.run(
  File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 231, in serve_inner
    model = get_model(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 1117, in get_model
    raise ValueError(f"Unsupported model type {model_type}")

SageMaker config:

instance_type = "ml.g5.4xlarge"
number_of_gpu = 1
health_check_timeout = 1000

Define the Model and Endpoint configuration parameters:

import json

config = {
    "HF_MODEL_ID": "meta-llama/Llama-3.2-11B-Vision-Instruct",
    "SM_NUM_GPUS": json.dumps(number_of_gpu),  # Number of GPUs used per replica
    "MAX_INPUT_LENGTH": json.dumps(6000),      # Max length of the input text
    "MAX_TOTAL_TOKENS": json.dumps(8192),      # Max length of the generation (including input text)
}
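
The deploy step itself is the usual HuggingFaceModel pattern from the sagemaker SDK. A minimal sketch using the config above; the execution role and the HF_TOKEN value are placeholders (the repo is gated, so the container needs a Hugging Face token to download the weights):

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # placeholder: your SageMaker execution role

llm_model = HuggingFaceModel(
    role=role,
    # the TGI DLC I am deploying with (see below)
    image_uri="763104351884.dkr.ecr.eu-central-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.3.0-gpu-py310-cu121-ubuntu22.04",
    env={**config, "HF_TOKEN": "<your-hf-token>"},
)

llm = llm_model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    container_startup_health_check_timeout=health_check_timeout,
)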

I am using the container 763104351884.dkr.ecr.eu-central-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.3.0-gpu-py310-cu121-ubuntu22.04.

Should I be using a different container?

It looks like 2.3.0 is still compatible with Llama 3.2. Do other models work?

The above issue is resolved after switching to TGI container 2.4.
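
For anyone hitting the same error: one way to pick up the 2.4 image is to let the SDK resolve the ECR URI instead of hard-coding it. This assumes the installed sagemaker SDK already lists a 2.4.x TGI image (upgrade the SDK if the lookup fails):

from sagemaker.huggingface import get_huggingface_llm_image_uri

# Resolves the regional Hugging Face TGI DLC URI for the requested version.
llm_image = get_huggingface_llm_image_uri("huggingface", version="2.4.0")
print(llm_image)

Pass the result as image_uri to HuggingFaceModel with the same config as above.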
