KeyError when trying to deploy an Inference Endpoint

Hey folks, super noob here trying to deploy a model to an Inference Endpoint to try it out. The model I’m trying to deploy is truehealth/LLaVar, and the endpoint hardware configuration is GPU · Nvidia Tesla T4 · 4x GPU · 16 GB.

I’m seeing the following error in the endpoint logs on startup. Does anyone know what the issue could be? Is there anything I can do on my end to fix this? Thank you so much in advance!

2023/11/03 10:26:40 ~ INFO | No custom pipeline found at /repository/handler.py
2023/11/03 10:26:40 ~ 2023-11-03 17:26:40,161 | INFO | Initializing model from directory:/repository
2023/11/03 10:26:40 ~ INFO | Using device GPU
2023/11/03 10:26:40 ~ Traceback (most recent call last):
2023/11/03 10:26:40 ~   File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 705, in lifespan
2023/11/03 10:26:40 ~     async with self.lifespan_context(app) as maybe_state:
2023/11/03 10:26:40 ~   File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 584, in __aenter__
2023/11/03 10:26:40 ~     await self._router.startup()
2023/11/03 10:26:40 ~   File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 682, in startup
2023/11/03 10:26:40 ~     await handler()
2023/11/03 10:26:40 ~   File "/app/webservice_starlette.py", line 57, in some_startup_task
2023/11/03 10:26:40 ~     inference_handler = get_inference_handler_either_custom_or_default_handler(HF_MODEL_DIR, task=HF_TASK)
2023/11/03 10:26:40 ~   File "/app/huggingface_inference_toolkit/handler.py", line 45, in get_inference_handler_either_custom_or_default_handler
2023/11/03 10:26:40 ~     return HuggingFaceHandler(model_dir=model_dir, task=task)
2023/11/03 10:26:40 ~   File "/app/huggingface_inference_toolkit/handler.py", line 17, in __init__
2023/11/03 10:26:40 ~     self.pipeline = get_pipeline(model_dir=model_dir, task=task)
2023/11/03 10:26:40 ~   File "/app/huggingface_inference_toolkit/utils.py", line 261, in get_pipeline
2023/11/03 10:26:40 ~     hf_pipeline = pipeline(task=task, model=model_dir, device=device, **kwargs)
2023/11/03 10:26:40 ~   File "/opt/conda/lib/python3.9/site-packages/transformers/pipelines/__init__.py", line 705, in pipeline
2023/11/03 10:26:40 ~     config = AutoConfig.from_pretrained(model, _from_pipeline=task, **hub_kwargs, **model_kwargs)
2023/11/03 10:26:40 ~   File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 998, in from_pretrained
2023/11/03 10:26:40 ~     config_class = CONFIG_MAPPING[config_dict["model_type"]]
2023/11/03 10:26:40 ~   File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 710, in __getitem__
2023/11/03 10:26:40 ~     raise KeyError(key)
2023/11/03 10:26:40 ~ KeyError: 'llava'
2023/11/03 10:26:40 ~ Application startup failed. Exiting.

@pranavbadami, were you able to fix this?

Facing the same issue here. Is there any way to fix it?
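
Not a confirmed fix, but reading the traceback: the failure is at CONFIG_MAPPING[config_dict["model_type"]] inside AutoConfig.from_pretrained, so the transformers version in the endpoint image does not seem to have 'llava' registered as a model type. The log also says no custom handler.py was found, so the default pipeline path is what's being used. A minimal sketch to reproduce the check locally is below. The helper name diagnose_model_type, the example_config string, and the fallback set of model types are made up for illustration; only the CONFIG_MAPPING import comes from transformers.

```python
import json


def diagnose_model_type(config_text, known_model_types):
    """Report whether the model_type in a config.json string is a known type."""
    model_type = json.loads(config_text).get("model_type")
    if model_type is None:
        return "config.json has no 'model_type' field"
    if model_type in known_model_types:
        return f"'{model_type}' is registered"
    return f"'{model_type}' is not registered (AutoConfig would raise KeyError)"


# Hypothetical stand-in for the repo's config.json; only model_type matters here.
example_config = '{"model_type": "llava"}'

try:
    # Real registry, if transformers is installed in this environment.
    from transformers.models.auto.configuration_auto import CONFIG_MAPPING
    known = CONFIG_MAPPING
except ImportError:
    # Assumed placeholder so the sketch still runs without transformers.
    known = {"bert", "gpt2", "llama"}

print(diagnose_model_type(example_config, known))
```

If this prints "not registered" with the same transformers version the endpoint uses, the two directions that seem worth trying are a newer transformers release that knows the model type, or a custom handler.py in the model repo that loads the model itself. Whether either works for this particular checkpoint is something I have not verified.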