Failing to deploy newer models

Hi,

I am trying to deploy Qwen2.5-VL-7B-Instruct and DeepSeek-VL2-tiny to an Inference Endpoints instance (GPU · Nvidia L40S · 1x GPU · 48 GB).
This is what I get (pretty much the same for both):

[Server message]Endpoint failed to start
Exit code: 3. Reason: py", line 569, in __aenter__
    await self._router.startup()
  File "/usr/local/lib/python3.11/dist-packages/starlette/routing.py", line 670, in startup
    await handler()
  File "/app/webservice_starlette.py", line 62, in prepare_model_artifacts
    inference_handler = get_inference_handler_either_custom_or_default_handler(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/huggingface_inference_toolkit/handler.py", line 159, in get_inference_handler_either_custom_or_default_handler
    return HuggingFaceHandler(model_dir=model_dir, task=task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/huggingface_inference_toolkit/handler.py", line 22, in __init__
    self.pipeline = get_pipeline(
                    ^^^^^^^^^^^^^
  File "/app/huggingface_inference_toolkit/utils.py", line 252, in get_pipeline
    hf_pipeline = pipeline(task=task, model=model_dir, device=device, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/pipelines/__init__.py", line 849, in pipeline
    config = AutoConfig.from_pretrained(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/configuration_auto.py", line 1073, in from_pretrained
    raise ValueError(
ValueError: The checkpoint you are trying to load has model type `deepseek_vl_v2` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`

Application startup failed. Exiting.
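So the stock container's Transformers release apparently has no entry for the `deepseek_vl_v2` (or `qwen2_5_vl`) model type in its architecture registry. The same check reproduces locally (a sketch; it assumes Hub access and whichever Transformers version the environment happens to have):

    from transformers import AutoConfig

    # Raises the same ValueError on releases that do not know the checkpoint's
    # model_type; succeeds once the installed release supports the architecture.
    AutoConfig.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")    # needs a release with qwen2_5_vl
    AutoConfig.from_pretrained("deepseek-ai/deepseek-vl2-tiny")  # deepseek_vl_v2 may still need a source install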

Then I try to use a custom container:

huggingface/transformers-all-latest-gpu:latest

And this is what I get:

[Server message]Endpoint failed to start
Exit code: 0. Reason: application does not seem to be initialized

What am I doing wrong?
Any tips would be highly appreciated.


It looks like the Transformers version in the default container is too old to recognize DeepSeek, but can users upgrade Transformers in an Inference Endpoint…?


My understanding is that the default container ships a version of Transformers that is too old to run those models.
That should be solvable by using a custom container with a newer version.
The only problem is that the custom container doesn’t work either, for some reason.
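A likely explanation for the second failure (Exit code: 0, "application does not seem to be initialized"): huggingface/transformers-all-latest-gpu is a CI image used for testing the library, and it does not start any web server, so the endpoint's health check never passes. A custom container for Inference Endpoints has to run an HTTP server that answers on the port and health route configured in the endpoint settings. A minimal sketch of that contract (the port, routes, and payload shape below are assumptions, not something confirmed in this thread):

    # server.py: minimal sketch of the HTTP contract a custom container must serve.
    # Assumes container port 8080 and health route /health, matching whatever is
    # configured in the endpoint settings; the request/response format is illustrative.
    import uvicorn
    from fastapi import FastAPI, Request

    app = FastAPI()

    @app.get("/health")
    def health():
        # The platform polls this route; a 200 response marks the app as initialized.
        return {"status": "ok"}

    @app.post("/")
    async def predict(request: Request):
        payload = await request.json()
        # ... load/run the model on payload["inputs"] here ...
        return {"generated_text": "..."}

    if __name__ == "__main__":
        uvicorn.run(app, host="0.0.0.0", port=8080)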


The default container ships a Transformers version from last March, so it is quite old.
If the custom container doesn’t work, I guess users will have to do something with the requirements.txt, e.g. pin a newer release:

transformers==4.48.2

Default container:

  • transformers[sklearn,sentencepiece,audio,vision]: 4.38.2
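If the goal is just a newer library inside the default container, Inference Endpoints also picks up a requirements.txt from the model repository root, and a handler.py with an EndpointHandler class replaces the default pipeline. A rough sketch for the Qwen model (the class name and __call__ signature follow the custom-handler convention; the processor call and generation parameters are illustrative, and note that 4.48.2 still predates Qwen2.5-VL support, which I believe landed in 4.49.0; DeepSeek-VL2 may not be covered by any release yet):

    # requirements.txt (repo root), hypothetical pin:
    #   transformers>=4.49.0

    # handler.py (repo root)
    from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

    class EndpointHandler:
        def __init__(self, path=""):
            # path is the local model directory the toolkit passes in
            self.processor = AutoProcessor.from_pretrained(path)
            self.model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
                path, torch_dtype="auto", device_map="auto"
            )

        def __call__(self, data):
            # Text-only round trip for brevity; images would go through the processor too.
            inputs = self.processor(text=data["inputs"], return_tensors="pt").to(self.model.device)
            output_ids = self.model.generate(**inputs, max_new_tokens=128)
            return {"generated_text": self.processor.batch_decode(output_ids, skip_special_tokens=True)[0]}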

Hi @noanem, have you deleted the endpoint? We can take a closer look.

You may already know this, but we have a catalog of ready-to-deploy models that don’t require additional customization: Inference Catalog | Inference Endpoints by Hugging Face. Some of those might work for your use case.
