Inference Endpoint Deployment Error

I have put the key and value in the secret env. I have also been granted access to the base model. But I still get this error:

[Server message] Endpoint failed to start
Exit code: 3. Reason: File "/app/huggingface_inference_toolkit/handler.py", line 22, in __init__
    self.pipeline = get_pipeline(
                    ^^^^^^^^^^^^^
  File "/app/huggingface_inference_toolkit/utils.py", line 252, in get_pipeline
    hf_pipeline = pipeline(task=task, model=model_dir, device=device, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/pipelines/__init__.py", line 849, in pipeline
    config = AutoConfig.from_pretrained(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/configuration_auto.py", line 1054, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/configuration_utils.py", line 591, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/configuration_utils.py", line 650, in _get_config_dict
    resolved_config_file = cached_file(
                           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/utils/hub.py", line 421, in cached_file
    raise EnvironmentError(
OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/meta-llama/Llama-2-7b-hf.
401 Client Error. (Request ID: Root=1-67a47ce3-3bf42d400bdb981152276ec0;1523df3a-9f78-4109-843b-c4bdb4270ee8)

Cannot access gated repo for url https://huggingface.co/meta-llama/Llama-2-7b-hf/resolve/main/config.json.
Access to model meta-llama/Llama-2-7b-hf is restricted. You must have access to it and be authenticated to access it. Please log in.

Application startup failed. Exiting.

https://huggingface.co/RichardLu/Llama2_7B_ABSA_Lap14

The link above is the model I am trying to deploy. I have also created a custom handler. Please help.


The error is clearly token-related. Since you could access the base model during training, it seems the token is either not being passed correctly or the wrong token is being passed when the endpoint runs.
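As a first sanity check, you can confirm the secret is actually visible inside the container. This is a minimal sketch, assuming you stored the secret under the name HF_TOKEN, which is the variable huggingface_hub picks up automatically (the legacy HUGGING_FACE_HUB_TOKEN also works); if you named it something else, check that name instead:

```python
import os

# Confirm the endpoint actually exposes the secret to the process.
# Assumption: the secret was saved under the name HF_TOKEN.
print("HF_TOKEN set:", bool(os.environ.get("HF_TOKEN")))
```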

The most forceful and reliable method is to pass the token directly to the function that loads the model; the next best method is to use login().
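A minimal sketch of both options inside a custom handler, assuming the secret is exposed as the env var HF_TOKEN and the handler loads the gated base model (the task name "text-generation" here is an assumption about your setup):

```python
import os

from huggingface_hub import login
from transformers import pipeline

# Assumption: the endpoint secret is exposed as the env var HF_TOKEN.
token = os.environ.get("HF_TOKEN")

# Option 1 (most reliable): pass the token directly to the loading call,
# so the gated base model can be downloaded regardless of global state.
pipe = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-hf",
    token=token,
)

# Option 2: authenticate the whole process once, before any model is
# loaded; subsequent downloads reuse the stored credential.
login(token=token)
```

If you named the secret something else in the endpoint settings, either read that variable instead or rename it to HF_TOKEN so the hub client finds it without code changes.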

