Facing an issue using a model hosted on Hugging Face servers and talking to it with an API key

I am trying to create a simple LangChain text-generation app that uses the Hugging Face API to communicate with models hosted on Hugging Face servers.

I created a ".env" file and stored my key in the variable "HUGGINGFACEHUB_API_TOKEN". I also checked that the API token is valid.
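
For reference, the ".env" file contains a single line like this (the value shown is a placeholder, not a real token):

    HUGGINGFACEHUB_API_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxx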

After that, I tried running this code snippet:

    from dotenv import load_dotenv
    from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint

    # Load HUGGINGFACEHUB_API_TOKEN from the .env file into the environment
    load_dotenv()

    # Endpoint for a model hosted on the Hugging Face Hub
    llm = HuggingFaceEndpoint(
        repo_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
        task="text-generation",
    )

    model = ChatHuggingFace(llm=llm)
    result = model.invoke("What is the capital of India")
    print(result.content)

This raises an error. I tried several things around it, but nothing worked.

Here is the error log:
    Traceback (most recent call last):
      File "C:\Users\SS\Desktop\Camp_langchain_models\2.ChatModels\2_chatmodel_hf_api.py", line 13, in <module>
        result = model.invoke("What is the capital of India")
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\langchain_core\language_models\chat_models.py", line 370, in invoke
        self.generate_prompt(
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\langchain_core\language_models\chat_models.py", line 947, in generate_prompt
        return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\langchain_core\language_models\chat_models.py", line 766, in generate
        self._generate_with_cache(
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\langchain_core\language_models\chat_models.py", line 1012, in _generate_with_cache
        result = self._generate(
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\langchain_huggingface\chat_models\huggingface.py", line 574, in _generate
        answer = self.llm.client.chat_completion(messages=message_dicts, **params)
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\huggingface_hub\inference\_client.py", line 886, in chat_completion
        provider_helper = get_provider_helper(
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\huggingface_hub\inference\_providers\__init__.py", line 165, in get_provider_helper
        provider = next(iter(provider_mapping))
    StopIteration

I am new to this. Any guidance would be much appreciated. Thank you.


I think LangChain has not yet caught up with recent changes to Hugging Face's inference specifications.

In the meantime, one possible workaround is to downgrade your huggingface-hub version to 0.27.1 or below (quote the specifier so your shell does not interpret the "<" character):

    pip install "huggingface_hub<=0.27.1"
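
You can confirm which version is actually installed afterwards with:

    python -c "import huggingface_hub; print(huggingface_hub.__version__)"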

I am also facing a similar issue. Please let me know if you found a solution.


    pip install langchain-huggingface langchain

    from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint

    # Route the request through an Inference Provider that serves this model
    llm = HuggingFaceEndpoint(
        repo_id="deepseek-ai/DeepSeek-R1",
        provider="together",
    )

    model = ChatHuggingFace(llm=llm)
    result = model.invoke("What is the capital of India")
    print(result.content)

This works for me with the following setup:

    $ pip freeze | grep huggingface
    huggingface-hub==0.31.1
    langchain-huggingface==0.2.0
    $ pip freeze | grep langchain
    langchain==0.3.25
    langchain-core==0.3.59
    langchain-huggingface==0.2.0
    langchain-text-splitters==0.3.8
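
If you want to sanity-check the provider routing without LangChain in the middle, here is a minimal sketch that calls the Hub client directly (it assumes huggingface-hub >= 0.28 and a valid token in the HF_TOKEN environment variable):

    from huggingface_hub import InferenceClient

    # Same model/provider pair as above, called without the LangChain wrapper
    client = InferenceClient(provider="together")
    response = client.chat_completion(
        model="deepseek-ai/DeepSeek-R1",
        messages=[{"role": "user", "content": "What is the capital of India"}],
    )
    print(response.choices[0].message.content)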

Please note the following regarding TinyLlama/TinyLlama-1.1B-Chat-v1.0:

This model isn't deployed by any Inference Provider. That is why the original snippet fails: the provider mapping for the model comes back empty, and next(iter(provider_mapping)) raises StopIteration.
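
You can check this yourself before wiring up LangChain. A small sketch, assuming a recent huggingface_hub (the expand parameter and the inference_provider_mapping attribute are the assumptions here):

    from huggingface_hub import model_info

    # Ask the Hub which Inference Providers serve this model; empty means none
    info = model_info(
        "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
        expand=["inferenceProviderMapping"],
    )
    print(info.inference_provider_mapping)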


Thank you @mahmutc. This code snippet worked for me.


The snippet by mahmutc above worked for me as well.


This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.