Facing an issue using a model hosted on Hugging Face servers and talking to it with an API key

I am trying to create a simple LangChain text-generation app that uses the Hugging Face API to communicate with models hosted on Hugging Face servers.

I created a ".env" file and stored my key in the variable "HUGGINGFACEHUB_API_TOKEN". I also checked that the API token is valid.
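
For reference, the ".env" file contains a single line like this (the value shown is a placeholder, not a real token):

    HUGGINGFACEHUB_API_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxx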

After that, I tried running this code snippet:

    from dotenv import load_dotenv
    from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint

    # Load HUGGINGFACEHUB_API_TOKEN from the .env file into the environment
    load_dotenv()

    # Endpoint for a model hosted on the Hugging Face Hub
    llm = HuggingFaceEndpoint(
        repo_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
        task="text-generation",
    )

    model = ChatHuggingFace(llm=llm)
    result = model.invoke("What is the capital of India")
    print(result.content)

This raises an error. I tried several things around it, but nothing worked.

Here is the error log:
    Traceback (most recent call last):
      File "C:\Users\SS\Desktop\Camp_langchain_models\2.ChatModels\2_chatmodel_hf_api.py", line 13, in <module>
        result = model.invoke("What is the capital of India")
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\langchain_core\language_models\chat_models.py", line 370, in invoke
        self.generate_prompt(
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\langchain_core\language_models\chat_models.py", line 947, in generate_prompt
        return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\langchain_core\language_models\chat_models.py", line 766, in generate
        self._generate_with_cache(
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\langchain_core\language_models\chat_models.py", line 1012, in _generate_with_cache
        result = self._generate(
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\langchain_huggingface\chat_models\huggingface.py", line 574, in _generate
        answer = self.llm.client.chat_completion(messages=message_dicts, **params)
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\huggingface_hub\inference\_client.py", line 886, in chat_completion
        provider_helper = get_provider_helper(
      File "C:\Users\SS\Desktop\Camp_langchain_models\venv\Lib\site-packages\huggingface_hub\inference\_providers\__init__.py", line 165, in get_provider_helper
        provider = next(iter(provider_mapping))
    StopIteration

I am new to this. Any guidance would be much appreciated. Thank you.


I think LangChain has not yet caught up with recent changes to Hugging Face's inference specifications.

In the meantime, one possible workaround is to downgrade your huggingface-hub version to 0.27.1 or below (quote the specifier so your shell does not interpret the "<" character):

    pip install "huggingface_hub<=0.27.1"
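
You can confirm which version is actually installed afterwards with:

    python -c "import huggingface_hub; print(huggingface_hub.__version__)"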

I am also facing a similar issue. Please let me know if you found a solution.


    pip install langchain-huggingface langchain

    from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint

    # Route the request through an Inference Provider that serves this model
    llm = HuggingFaceEndpoint(
        repo_id="deepseek-ai/DeepSeek-R1",
        provider="together",
    )

    model = ChatHuggingFace(llm=llm)
    result = model.invoke("What is the capital of India")
    print(result.content)

This works for me with the following setup:

    $ pip freeze | grep huggingface
    huggingface-hub==0.31.1
    langchain-huggingface==0.2.0
    $ pip freeze | grep langchain
    langchain==0.3.25
    langchain-core==0.3.59
    langchain-huggingface==0.2.0
    langchain-text-splitters==0.3.8
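
If you want to sanity-check the provider routing without LangChain in the middle, here is a minimal sketch that calls the Hub client directly (it assumes huggingface-hub >= 0.28 and a valid token in the HF_TOKEN environment variable):

    from huggingface_hub import InferenceClient

    # Same model/provider pair as above, called without the LangChain wrapper
    client = InferenceClient(provider="together")
    response = client.chat_completion(
        model="deepseek-ai/DeepSeek-R1",
        messages=[{"role": "user", "content": "What is the capital of India"}],
    )
    print(response.choices[0].message.content)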

Please note the following regarding TinyLlama/TinyLlama-1.1B-Chat-v1.0:

This model isn't deployed by any Inference Provider. That is why the original snippet fails: the provider mapping for the model comes back empty, and next(iter(provider_mapping)) raises StopIteration.
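
You can check this yourself before wiring up LangChain. A small sketch, assuming a recent huggingface_hub (the expand parameter and the inference_provider_mapping attribute are the assumptions here):

    from huggingface_hub import model_info

    # Ask the Hub which Inference Providers serve this model; empty means none
    info = model_info(
        "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
        expand=["inferenceProviderMapping"],
    )
    print(info.inference_provider_mapping)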


Thank you @mahmutc. This code snippet worked for me.


The snippet by mahmutc above worked for me as well.


This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.