Hi All,
I am having some trouble understanding how to use conversational models with the HF Inference API.
The documentation gives a curl example as:
curl https://api-inference.huggingface.co/models/microsoft/DialoGPT-large \
-X POST \
-d '{"inputs": {"past_user_inputs": ["Which movie is the best ?"], "generated_responses": ["It is Die Hard for sure."], "text":"Can you explain why ?"}}' \
-H "Authorization: Bearer ${HF_API_TOKEN}"
However, running this command gives the following output:
curl -s -H 'Content-Type: application/json' https://api-inference.huggingface.co/models/microsoft/DialoGPT-large \
-X POST \
-d '{"inputs": {"past_user_inputs": ["Which movie is the best ?"], "generated_responses": ["It is Die Hard for sure."], "text":"Can you explain why ?"}}' \
-H "Authorization: ${HUGGINGFACE_AUTH_HEADER}"
Failed to deserialize the JSON body into the target type: inputs: invalid type: map, expected a string at line 1 column 11
The docs state that three key-value pairs are needed for the input:
inputs (required)
text (required) The last input from the user in the conversation.
generated_responses A list of strings corresponding to the earlier replies from the model.
past_user_inputs A list of strings corresponding to the earlier replies from the user. Should be the same length as generated_responses.
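Putting that together, the documented payload is the same JSON as in the curl example above, just pretty-printed for readability:

{
  "inputs": {
    "past_user_inputs": ["Which movie is the best ?"],
    "generated_responses": ["It is Die Hard for sure."],
    "text": "Can you explain why ?"
  }
}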
However, it seems that what is actually required is:
inputs = (some string)
I have tried the inputs-as-map approach with some of the other conversational models, and they all reply saying that inputs must be a string and not a map. For example:
curl -H 'Content-Type: application/json' https://api-inference.huggingface.co/models/facebook/blenderbot-400M-distill \
-X POST \
-d '{"inputs": {"past_user_inputs": ["Which movie is the best ?"], "generated_responses": ["It is Die Hard for sure."], "text":"Can you explain why ?"}}' \
-H "Authorization: ${HUGGINGFACE_AUTH_HEADER}"
{"error":" `args[0]`: {'past_user_inputs': ['Which movie is the best ?'], 'generated_responses': ['It is Die Hard for sure.'], 'text': 'Can you explain why ?'} have the wrong format. The
should be either of type `str` or type `list`","warnings":["There was an inference error: `args[0]`: {'past_user_inputs': ['Which movie is the best ?'], 'generated_responses': ['It is Di
e Hard for sure.'], 'text': 'Can you explain why ?'} have the wrong format. The should be either of type `str` or type `list`"]}
curl -H 'Content-Type: application/json' https://api-inference.huggingface.co/models/meta-llama/Llama-2-7b-chat-hf \
-X POST \
-d '{"inputs": {"past_user_inputs": ["Which movie is the best ?"], "generated_responses": ["It is Die Hard for sure."], "text":"Can you explain why ?"}}' \
-H "Authorization: ${HUGGINGFACE_AUTH_HEADER}"
Failed to deserialize the JSON body into the target type: inputs: invalid type: map, expected a string at line 1 column 11
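Based on those error messages, the only payload these endpoints seem willing to accept is a plain string (or a list of strings), so presumably something like the following, though this loses the conversation structure entirely (this is my own guess from the errors, not from the docs):

curl https://api-inference.huggingface.co/models/microsoft/DialoGPT-large \
-X POST \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${HF_API_TOKEN}" \
-d '{"inputs": "Can you explain why ?"}'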
Can anyone explain to me how this Inference API for conversational models is meant to work?
Another source of confusion: if I go to a conversational model's page on Hugging Face, I can see the chatbot widget, and if I start a conversation I can capture the POST requests to the model in the developer tools.
When I did this, I noticed it uses a different endpoint from the one detailed in the Inference API docs,
instead of:
https://api-inference.huggingface.co/models/microsoft/DialoGPT-large
it uses:
https://api-inference.huggingface.co/models/microsoft/DialoGPT-large/v1/chat/completions
and instead of a request structured like this:
{"inputs": {"past_user_inputs": ["Which movie is the best ?"], "generated_responses": ["It is Die Hard for sure."], "text":"Can you explain why ?"}}
it uses one like this:
{"messages":[{"role":"user","content":"Which movie is the best"},{"role":"assistant","content":"Boondocks sequel?"},{"role":"user","content":"Can you explain why ?"}],"
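(the capture was truncated there, so I can't see what follows the messages array). Assuming that endpoint follows the usual OpenAI-style chat completions schema, a complete request in that shape would presumably look something like this; the "model" and "max_tokens" fields are my guesses, not from the capture:

curl https://api-inference.huggingface.co/models/microsoft/DialoGPT-large/v1/chat/completions \
-X POST \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${HF_API_TOKEN}" \
-d '{"model": "microsoft/DialoGPT-large", "messages": [{"role": "user", "content": "Which movie is the best ?"}, {"role": "assistant", "content": "It is Die Hard for sure."}, {"role": "user", "content": "Can you explain why ?"}], "max_tokens": 100}'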
Can anyone shed some light on the correct / official way to consume a conversational model with the Inference API?