Hi All,
I am having some trouble understanding how to use conversational models with the HF Inference API.
The documentation gives a curl example as:
curl https://api-inference.huggingface.co/models/microsoft/DialoGPT-large \
-X POST \
-d '{"inputs": {"past_user_inputs": ["Which movie is the best ?"], "generated_responses": ["It is Die Hard for sure."], "text":"Can you explain why ?"}}' \
-H "Authorization: Bearer ${HF_API_TOKEN}"
However, running this command gives the following output:
curl -s -H 'Content-Type: application/json' https://api-inference.huggingface.co/models/microsoft/DialoGPT-large \
-X POST \
-d '{"inputs": {"past_user_inputs": ["Which movie is the best ?"], "generated_responses": ["It is Die Hard for sure."], "text":"Can you explain why ?"}}' \
-H "Authorization: ${HUGGINGFACE_AUTH_HEADER}"
Failed to deserialize the JSON body into the target type: inputs: invalid type: map, expected a string at line 1 column 11
The docs state that three key-value pairs are needed for the input:
inputs (required)
text (required) The last input from the user in the conversation.
generated_responses A list of strings corresponding to the earlier replies from the model.
past_user_inputs A list of strings corresponding to the earlier replies from the user. Should be the same length as generated_responses.
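Putting that together, the documented payload is the same JSON as in the curl example above, just pretty-printed for readability:

{
  "inputs": {
    "past_user_inputs": ["Which movie is the best ?"],
    "generated_responses": ["It is Die Hard for sure."],
    "text": "Can you explain why ?"
  }
}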
However, it seems that what is actually required is:
inputs = (some string)
I have tried the inputs-as-map approach with some of the other conversational models, and they all reply saying that inputs must be a string and not a map. For example:
curl -H 'Content-Type: application/json' https://api-inference.huggingface.co/models/facebook/blenderbot-400M-distill \
-X POST \
-d '{"inputs": {"past_user_inputs": ["Which movie is the best ?"], "generated_responses": ["It is Die Hard for sure."], "text":"Can you explain why ?"}}' \
-H "Authorization: ${HUGGINGFACE_AUTH_HEADER}"
{"error":" `args[0]`: {'past_user_inputs': ['Which movie is the best ?'], 'generated_responses': ['It is Die Hard for sure.'], 'text': 'Can you explain why ?'} have the wrong format. The
should be either of type `str` or type `list`","warnings":["There was an inference error: `args[0]`: {'past_user_inputs': ['Which movie is the best ?'], 'generated_responses': ['It is Di
e Hard for sure.'], 'text': 'Can you explain why ?'} have the wrong format. The should be either of type `str` or type `list`"]}
curl -H 'Content-Type: application/json' https://api-inference.huggingface.co/models/meta-llama/Llama-2-7b-chat-hf \
-X POST \
-d '{"inputs": {"past_user_inputs": ["Which movie is the best ?"], "generated_responses": ["It is Die Hard for sure."], "text":"Can you explain why ?"}}' \
-H "Authorization: ${HUGGINGFACE_AUTH_HEADER}"
Failed to deserialize the JSON body into the target type: inputs: invalid type: map, expected a string at line 1 column 11
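Based on those error messages, the only payload these endpoints seem willing to accept is a plain string (or a list of strings), so presumably something like the following, though this loses the conversation structure entirely (this is my own guess from the errors, not from the docs):

curl https://api-inference.huggingface.co/models/microsoft/DialoGPT-large \
-X POST \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${HF_API_TOKEN}" \
-d '{"inputs": "Can you explain why ?"}'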
Can anyone explain to me how this Inference API for conversational models is meant to work?
Another source of confusion: if I go to a conversational model's page on Hugging Face, I can see the chatbot widget, and if I start a conversation I can capture the POST requests to the model in the developer tools.
When I did this, I noticed it uses a different endpoint from the one detailed in the Inference API docs,
instead of:
https://api-inference.huggingface.co/models/microsoft/DialoGPT-large
it uses:
https://api-inference.huggingface.co/models/microsoft/DialoGPT-large/v1/chat/completions
and instead of a request structured like this:
{"inputs": {"past_user_inputs": ["Which movie is the best ?"], "generated_responses": ["It is Die Hard for sure."], "text":"Can you explain why ?"}}
it uses one like this:
{"messages":[{"role":"user","content":"Which movie is the best"},{"role":"assistant","content":"Boondocks sequel?"},{"role":"user","content":"Can you explain why ?"}],"
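(the capture was truncated there, so I can't see what follows the messages array). Assuming that endpoint follows the usual OpenAI-style chat completions schema, a complete request in that shape would presumably look something like this; the "model" and "max_tokens" fields are my guesses, not from the capture:

curl https://api-inference.huggingface.co/models/microsoft/DialoGPT-large/v1/chat/completions \
-X POST \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${HF_API_TOKEN}" \
-d '{"model": "microsoft/DialoGPT-large", "messages": [{"role": "user", "content": "Which movie is the best ?"}, {"role": "assistant", "content": "It is Die Hard for sure."}, {"role": "user", "content": "Can you explain why ?"}], "max_tokens": 100}'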
Can anyone shed some light on the correct / official way to consume a conversational model with the Inference API?