How to Pass the Conversation as Input in the Mistral Instruct Inference API

Here is the curl command, which only accepts a single input, but I need to pass a chat conversation as input in a structure like the following -

Human: {past-message}
AI: {past-reply}
Human: {new-message}

Curl command for Mistral Instruct -

curl https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.1 \
	-X POST \
	-d '{"inputs": "Can you please let us know more details about your "}' \
	-H 'Content-Type: application/json' \
	-H "Authorization: Bearer xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

Is it possible?

Hi @namitgupta,

If you want to run a chat conversation, you need to build your input string following a chat template for the model.
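
For the structure in your example, the built-up input string for Mistral-7B-Instruct-v0.1 would look roughly like this (Human turns go inside [INST] ... [/INST], AI replies follow as plain text and are closed with </s>):

<s>[INST] {past-message} [/INST]
{past-reply}</s>
[INST] {new-message} [/INST]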

Here is a simple JavaScript snippet for you; I'm sure you can adapt it to curl (a rough curl version follows the snippet).

fetch(
  "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.1",
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer xxxxxx",
    },
    body: JSON.stringify({
      inputs:
        `<s>[INST] What is your favourite condiment? [/INST]
My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.</s>
[INST] And what do you think about it? [/INST]`,
      parameters: { max_new_tokens: 500 },
    }),
  }
)
  .then((response) => response.json())
  .then((data) => console.log(data[0].generated_text));
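
For completeness, the same request adapted to curl could look something like this (the assistant turn is shortened a bit so it fits cleanly inside the single-quoted JSON string; replace the token placeholder with your own):

curl https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.1 \
	-X POST \
	-H 'Content-Type: application/json' \
	-H 'Authorization: Bearer xxxxxx' \
	-d '{"inputs": "<s>[INST] What is your favourite condiment? [/INST]\nMy favorite condiment is ketchup.</s>\n[INST] And what do you think about it? [/INST]", "parameters": {"max_new_tokens": 500}}'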

You can see a full Python example using our InferenceClient

Thanks @radames. This works, but the response also includes the prompt in the generated text. How do I fix this?

[{"generated_text":"<s>[INST] What is your favourite condiment? [/INST]\nMy favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.</s>\n[INST] And what do you think about it? [/INST] I think ketchup is a great condiment. It adds a nice tangy flavor to many dishes and can be used in a variety of ways, such as on burgers, fries, and eggs. It's also a popular choice for many people, so it's always nice to have it on hand."}]

Sorry, yes, you can add return_full_text to the parameters:

parameters: { max_new_tokens: 100, return_full_text: false },
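
If you are sending the request with curl instead, the same parameter goes into the JSON body, for example:

"parameters": { "max_new_tokens": 100, "return_full_text": false }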

You can see all available parameters here
