Hello
Here you can find a neat tutorial on how to use Hugging Face models with TF Serving. As you guessed, `instances` holds the examples you want your model to run inference on:
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")  # use the tokenizer that matches your served model
sentence = "I love this movie!"  # placeholder input
batch = tokenizer(sentence)  # BatchEncoding: input_ids, attention_mask, ...
batch = dict(batch)          # plain dict so it serializes to JSON
batch = [batch]              # TF Serving expects a list of examples
input_data = {"instances": batch}
```
By the way, your payload works just fine with the Inference API. My guess is that you could put your inputs under the `instances` key and it would work (if it doesn't, maybe try it as a list). Something like:
batch = [{"inputs": {
"past_user_inputs": ["Which movie is the best ?"],
"generated_responses": ["It's Die Hard for sure."],
"text": "Can you explain why ?"
}}]
input_data = {"instances": batch}
Let me know if it doesn’t work.