How does the API inference work on models such as Blenderbot?

I assume models like Blenderbot need to look at prior inputs and outputs in order to stay consistent. How does the Inference API provide that context to the model?

Hey, I’m dealing with the same subject.

As far as I understand, there is a way to provide the previous turns of the conversation as context.
The details are here:
https://api-inference.huggingface.co/docs/python/html/detailed_parameters.html#conversational-task
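For reference, a minimal sketch of what that looks like: the conversational task accepts `past_user_inputs` and `generated_responses` alongside the new `text`, so the model sees the history on each request. The model name and token below are placeholders; this uses only the standard library.

```python
# Sketch of a conversational-task request (assumptions: model name and
# token are placeholders; payload shape follows the detailed-parameters docs).
import json
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/facebook/blenderbot-400M-distill"
API_TOKEN = "hf_..."  # placeholder: your own Hugging Face API token

def build_payload(past_user_inputs, generated_responses, text):
    # Prior turns travel with every request, since the API itself is stateless.
    return {
        "inputs": {
            "past_user_inputs": past_user_inputs,
            "generated_responses": generated_responses,
            "text": text,
        }
    }

def query(payload):
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

payload = build_payload(
    past_user_inputs=["Which movie is the best?"],
    generated_responses=["It's Die Hard for sure."],
    text="Can you explain why?",
)
# result = query(payload)  # uncomment with a valid token to actually call the API
```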

However, when I tried it on the 1B model, I got the following error:
“Cutting history off because it’s too long (36 > 28) for underlying model”

I don’t know if this is a limitation of the model or of the API.
If you find a solution, let me know.

Same here. I attempted to use the template from the conversational task with microsoft/DialoGPT to provide past_user_inputs and generated_responses, but it is not working.

Some guidance on this would be appreciated.

It works. Thanks

Yeah, it is working.