How does the API inference work on models such as Blenderbot?

I assume models like Blenderbot need to look at prior inputs and outputs in order to stay consistent across turns. How does the inference API provide that history to the model?

Hey, I’m dealing with the same subject.

As far as I understand, there is a way to provide the context of the previous text in the conversation.
The details are here:
https://api-inference.huggingface.co/docs/python/html/detailed_parameters.html#conversational-task
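From my reading of that page, you pass the history alongside the new message in the request body. Here is a minimal sketch of how I build the payload; the model id and token are placeholders, and `build_payload` is just a helper I wrote, not part of any library:

```python
import json

# Placeholder endpoint/model id -- swap in the model you are actually using.
API_URL = "https://api-inference.huggingface.co/models/facebook/blenderbot-1B-distill"

def build_payload(past_user_inputs, generated_responses, new_text):
    """Package the conversation history in the conversational-task shape."""
    return {
        "inputs": {
            "past_user_inputs": past_user_inputs,        # earlier user turns
            "generated_responses": generated_responses,  # earlier model replies
            "text": new_text,                            # the new user turn
        }
    }

payload = build_payload(
    ["Which movie is the best?"],
    ["It's Die Hard for sure."],
    "Can you explain why?",
)
print(json.dumps(payload))
# You would then POST it with your token, e.g.:
# requests.post(API_URL, headers={"Authorization": f"Bearer {API_TOKEN}"}, json=payload)
```

So each request resends the whole conversation so far; the API itself is stateless as far as I can tell.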

However, when I tried it with the 1B model, I got the following error:
“Cutting history off because it’s too long (36 > 28) for underlying model”

I don’t know if this is a limitation of the model or the API.
If you find a solution, let me know.
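One workaround I've been experimenting with (not from the docs, just a client-side idea) is to trim the history myself before sending it, so the API never hits the model's length limit. `max_turns` here is a guess you would tune for the model:

```python
def trim_history(past_user_inputs, generated_responses, max_turns=5):
    """Keep only the most recent turns of the conversation.

    The underlying model has a fixed context length, so dropping the
    oldest turns client-side avoids the server truncating arbitrarily.
    """
    return past_user_inputs[-max_turns:], generated_responses[-max_turns:]

# Example: a 10-turn conversation trimmed to the last 5 turns.
users = [f"user turn {i}" for i in range(10)]
bots = [f"bot turn {i}" for i in range(10)]
users, bots = trim_history(users, bots)
print(len(users), users[0])  # 5 "user turn 5"
```

The trade-off is that the model forgets anything older than `max_turns`, but at least you control what gets dropped.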