The Blenderbot paper describes a “pretrained generative” model trained on the pushshift.io Reddit dataset, which it also refers to as “pushshift.io Reddit Generative”.
The authors then fine-tune this model on the “BST tasks” (ConvAI2, Empathetic Dialogues, Wizard of Wikipedia, Blended Skill Talk) and call the resulting model “BST Generative”.
Are the facebook/blenderbot-xxx checkpoints the “pretrained generative” model or the “BST Generative” model?