I am trying to fine-tune BART for the text infilling task. For example, I want my model to learn “Steve Jobs is the founder of Apple” from “Steve Jobs [MASK] Apple”.
I have three main questions:
(1) Which one should I choose, BartModel or BartForConditionalGeneration?
(2) Can you provide examples of how to use the corresponding API?
(3) How do I compute the loss for the text infilling task?
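For context, here is a minimal sketch of the kind of usage I have in mind, assuming BartForConditionalGeneration is the right class and that passing labels returns the cross-entropy loss. The checkpoint name facebook/bart-base is just a placeholder, and note that BART’s own mask token is `<mask>` rather than `[MASK]`:

```python
import torch
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# BART's infilling mask is "<mask>"; one mask can stand in for a multi-token span.
src = "Steve Jobs <mask> Apple"
tgt = "Steve Jobs is the founder of Apple"

inputs = tokenizer(src, return_tensors="pt")
labels = tokenizer(tgt, return_tensors="pt").input_ids

# With labels supplied, the output carries the token-level
# cross-entropy loss between decoder predictions and the target.
outputs = model(**inputs, labels=labels)
outputs.loss.backward()
```

(For batched training, I believe padding positions in labels should be set to -100 so they are ignored by the loss, but please correct me if that’s wrong.)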
Hi, I’m curious about a couple of things: 1) did you get this model running well, and 2) would this model also work for a more standard “next-token” causal LM?
EDIT: Oh, also, would it be expected to do both MLM and CLM? That is, given an input like “The chicken [MASK] to get”, could it continue past the mask and output something like “The chicken crossed the road to get to the other side”? That would be pretty much ideal for my use case. And if so, how would one go about doing it? I’ve sketched what I mean below.
(Sorry for hijacking your thread, but I’ve been wondering about something like this for a while!)
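To make the question concrete, here is a rough sketch of what I mean, again assuming BartForConditionalGeneration and the placeholder checkpoint facebook/bart-base; I genuinely don’t know whether a pretrained or infilling-fine-tuned BART would actually continue past the masked span like this:

```python
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# The "<mask>" span may expand to several tokens during generation.
inputs = tokenizer("The chicken <mask> to get", return_tensors="pt")

# Beam-search decode; whether the model also extends the sentence
# CLM-style beyond the input presumably depends on its fine-tuning.
ids = model.generate(**inputs, num_beams=4, max_length=40)
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```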