My task is to generate a list of options and a story, given an intro.
An option is a sentence starting with a special token "<|option|>".
An instance looks something like "intro <|endofintro|> <|option|> option1 <|option|> option2 <|endofoption|> story <|endofstory|>".
I need a dynamic attention mask because:
(1) I don't want later options to attend to earlier options, so I should mask all the previous options. For example, I could set the attention mask to 0 at the positions of all the previous options.
(2) However, the story should attend to all the options, so for the story I should keep the attention mask at 1 at all the option positions.
During generation I could implement this simply by generating one option at a time and then, once all the options are generated, modifying the attention mask before generating the story.
But I don't know how to do that during training.
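To make rules (1) and (2) concrete, here is a minimal sketch of what a training-time mask might look like if it is built as a 2D (query x key) matrix per example instead of a 1D vector. It assumes the model can accept such a per-example 2D mask, which I have not verified for any particular library. The helper names `segment_ids` and `build_mask` are hypothetical, the sequence is shown as whole-token strings rather than real tokenizer output, and the spelling `<|endofoption|>` is assumed.

```python
# Hypothetical special tokens; the exact spellings are assumptions.
OPTION = "<|option|>"
END_OPTIONS = "<|endofoption|>"


def segment_ids(tokens):
    """Label each token: 0 = intro, 1..k = k-th option, -1 = story part.

    The <|option|> token is counted as part of the option it opens, and
    <|endofoption|> is counted as part of the story part.
    """
    ids = []
    current = 0
    n_options = 0
    for tok in tokens:
        if tok == OPTION:
            n_options += 1
            current = n_options
        elif tok == END_OPTIONS:
            current = -1
        ids.append(current)
    return ids


def build_mask(tokens):
    """Return an n x n list of 0/1 where mask[q][k] == 1 means q may attend to k."""
    seg = segment_ids(tokens)
    n = len(tokens)
    mask = [[0] * n for _ in range(n)]
    for q in range(n):
        for k in range(q + 1):  # causal: only the current and earlier positions
            story_query = seg[q] == -1    # rule (2): the story sees everything before it
            intro_key = seg[k] == 0       # the intro is visible to every later token
            same_segment = seg[k] == seg[q]  # rule (1): an option sees only itself
            if story_query or intro_key or same_segment:
                mask[q][k] = 1
    return mask


if __name__ == "__main__":
    example = ["intro", "<|endofintro|>", OPTION, "a", OPTION, "b",
               END_OPTIONS, "story", "<|endofstory|>"]
    for token, row in zip(example, build_mask(example)):
        print(f"{token:16s} {row}")
```

With a mask like this, a single forward pass over the whole instance would cover both the option part and the story part, so nothing has to change between training steps. The nested list can be turned into a boolean tensor for the framework in use; how it is then passed to the model, and whether position ids also need adjusting, depends on the implementation.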