I want to train a T5-like model, but I want it to output different pieces of information from the same document representation created by the encoder. At prediction time I would feed a partial output to the decoder ("name: ", "age: ", etc.) and start autoregressive generation from there. But how do I train such a model?
Some examples:
T5:
Input: “name: I am George. I am 34 years old”. Output: “George”.
Input: “age: I am George. I am 34 years old”. Output: “34”.
Inverse T5:
Input: “I am George. I am 34 years old”. Possible outputs: “name: George”, “age: 34”.
As a training dataset, the inverse T5 setup would look something like this:
Input: “I am George. I am 34 years old”. Output: “name: George”.
Input: “I am George. I am 34 years old”. Output: “age: 34”.
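To make the data construction concrete, here is a sketch of how I would build such pairs (plain Python; the field names and helper are just my example, not an existing API):

```python
def make_inverse_t5_examples(text, fields):
    """Build one (input, output) training pair per field,
    prefixing each target with its field name."""
    return [(text, f"{name}: {value}") for name, value in fields.items()]

pairs = make_inverse_t5_examples(
    "I am George. I am 34 years old",
    {"name": "George", "age": "34"},
)
# One training example per field, all sharing the same input text:
# ("I am George. I am 34 years old", "name: George")
# ("I am George. I am 34 years old", "age: 34")
for inp, out in pairs:
    print(f"Input: {inp!r} -> Output: {out!r}")
```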
A trivial solution would be to train on such a dataset as-is and only feed the partial outputs ("name: ", "age: ") at prediction time. I have doubts, though, that this would lead to good results.
A better solution would be to also feed prefixes to the decoder at training time. Can I do that?
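What I imagine this would look like, as a framework-agnostic sketch with made-up token IDs: the decoder is teacher-forced on prefix + target, but the prefix positions are excluded from the loss, assuming the usual ignore index (-100 in PyTorch cross-entropy and Hugging Face seq2seq models):

```python
IGNORE_INDEX = -100  # label value that cross-entropy loss is assumed to skip

def build_decoder_example(prefix_ids, target_ids, start_id=0):
    """Teacher-force the decoder on prefix + target, but compute the
    loss only on the target continuation, not on the fed-in prefix."""
    seq = prefix_ids + target_ids
    # Shift right for teacher forcing: position i predicts seq[i].
    decoder_input_ids = [start_id] + seq[:-1]
    # Mask the prefix positions out of the loss.
    labels = [IGNORE_INDEX] * len(prefix_ids) + target_ids
    return decoder_input_ids, labels

# Hypothetical token IDs: [101, 102] for "name: ", [345] for "George"
inputs, labels = build_decoder_example([101, 102], [345])
print(inputs)  # [0, 101, 102]
print(labels)  # [-100, -100, 345]
```

This way the decoder sees the prefix at training time exactly as it will at prediction time, and the gradient only flows through the tokens the model is actually asked to generate.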