Inverse T5 with output (instead of input) prefix

I want to train a T5-like model that outputs different pieces of information from the same document representation created by the encoder. At prediction time I would feed partial outputs ("name", "age", etc.) to the decoder and start autoregressive generation from there. But how do I train such a model?

Some examples:

T5:
Input: “name: I am George. I am 34 years old”. Output: “George”.
Input: “age: I am George. I am 34 years old”. Output: “34”.

Inverse T5:
Input: “I am George. I am 34 years old”. Possible outputs: “name: George”, “age: 34”.

A training dataset for the inverse T5 would look something like this:
Input: “I am George. I am 34 years old”. Output: “name: George”.
Input: “I am George. I am 34 years old”. Output: “age: 34”.
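To make the format concrete, here is a minimal sketch of how such pairs could be built from an annotated document. `make_pairs` and its arguments are hypothetical names, not part of any library:

```python
# Hypothetical helper: expand one annotated document into several
# (input, output) training pairs, one per field ("inverse T5" format).
def make_pairs(document, fields):
    return [(document, f"{name}: {value}") for name, value in fields.items()]

pairs = make_pairs(
    "I am George. I am 34 years old",
    {"name": "George", "age": "34"},
)
# pairs == [("I am George. I am 34 years old", "name: George"),
#           ("I am George. I am 34 years old", "age: 34")]
```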

A trivial solution would be to train on such a dataset and feed the partial outputs ("name: ", "age: ") only when predicting. I doubt, though, that this would lead to good results.
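Mechanically, "feed a partial output and continue" is just greedy decoding started from a forced decoder prefix (in Hugging Face `transformers`, this would correspond to passing `decoder_input_ids` to `generate()`). A toy, model-free sketch of the loop, where `next_token` stands in for the decoder's forward pass:

```python
# Toy sketch of "feed a partial output, then continue autoregressively".
# next_token is a stand-in for the decoder: it maps the tokens generated
# so far to the next token (a real model would run a forward pass here).
def generate(next_token, prefix, max_len=10, eos="</s>"):
    out = list(prefix)          # forced decoder prefix, e.g. ["name", ":"]
    while len(out) < max_len:
        tok = next_token(out)
        if tok == eos:
            break
        out.append(tok)
    return out

# Stand-in "model": answers "George" after a name prefix, "34" after age.
table = {("name", ":"): "George", ("age", ":"): "34"}
def next_token(so_far):
    return table.get(tuple(so_far), "</s>")

print(generate(next_token, ["name", ":"]))  # ['name', ':', 'George']
print(generate(next_token, ["age", ":"]))   # ['age', ':', '34']
```

The prefix thus acts as the "query" that selects which field the decoder completes.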

A better solution would be to also feed prefixes to the decoder at training time. Can I do that?
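Note that with standard teacher forcing this happens automatically: the decoder input is the full target shifted right, so while predicting "George" the decoder already sees "name: " as its prefix. If you additionally want the loss computed only on the value (not on the prefix tokens), you can mask the prefix positions with -100, the ignore index used by PyTorch's cross-entropy and by Hugging Face's label convention. A minimal sketch with toy token ids:

```python
# Sketch: with teacher forcing, the decoder already conditions on the
# prefix ("name", ":") while predicting the rest, so training on the
# full target "name: George" is enough. To skip the loss on the prefix,
# replace those positions with -100 (the ignore index of PyTorch's
# cross-entropy loss, also used for labels in Hugging Face models).
def make_labels(target_ids, prefix_len):
    return [-100] * prefix_len + target_ids[prefix_len:]

target = [42, 7, 1504]          # toy ids for "name", ":", "George"
print(make_labels(target, 2))   # [-100, -100, 1504]
```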


Hi @marton-avrios

Your first input format might work; at prediction time, instead of feeding a partial output, you could ask the model to generate the whole text. Passing a partial output is also possible.

I applied T5 to a somewhat similar problem, and it gave surprisingly good results.


That’s good to hear! But I am not sure I understand. If I do not feed partial outputs at prediction time, how do I control which piece of information is extracted from the document? Naively, I would expect the output to always be the most frequent type of information in the dataset (age, name, or whatever). For example, it would always be “name: George” if some training documents contain a name but no age, making name the most likely field to generate.