I want to train a T5-like model, but I want it to output different pieces of information from the same document representation created by the encoder. At prediction time I would feed a partial output to the decoder ("name: ", "age: ", etc.) and start autoregressive generation from there. But how do I train such a model?
Some examples:
T5:
Input: “name: I am George. I am 34 years old”. Output: “George”.
Input: “age: I am George. I am 34 years old”. Output: “34”.
Inverse T5:
Input: “I am George. I am 34 years old”. Possible outputs: “name: George”, “age: 34”.
As a training dataset, the inverse T5 setup would look something like this:
Input: “I am George. I am 34 years old”. Output: “name: George”.
Input: “I am George. I am 34 years old”. Output: “age: 34”.
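To make the data construction concrete, here is a sketch of how I would build such pairs (plain Python; the field names and helper are just my example, not an existing API):

```python
def make_inverse_t5_examples(text, fields):
    """Build one (input, output) training pair per field,
    prefixing each target with its field name."""
    return [(text, f"{name}: {value}") for name, value in fields.items()]

pairs = make_inverse_t5_examples(
    "I am George. I am 34 years old",
    {"name": "George", "age": "34"},
)
# One training example per field, all sharing the same input text:
# ("I am George. I am 34 years old", "name: George")
# ("I am George. I am 34 years old", "age: 34")
for inp, out in pairs:
    print(f"Input: {inp!r} -> Output: {out!r}")
```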
A trivial solution would be to train on such a dataset as-is and only feed the partial outputs ("name: ", "age: ") at prediction time. I have doubts, though, that this would lead to good results.
A better solution would be to also feed prefixes to the decoder at training time. Can I do that?
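What I imagine this would look like, as a framework-agnostic sketch with made-up token IDs: the decoder is teacher-forced on prefix + target, but the prefix positions are excluded from the loss, assuming the usual ignore index (-100 in PyTorch cross-entropy and Hugging Face seq2seq models):

```python
IGNORE_INDEX = -100  # label value that cross-entropy loss is assumed to skip

def build_decoder_example(prefix_ids, target_ids, start_id=0):
    """Teacher-force the decoder on prefix + target, but compute the
    loss only on the target continuation, not on the fed-in prefix."""
    seq = prefix_ids + target_ids
    # Shift right for teacher forcing: position i predicts seq[i].
    decoder_input_ids = [start_id] + seq[:-1]
    # Mask the prefix positions out of the loss.
    labels = [IGNORE_INDEX] * len(prefix_ids) + target_ids
    return decoder_input_ids, labels

# Hypothetical token IDs: [101, 102] for "name: ", [345] for "George"
inputs, labels = build_decoder_example([101, 102], [345])
print(inputs)  # [0, 101, 102]
print(labels)  # [-100, -100, 345]
```

This way the decoder sees the prefix at training time exactly as it will at prediction time, and the gradient only flows through the tokens the model is actually asked to generate.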