T5 Fine Tuning - Text to Text Generation

I was working on an interesting problem of generating inferences from Excel data. I wrote a Python program to generate rules from the data in the form of RDF triples and am now training a T5-base model on them. With about 10k training pairs of RDF rules and inferences, I was able to get around 80% to 85% test accuracy. I'm using the AdamW optimizer with a learning rate of 1e-5.
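For context, my fine-tuning loop is essentially the standard Hugging Face setup sketched below (the example pair, batch size, and epoch count are illustrative, not my exact configuration):

```python
import torch
from torch.optim import AdamW
from torch.utils.data import DataLoader, Dataset
from transformers import T5ForConditionalGeneration, T5TokenizerFast

# Illustrative pair: linearized RDF rule -> target inference sentence.
# The real run uses ~10k such pairs generated from the Excel data.
train_pairs = [
    ('"Critical" | priority_ticketshare | "23.09%"',
     "Critical priority tickets accounted for 23.09% of total tickets."),
]

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

class TripleDataset(Dataset):
    def __init__(self, pairs):
        self.pairs = pairs
    def __len__(self):
        return len(self.pairs)
    def __getitem__(self, idx):
        src, tgt = self.pairs[idx]
        enc = tokenizer(src, max_length=64, padding="max_length",
                        truncation=True, return_tensors="pt")
        lab = tokenizer(tgt, max_length=64, padding="max_length",
                        truncation=True, return_tensors="pt")
        labels = lab.input_ids.squeeze(0)
        labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss
        return enc.input_ids.squeeze(0), enc.attention_mask.squeeze(0), labels

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
optimizer = AdamW(model.parameters(), lr=1e-5)
loader = DataLoader(TripleDataset(train_pairs), batch_size=8, shuffle=True)

model.train()
for epoch in range(3):
    for input_ids, attention_mask, labels in loader:
        outputs = model(input_ids=input_ids.to(device),
                        attention_mask=attention_mask.to(device),
                        labels=labels.to(device))
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```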
One issue I have seen is that the model does not generalize well to new numbers. For example, if I pass the rule "Critical" | priority_ticketshare | "23.09%" to the model, it returns the inference "Critical priority tickets accounted for 22.09% of total tickets." While the sentence is otherwise correct, the number it has produced is wrong. Any idea how to solve this?

You are expecting the model to behave like a database :-). It will never do that, buddy. Not even GPT-3. Use t5-small for a smaller memory footprint. I guarantee you there won't be much drop in accuracy, and you'll get better inference time.
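Switching is just a checkpoint-name change in your setup; roughly like this (untuned t5-small shown here, you would load your own fine-tuned weights instead):

```python
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

# Drop-in swap: only the checkpoint name changes relative to the t5-base setup
tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small").eval()

rule = '"Critical" | priority_ticketshare | "23.09%"'
inputs = tokenizer(rule, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```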

I will definitely try t5-small. I got this idea from the WebNLG 2017 dataset, where many of the training samples were numerical relations like "Aargas airport | runwayslength | 24.5" and T5-base showed SOTA results. Hence, I thought of applying the same concept to reporting automation.