Hi! I have a graph-to-text generation problem. The ground truth is a paragraph of sentences, and for each sentence I have a linearized graph. I combined the graphs for all sentences in a paragraph into a single linearized input.
para: mix flour salt and carron seeds together. mix in oil with the flour. add lemon juice and water and knead to form a dough. add the fennel seeds coriander seeds salt mango powder and garam masala to the potatoes. mix the sugar cilantro ginger green chili and lime juice to the potatoes. add cumin seeds asafoetida and green peas to the oil in pan. add the potato mixture to the pan and stir. roll the dough out into a thin oval. cut the oval in half and seal the edges together. fill the dough with the filling. seal the samosa shut.
linearized graph: translate Graph to English: :graph0 ( mix (:ARG1 flour:ARG1 salt:ARG1 carron:ARG1 seeds:ARGM-MNR together ) ):graph1 ( mix (:ARG1 oil:ARG2 with:ARG2 the:ARG2 flour ) ):graph2 ( add (:ARG1 lemon:ARG1 juice:ARG1 water:ARGM-PRP to:ARGM-PRP form:ARGM-PRP a:ARGM-PRP dough ) ):graph3 ( add (:ARG1 fennel:ARG1 seeds:ARG1 coriander:ARG1 seeds:ARG1 salt:ARG1 mango:ARG1 powder:ARG1 garam:ARG1 masala:ARG2 to:ARG2 the:ARG2 potatoes ) ):graph4 ( mix (:ARG1 sugar:ARG1 cilantro:ARG1 ginger:ARG1 chili:ARG1 lime:ARG1 juice:ARG2 to:ARG2 the:ARG2 potatoes ) ):graph5 ( add (:ARG1 cumin:ARG1 seeds:ARG1 asafoetida:ARG1 peas (:Attr green ):ARG2 to:ARG2 the:ARG2 oil:ARG2 in:ARG2 pan ) ):graph6 ( add (:ARG1 potato:ARG1 mixture:ARG2 to:ARG2 the:ARG2 pan ) ):graph7 ( roll (:ARG1 dough :ARGM-PRD into :ARGM-PRD a :ARGM-PRD thin :ARGM-PRD oval ) ):graph8 ( cut (:ARG1 oval:ARG2 in:ARG2 half ) seal (:ARG1 edges :ARGM-PRD together ) ):graph9 ( fill (:ARG1 dough:ARG2 with:ARG2 the:ARG2 filling ) ):graph10 ( seal (:ARG1 samosa :ARGM-PRD shut ) )
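To make the combining step concrete, here is a minimal sketch of how per-sentence linearized graphs could be joined into one input in the format shown above. The function name `combine_graphs` and its arguments are hypothetical, chosen only for illustration; the actual preprocessing may differ.

```python
def combine_graphs(sentence_graphs, prefix="translate Graph to English:"):
    """Join per-sentence linearized graphs into one paragraph-level input.

    `sentence_graphs` is a list of strings, one linearized graph per sentence.
    Each graph is wrapped as ":graphN ( ... )" and the pieces are concatenated
    without a separator, mirroring the example input above.
    """
    parts = [f":graph{i} ( {graph} )" for i, graph in enumerate(sentence_graphs)]
    return f"{prefix} " + "".join(parts)


# Two of the sentence graphs from the example paragraph.
graphs = [
    "mix (:ARG1 oil:ARG2 with:ARG2 the:ARG2 flour )",
    "seal (:ARG1 samosa :ARGM-PRD shut )",
]
combined = combine_graphs(graphs)
print(combined)
```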
Model output at test time: mix flour salt carron seeds together. mix oil with the flour. add lemon
Problem: After fine-tuning the pre-trained T5-small model on my dataset, the generated output is very short. My dataset has 1714 paragraphs in total. Even when I train on 1600 of them, the model's output is still very short. Is my dataset too small, or do I need to change some parameter so that the model generates longer outputs?
It works great when I generate one sentence at a time, but I also need graph-to-text generation for whole paragraphs. I’m also using the ‘translate Graph to English:’ prefix for each linearized graph.
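On the parameter side, one thing that may be worth checking is the decoding length limit. This is a sketch under assumptions, not a confirmed diagnosis: it assumes generation goes through Hugging Face `transformers` and `model.generate(...)`, which, as far as I know, stops after roughly 20 tokens when no length setting is passed. The helper `build_generation_kwargs` and the values 256 and 4 are hypothetical and chosen only for illustration.

```python
def build_generation_kwargs(max_new_tokens=256, num_beams=4):
    """Collect decoding settings intended for `model.generate(...)`.

    `max_new_tokens`, `num_beams` and `early_stopping` are generate() arguments
    in Hugging Face transformers; the default values here are illustrative only.
    """
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    return {
        "max_new_tokens": max_new_tokens,  # upper bound on generated tokens
        "num_beams": num_beams,            # beam search width
        "early_stopping": num_beams > 1,   # only relevant for beam search
    }


gen_kwargs = build_generation_kwargs()
print(gen_kwargs)

# Hypothetical usage, assuming `model`, `tokenizer` and `linearized_graph` exist:
# inputs = tokenizer(linearized_graph, return_tensors="pt")
# output_ids = model.generate(**inputs, **gen_kwargs)
# print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

If the paragraph-level input is longer than the tokenizer's maximum input length, truncation on the input side could also drop the later graphs, so that may be worth checking as well.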
Please provide suggestions.