I am training T5-base to extract short strings (customer complaints) from texts in a semi-extractive manner (it is extractive up to the lemma).
Some texts contain several complaints. What is the best way to format the labels in that case? It was suggested that I format them as follows:
<extra_id_0> Uhv under range<extra_id_1> Vacuum not come on<extra_id_2> Valve clogging<extra_id_3>
The original input text is left unchanged.
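For concreteness, here is a minimal sketch of how targets in this format could be built from a list of labels and split back into labels after generation. The helper names `build_target` and `parse_target` are hypothetical and not part of any library; the sketch assumes each label is preceded by a sentinel and a single space, with one closing sentinel at the end.

```python
import re

def build_target(labels):
    """Join labels into one target string separated by T5 sentinel tokens.

    Hypothetical helper: label i is preceded by <extra_id_i> and a space,
    and the string is closed by <extra_id_n>, where n = len(labels).
    """
    parts = [f"<extra_id_{i}> {label}" for i, label in enumerate(labels)]
    parts.append(f"<extra_id_{len(labels)}>")
    return "".join(parts)

def parse_target(text):
    """Split a sentinel-delimited string back into a list of labels."""
    pieces = re.split(r"<extra_id_\d+>", text)
    # Drop the empty pieces before the first and after the last sentinel.
    return [p.strip() for p in pieces if p.strip()]

labels = ["Uhv under range", "Vacuum not come on", "Valve clogging"]
target = build_target(labels)
recovered = parse_target(target)
```

If the generated ids are decoded with the Hugging Face tokenizer, the sentinels probably need to be kept in the decoded text (for example by not skipping special tokens) for `parse_target` to have something to split on; that is an assumption about the decoding setup, not something stated above.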
However, in almost all cases the model generates only one label. When I do get two labels, it is for inputs like “there is a Uhv under range and Vacuum not come on”, i.e. the model seems to treat “the” as a signal for multiple labels…