I’m not sure I understand the difference between what you are describing and how the GLUE dataset is handled in the T5 paper.