I would like to fine-tune a T5 model for sequence classification (specifically sentiment classification). However, all the tutorials I have found cover seq2seq tasks such as text summarization, like the one below. I understand why it uses the ROUGE score as the evaluation metric and the AutoModelForSeq2SeqLM class, since summarization is a seq2seq task. I think I need to change my loss/metric and also replace AutoModelForSeq2SeqLM with something else? Any suggestions/ideas on how to proceed?
You can check out this notebook to fine-tune T5 on sequence classification (not with a classification head the way encoder-only models do, but with conditional generation). In case you haven't already, I'd suggest you try encoder-only models with classification heads and see how they perform. It's a simpler approach that tends to work better for classification, IMHO.
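To make the conditional-generation approach concrete, here is a minimal sketch of how it can be set up with the transformers library. The idea is to turn each class label into a short text target (e.g. "positive"/"negative") and train T5 with its usual token-level cross-entropy loss; at inference time you generate text and map it back to a class id. The label strings, the "sst2 sentence:" task prefix, and the helper function names are my own illustrative choices, not from the notebook:

```python
# Sketch: sentiment classification framed as text-to-text for T5.
# The label wording and task prefix here are assumptions, not a fixed API.

LABEL2TEXT = {0: "negative", 1: "positive"}
TEXT2LABEL = {v: k for k, v in LABEL2TEXT.items()}

def make_example(text, label):
    """Build a (source, target) text pair for T5 from a labeled example."""
    return {
        # T5 conventionally gets a task prefix on the input.
        "input_text": "sst2 sentence: " + text,
        # The target is the label verbalized as text, not a class index.
        "target_text": LABEL2TEXT[label],
    }

def decode_prediction(generated_text):
    """Map generated text back to a class id; unknown strings become -1."""
    return TEXT2LABEL.get(generated_text.strip().lower(), -1)

def loss_and_prediction(text, label, model_name="t5-small"):
    """Illustrative training/inference step (requires downloading a checkpoint)."""
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    ex = make_example(text, label)
    enc = tokenizer(ex["input_text"], return_tensors="pt")
    labels = tokenizer(ex["target_text"], return_tensors="pt").input_ids

    # Passing `labels` makes the model return the standard seq2seq
    # cross-entropy loss, so no custom cost function is needed.
    loss = model(**enc, labels=labels).loss

    out = model.generate(**enc, max_new_tokens=3)
    pred = decode_prediction(tokenizer.decode(out[0], skip_special_tokens=True))
    return loss, pred
```

With this framing you keep AutoModelForSeq2SeqLM (or T5ForConditionalGeneration) as-is; only the targets change from summaries to label words, and for evaluation you would compute accuracy on the decoded predictions instead of ROUGE.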
Thanks @merve, I have tried BERT as an encoder-only model, and the accuracy is fine. However, I want to see whether I can do better with a fine-tuned T5.
@dammy how did your experiment with T5 fine-tuning on sentiment classification go?