ByT5 for text classification


I'm facing a problem while trying to fine-tune the ByT5 model for a text classification task.

I’ve tried to use the notebook exploring-T5/t5_fine_tuning.ipynb from the patil-suraj/exploring-T5 GitHub repository by @valhalla to fine-tune ByT5 on a text classification task. When I start from ‘google/byt5-small’, I get really strange results: the model always generates the negative sentiment label (‘n’ in my code).

When I switch to the t5-small pretrained checkpoint, the results are reasonable.
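
One difference between the two checkpoints that may be worth checking (this is an assumption, not something confirmed as the cause here): ByT5 works on raw UTF-8 bytes rather than SentencePiece subwords, so the same text becomes a much longer sequence, and a maximum length that worked for t5-small could truncate most of each input. Below is a minimal pure-Python sketch of ByT5-style byte encoding, assuming ids are byte values shifted by 3 with 0 = pad, 1 = eos, 2 = unk. The helper names `byt5_style_encode` and `byt5_style_decode` are hypothetical and are not part of the transformers library.

```python
# Sketch of ByT5-style byte-level encoding, no model download required.
# Assumption: id = UTF-8 byte value + 3, with 0 = pad, 1 = eos, 2 = unk.
PAD_ID, EOS_ID, UNK_ID = 0, 1, 2
OFFSET = 3  # number of special ids placed before the 256 byte ids


def byt5_style_encode(text, max_length=None):
    """Encode text as byte ids plus a trailing EOS, optionally truncated."""
    ids = [b + OFFSET for b in text.encode("utf-8")]
    if max_length is not None:
        # Leave room for EOS, so the result has at most max_length ids.
        ids = ids[: max_length - 1]
    ids.append(EOS_ID)
    return ids


def byt5_style_decode(ids):
    """Decode byte ids back to text, skipping special and out-of-range ids."""
    data = bytes(i - OFFSET for i in ids if OFFSET <= i < OFFSET + 256)
    return data.decode("utf-8", errors="ignore")


if __name__ == "__main__":
    # A one-character label becomes one byte id followed by EOS.
    print(byt5_style_encode("n"))  # [113, 1]

    # A made-up 600-character input is cut to 512 ids, losing the tail.
    review = "x" * 600
    print(len(byt5_style_encode(review)))                  # 601
    print(len(byt5_style_encode(review, max_length=512)))  # 512
```

If the inputs are being truncated like this, comparing the byte length of a few training examples against the maximum sequence length used in the notebook would show it quickly.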

What am I missing here? Any piece of advice would be much appreciated!


Hello @janyfe. Did you find a solution for this? I am facing the same issue with a multi-class text classification task: among 6 classes, ByT5 always predicts a single class.

Hello @Chandrai! No, unfortunately I wasn’t able to find a solution and gave up on using the ByT5 model.