This question is conceptual in nature.
Suppose I’m working on a text classification problem with 3 labels. To make the problem more concrete, let’s say I’m working on sentiment analysis with ground-truth labels positive, neutral, and negative. I am measuring accuracy and macro-F1.
Now I’d like to make another data set with 5 ground-truth labels: very positive, positive, neutral, negative, and very negative. Intuitively, I would think that the 5-label classification problem is more difficult than the 3-label problem, but the only “proof” I can think of is that a random guess is correct only 1/5 of the time with 5 labels, whereas a random guess is correct 1/3 of the time with 3 labels.
Is there a more formal machine learning argument for why a 5-label problem is more difficult than a 3-label problem? More generally, how does an N-label problem compare to an M-label problem where M > N?
I’m willing to brush up on Vapnik–Chervonenkis theory if that’s needed (hopefully not).