Correct numeric labels for classification?


This is a simple question, but better safe than sorry! My understanding is that the text classification models in transformers can only deal with integer class labels.

So it’s up to the user to provide a mapping between labels and scores. In the usual example one could have 0 = negative, 1 = neutral, 2 = positive.
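For illustration, here is a minimal sketch of such a mapping in plain Python. The label names and the variable names (`raw_labels`, `label2id`, `id2label`) are just assumptions for the example, not something the library requires.

```python
# Hypothetical string labels; any class names would work the same way.
raw_labels = ["negative", "positive", "neutral", "positive"]

# Sort the unique names so the mapping is reproducible between runs.
class_names = sorted(set(raw_labels))

# Assign consecutive integer ids starting at 0.
label2id = {name: idx for idx, name in enumerate(class_names)}
id2label = {idx: name for name, idx in label2id.items()}

# Integer labels that would be passed to the model during training.
encoded = [label2id[name] for name in raw_labels]

print(label2id)
print(encoded)
```

If I am not mistaken, dictionaries like these can also be passed as `id2label` and `label2id` when loading a model with `AutoModelForSequenceClassification.from_pretrained`, so predictions can be mapped back to names.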

Here is the basic question: do the numeric labels necessarily need to be integers from 0 to N-1 (where N is the number of classes), or can I use any other numbers I like? :sweat_smile:


Yes, I can confirm the labels have to be integers starting at zero. I still wonder what the mathematical reason for that is. Any ideas, @nielsr, by any chance?
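One likely explanation, offered as a sketch rather than an authoritative answer: a classification head outputs one logit per class, and the cross-entropy loss uses the label as a position in that vector of N logits. A label outside 0..N-1 therefore has no logit to point at. The pure-Python function below is a simplified stand-in for the loss, not the library's implementation, and the logit values are made up.

```python
import math

def cross_entropy(logits, label):
    """Negative log-probability of the class at index `label`."""
    # Log of the softmax denominator, shifted by the max for stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    # The label is used directly as a position in the logit vector.
    return log_sum_exp - logits[label]

logits = [2.0, 0.5, -1.0]  # hypothetical model outputs for 3 classes

# Valid label: 0 <= label < 3, so there is a logit to compare against.
loss_ok = cross_entropy(logits, 0)

# Label outside 0..N-1: no matching logit exists, so the lookup fails.
try:
    cross_entropy(logits, 5)
    out_of_range_failed = False
except IndexError:
    out_of_range_failed = True

print(loss_ok, out_of_range_failed)
```

So the integers are not scores with a numeric meaning; they are just indices into the output vector, which is why they need to be consecutive and start at zero.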