I was trying to run AutoModelForSequenceClassification with num_labels=1 and with num_labels=2, but no matter what I changed, the Trainer kept throwing one unfixable error or another.
With num_labels=1 it ran, but the model learned absolutely nothing with the regression loss. I think there should be a BCEWithLogitsLoss there instead of the MSELoss, because MSELoss just doesn't do a good job of training the model on binary labels.
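For context, this is roughly how transformers picks the loss inside the `*ForSequenceClassification` forward pass when `problem_type` is not set on the config. This is a plain-Python sketch of that inference logic, not the actual library code, and it is what makes num_labels=1 fall into the MSELoss path:

```python
def infer_problem_type(num_labels, labels_dtype):
    # Sketch of the problem_type inference in transformers'
    # *ForSequenceClassification models when config.problem_type is None:
    if num_labels == 1:
        return "regression"                   # -> MSELoss on the raw logit
    if labels_dtype in ("long", "int"):
        return "single_label_classification"  # -> CrossEntropyLoss
    return "multi_label_classification"       # -> BCEWithLogitsLoss, targets [*, num_labels]

# num_labels=1 is always treated as regression, regardless of label dtype:
assert infer_problem_type(1, "float") == "regression"
# num_labels=2 with integer labels uses plain cross-entropy:
assert infer_problem_type(2, "long") == "single_label_classification"
# num_labels=2 with float labels expects one-hot-shaped [*, 2] targets:
assert infer_problem_type(2, "float") == "multi_label_classification"
```

So if the labels come through as floats with num_labels=2, you land in the multi-label branch, which would explain the [*, 2] target errors below.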
For num_labels=2, it doesn't work unless you set problem_type="multi_label_classification", because the trainer keeps asking for targets of size [*, 2] without the labels being expanded into one-hot vectors.
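In case it helps anyone hitting the same [*, 2] error, this is the expansion I mean, shown as a plain-Python sketch. With tensors you would do the equivalent with `torch.nn.functional.one_hot(labels, 2).float()` before handing the batch to the Trainer:

```python
def to_one_hot(label, num_labels=2):
    # Expand an integer class id into a float one-hot vector so that
    # BCEWithLogitsLoss receives targets of size [*, num_labels].
    vec = [0.0] * num_labels
    vec[label] = 1.0
    return vec

batch_labels = [0, 1, 1]
one_hot_targets = [to_one_hot(y) for y in batch_labels]
print(one_hot_targets)  # [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
```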
Now, this could be a bug, or there might be a specific way to make it work. BCEWithLogitsLoss is a much better option than MSELoss here, and I can't think of a reason why someone would use MSELoss for a binary target.
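One classical reason (sketched here with MSE applied after a sigmoid, which is slightly different from transformers' regression path, where MSE is on the raw logit): the MSE gradient carries a sigmoid-derivative factor that vanishes for confident predictions, even confidently wrong ones, while the BCE-with-logits gradient stays large until the prediction is actually corrected:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mse_grad(x, y):
    # d/dx (sigmoid(x) - y)^2: the extra p * (1 - p) factor shrinks the
    # gradient toward 0 whenever the prediction is confident, right or wrong.
    p = sigmoid(x)
    return 2.0 * (p - y) * p * (1.0 - p)

def bce_grad(x, y):
    # d/dx BCEWithLogits(x, y) = sigmoid(x) - y: stays near 1 while wrong.
    return sigmoid(x) - y

# Confidently wrong prediction: logit = 6, true label = 0.
x, y = 6.0, 0.0
print(abs(mse_grad(x, y)))  # ~0.005 -> learning stalls
print(abs(bce_grad(x, y)))  # ~0.998 -> strong corrective signal
```

That vanishing-gradient behavior is consistent with the model "learning absolutely nothing" in the num_labels=1 setup.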