How do I do multi-class (multi-head) classification?

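This comes down to the loss function, so you can definitely use BertForSequenceClassification with two labels (num_labels=2), take the logits it returns, and apply the proper loss function yourself (probably BCEWithLogitsLoss, which treats each label as an independent yes/no decision).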
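To make that concrete, here is a minimal sketch of the loss step only. It assumes the labels are independent (multi-hot) targets. The `logits` tensor is a random stand-in for what you would get from `model(**inputs).logits`, and the batch size, target values, and 0.5 threshold are illustrative assumptions, not something from your setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch_size, num_labels = 4, 2

# Stand-in for the model output: in practice this would be
# model(**inputs).logits with shape (batch_size, num_labels).
logits = torch.randn(batch_size, num_labels)

# Multi-hot targets: one independent 0/1 value per label.
# BCEWithLogitsLoss expects float targets with the same shape as the logits.
targets = torch.tensor([[1.0, 0.0],
                        [0.0, 1.0],
                        [1.0, 1.0],
                        [0.0, 0.0]])

# Applies a sigmoid per label internally, so pass raw logits.
loss_fn = nn.BCEWithLogitsLoss()
loss = loss_fn(logits, targets)

# At inference time, apply the sigmoid yourself and threshold each label.
probs = torch.sigmoid(logits)
preds = (probs > 0.5).int()
```

If the two labels are mutually exclusive (exactly one is true per example), `nn.CrossEntropyLoss` with integer class indices would probably be the better fit instead.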
