Data format for BertForSequenceClassification with num_labels > 2

Hi @maxpower, I think the format of your dataset is fine, but you need to change the model’s loss function so that it applies a sigmoid instead of a softmax to the logits (i.e. use BCEWithLogitsLoss). You can see a skeleton + hacky Colab in this thread: Fine-Tune for MultiClass or MultiLabel-MultiClass - #8 by lewtun
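To make the sigmoid vs. softmax difference concrete, here is a minimal pure-Python sketch of what a sigmoid-based binary cross-entropy computes on raw logits. It is not the code from the linked thread: the helper names (`sigmoid`, `softmax`, `bce_with_logits`), the example logits and the multi-hot targets are all made up for illustration. As far as I know, averaging over all labels matches the default `reduction="mean"` of `torch.nn.BCEWithLogitsLoss`, but treat this as a sketch rather than a reference implementation.

```python
import math

def sigmoid(z):
    # Independent probability for one label
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    # Probabilities that compete with each other and sum to 1
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def bce_with_logits(logits, targets):
    # Numerically stable binary cross-entropy on raw logits:
    # max(z, 0) - z * y + log(1 + exp(-|z|)), averaged over all labels
    losses = [
        max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
        for z, y in zip(logits, targets)
    ]
    return sum(losses) / len(losses)

# Hypothetical example: one sample, 4 labels, labels 0 and 2 are both active
logits = [2.0, -1.0, 1.5, -3.0]
targets = [1.0, 0.0, 1.0, 0.0]  # multi-hot floats, not a single class index

sigmoid_probs = [sigmoid(z) for z in logits]
softmax_probs = softmax(logits)

# With a sigmoid, every label is thresholded on its own
predicted = [i for i, p in enumerate(sigmoid_probs) if p > 0.5]
loss = bce_with_logits(logits, targets)
```

The point of the sketch is that softmax probabilities have to sum to 1, so two labels can never both be close to 1, whereas the sigmoid scores each label independently and lets several labels be active for the same example.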
