BertForSequenceClassification can be used for regression when the number of classes (`num_labels`) is set to 1. The documentation says that BertForSequenceClassification calculates cross-entropy loss for classification. What kind of loss does it return for regression?
(I’ve been assuming it is root mean square error, but I read recently that there are several other possibilities such as Huber or Negative Log Likelihood.)
At line 1354 of `modeling_bert.py`, the loss function is chosen based on `self.num_labels`: if it is 1, the model treats the task as regression and uses mean squared error; otherwise it uses cross-entropy:

```python
if self.num_labels == 1:
    # We are doing regression
    loss_fct = MSELoss()
    loss = loss_fct(logits.view(-1), labels.view(-1))
else:
    loss_fct = CrossEntropyLoss()
    loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
```

So for regression the returned loss is plain mean squared error (MSE), not root mean square error, Huber, or negative log likelihood.
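If it helps, here is a minimal sketch (not from the original post) that checks this behaviour: with `num_labels=1` the loss the model returns should match `torch.nn.MSELoss` recomputed from the logits. The checkpoint name `bert-base-uncased` and the toy inputs/targets are assumptions for illustration, and `outputs.loss` assumes a transformers version that returns a `ModelOutput`.

```python
import torch
from torch.nn import MSELoss
from transformers import BertTokenizer, BertForSequenceClassification

# num_labels=1 puts the head in regression mode (a single output value per example)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=1)

# Toy inputs and float targets, purely for illustration
inputs = tokenizer(["great movie", "terrible movie"], return_tensors="pt", padding=True)
labels = torch.tensor([4.5, 1.0])  # regression targets must be floats

outputs = model(**inputs, labels=labels)

# Recompute MSE from the logits and compare with the loss the model returned
manual_mse = MSELoss()(outputs.logits.view(-1), labels.view(-1))
print(outputs.loss.item(), manual_mse.item())  # the two values should match
```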