Which loss function does BertForSequenceClassification use for regression?

BertForSequenceClassification can be used for regression when the number of classes is set to 1. The documentation says that BertForSequenceClassification calculates cross-entropy loss for classification. What kind of loss does it return for regression?

(I’ve been assuming it is root mean square error, but I read recently that there are several other possibilities such as Huber or Negative Log Likelihood.)

Which is it?

How should I find out / where is the code?

This is the GitHub link

At line 1354, there is a condition that checks the number of labels (one, or more than one):
    if self.num_labels == 1:
        # We are doing regression
        loss_fct = MSELoss()
        loss = loss_fct(logits.view(-1), labels.view(-1))
    else:
        loss_fct = CrossEntropyLoss()
        loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
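To make the branch above concrete, here is a minimal standalone sketch that mirrors the quoted excerpt in plain PyTorch. The function name `sequence_classification_loss` and the toy tensors are made up for illustration, and the library's own code may differ between versions. It shows that with one label the result is the mean squared error, not its square root.

```python
import torch
from torch.nn import CrossEntropyLoss, MSELoss


def sequence_classification_loss(logits, labels, num_labels):
    """Hypothetical helper mirroring the quoted branch (not the library's API)."""
    if num_labels == 1:
        # Regression: mean squared error over the flattened predictions
        loss_fct = MSELoss()
        return loss_fct(logits.view(-1), labels.view(-1))
    # Classification: cross-entropy over num_labels classes
    loss_fct = CrossEntropyLoss()
    return loss_fct(logits.view(-1, num_labels), labels.view(-1))


# Toy regression batch: two predictions, both targets are 1.0
logits = torch.tensor([[0.5], [2.0]])
labels = torch.tensor([1.0, 1.0])

reg_loss = sequence_classification_loss(logits, labels, num_labels=1)
print(reg_loss.item())         # 0.625 = ((0.5 - 1)^2 + (2 - 1)^2) / 2
print(reg_loss.sqrt().item())  # the RMSE would be the square root of that
```

If you want RMSE, Huber or something else, you would have to compute it yourself from the logits rather than rely on the returned loss.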

You can select the lines that you are interested in on a GitHub code page, then click on the three dots and select "Copy permalink".

Thanks @BramVanroy, this makes the code easier to read. I will do so in future.

Thank you!

Can you pass parameters to the loss function through transformers? For example, can you add a weight for each of the classes?

Hi theudster,

The PyTorch 1.7.1 documentation for CrossEntropyLoss suggests that you can pass a weight tensor. What happens if you try it?
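For reference, a small sketch of what the `weight` argument does in plain PyTorch. The class weights and logits are invented toy values. With the default mean reduction, each sample's loss is scaled by the weight of its target class and the sum is divided by the sum of those weights.

```python
import torch
from torch.nn import CrossEntropyLoss

# Toy values for illustration: class 1 counts three times as much as class 0
class_weights = torch.tensor([1.0, 3.0])
logits = torch.tensor([[2.0, 0.0], [0.0, 2.0], [2.0, 0.0]])
labels = torch.tensor([0, 1, 1])  # the last sample is misclassified

weighted = CrossEntropyLoss(weight=class_weights)(logits, labels)
unweighted = CrossEntropyLoss()(logits, labels)

# The error on the up-weighted class 1 pushes the weighted loss higher
print(unweighted.item(), weighted.item())
```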

I haven’t tried that because I am trying to implement everything through the Trainer method.
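One commonly used route that stays within Trainer is to subclass it and override `compute_loss`. The following is only a sketch under assumptions: `WeightedLossTrainer` and `class_weights` are hypothetical names, the model is assumed to return an output with a `logits` entry, and the extra `**kwargs` is there because the exact `compute_loss` signature has changed between transformers versions.

```python
import torch
from torch.nn import CrossEntropyLoss
from transformers import Trainer


class WeightedLossTrainer(Trainer):
    """Hypothetical Trainer subclass that applies per-class weights to the loss."""

    def __init__(self, *args, class_weights=None, **kwargs):
        super().__init__(*args, **kwargs)
        # class_weights: 1-D float tensor with one entry per class (assumption)
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # Take the labels out so the model does not compute its own loss
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs["logits"]
        weight = None
        if self.class_weights is not None:
            weight = self.class_weights.to(logits.device)
        loss_fct = CrossEntropyLoss(weight=weight)
        loss = loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))
        return (loss, outputs) if return_outputs else loss
```

You would then build it like a normal Trainer, passing something like `class_weights=torch.tensor([1.0, 3.0])` in addition to the usual model, args and datasets.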