How to fine-tune BERT on the STS-B task?

Hi, I am new to NLP and trying to reproduce the fine-tuning results of BERT. However, the STS-B task troubles me. From what I understand, STS-B is a regression task, but BERT seems to treat it as a classification task. I do not quite understand how the similarity scores are transformed into labels. Is anybody willing to give me a hint?

This is all dealt with in the loss function: a model for classification and a model for regression are roughly the same, they just output a different number of labels. Inside the code of `BertForSequenceClassification`, there is a check that picks a different loss function depending on `problem_type`. By default, 1 label (as in STS-B) is treated as regression, so mean-squared error is selected as the loss instead of cross-entropy. The STS-B similarity scores (floats from 0 to 5) are used directly as regression targets against the model's single output logit.

Thank you for your detailed reply, it really helped me :wink: