Hi @sundaravel, you can check the source code for BertForSequenceClassification
here. It also includes the code path for regression problems.
Specifically, for regression your last layer will be of shape (hidden_size, 1), and you use MSE loss instead of cross-entropy.
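
To make that concrete, here is a minimal sketch in plain PyTorch. It is not the library's actual code: `RegressionHead` is a hypothetical name, the dropout value is an assumption, and a random tensor stands in for BERT's pooled output so the snippet runs without downloading a model.

```python
import torch
from torch import nn


class RegressionHead(nn.Module):
    """Hypothetical head: maps a pooled encoder output to one real value."""

    def __init__(self, hidden_size: int, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        # hidden_size inputs -> 1 output (PyTorch stores the weight as (1, hidden_size))
        self.regressor = nn.Linear(hidden_size, 1)

    def forward(self, pooled_output: torch.Tensor) -> torch.Tensor:
        # (batch, hidden_size) -> (batch, 1) -> (batch,)
        return self.regressor(self.dropout(pooled_output)).squeeze(-1)


torch.manual_seed(0)
hidden_size, batch_size = 768, 4

# Stand-in for the pooled [CLS] representation a BERT encoder would produce.
pooled_output = torch.randn(batch_size, hidden_size)
targets = torch.randn(batch_size)  # real-valued labels, one per example

head = RegressionHead(hidden_size)
predictions = head(pooled_output)

# MSE loss replaces the cross-entropy loss used for classification.
loss = nn.MSELoss()(predictions, targets)
loss.backward()
```

If you would rather not write the head yourself: as far as I know, creating BertForSequenceClassification with num_labels=1 makes it take the regression path and use MSE loss, but check that against the source for the version you have installed.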