I’d like to train BERT from scratch on my custom corpus for the masked language modeling (MLM) task. But the corpus has one peculiarity: it is a sequence of numbers, and the absolute value of the difference between two words corresponds to their proximity. Therefore I guess I should use this difference (or something similar) as the loss function during training. Is it possible to use a custom loss function when training a BERT model for the MLM task?
You can compute the loss outside of your model since it returns the logits, and apply any function you like.
If your question was related to the Trainer, you should define your own subclass with a compute_loss method. There is an example in the documentation (scroll a bit down).
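For reference, a minimal sketch of such a subclass, following the pattern shown in the documentation. The cross-entropy criterion here is only a placeholder; a custom loss (e.g. one derived from the absolute difference between token values) would go in its place. For an MLM head, the last logits dimension is the vocabulary size rather than `num_labels`:

```python
import torch
from torch import nn
from transformers import Trainer


class CustomLossTrainer(Trainer):
    """Trainer subclass that overrides how the training loss is computed."""

    def compute_loss(self, model, inputs, return_outputs=False):
        # Pop the labels so the model does not compute its own loss.
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits
        # Placeholder criterion: any differentiable loss can go here.
        loss_fct = nn.CrossEntropyLoss()
        loss = loss_fct(
            # For MLM, replace num_labels with model.config.vocab_size.
            logits.view(-1, self.model.config.num_labels),  # (batch*seq_len, C)
            labels.view(-1),                                # (batch*seq_len,)
        )
        return (loss, outputs) if return_outputs else loss
```

You would then construct `CustomLossTrainer` with the same arguments as a regular `Trainer` and call `.train()` as usual.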
In the link you attached above, I have a question related to the example. Why do we need this line of code when computing the loss?
loss = loss_fct(logits.view(-1, self.model.config.num_labels), labels.view(-1))
I mean the .view() method. Why do we have to reshape the logits tensors?
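For context, here is a sketch of the shapes involved (assuming a token-level task where the model emits logits of shape `(batch, seq_len, num_labels)`). `nn.CrossEntropyLoss` expects a 2-D input of shape `(N, C)` and a 1-D target of shape `(N,)`, so both tensors are flattened across the batch and sequence dimensions first:

```python
import torch
from torch import nn

batch_size, seq_len, num_labels = 2, 8, 5

# Hypothetical model outputs and targets for a token-level task.
logits = torch.randn(batch_size, seq_len, num_labels)         # (2, 8, 5)
labels = torch.randint(0, num_labels, (batch_size, seq_len))  # (2, 8)

# nn.CrossEntropyLoss expects input (N, C) and target (N,), so the
# batch and sequence dimensions are merged into a single axis.
flat_logits = logits.view(-1, num_labels)  # (16, 5)
flat_labels = labels.view(-1)              # (16,)

loss = nn.CrossEntropyLoss()(flat_logits, flat_labels)
print(flat_logits.shape, flat_labels.shape, loss.shape)
```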