Selective masking in language modeling

Hi Huggingfacers

I have a couple of questions about fine-tuning a language model:

  1. How can I mask a selected portion of a given input sentence instead of masking randomly?
  2. If I am using ALBERT, for example, and want to apply a loss function other than the standard MLM loss to the masked tokens, how can I access the model's outputs at the masked positions?
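For question 1, here is a minimal sketch of what I mean by selective masking (the helper name and the toy ids are illustrative; in practice the ids and `mask_token_id` would come from a Hugging Face tokenizer). Instead of masking tokens at random like the standard MLM data collator, the caller chooses the positions to mask:

```python
def selective_mask(input_ids, positions, mask_token_id):
    """Replace the tokens at `positions` with the mask token id.

    Returns (masked_ids, labels), where labels is -100 everywhere except
    at the masked positions, matching the convention of MLM losses that
    ignore index -100.
    """
    masked_ids = list(input_ids)
    labels = [-100] * len(input_ids)  # -100 = ignored by the loss
    for pos in positions:
        labels[pos] = input_ids[pos]     # remember the original token
        masked_ids[pos] = mask_token_id  # overwrite with [MASK]
    return masked_ids, labels


# Example: mask positions 1 and 2 of a toy id sequence (id 4 stands in
# for the tokenizer's mask token id, e.g. tokenizer.mask_token_id).
ids = [101, 7592, 2088, 102]
masked, labels = selective_mask(ids, [1, 2], mask_token_id=4)
print(masked)  # [101, 4, 4, 102]
print(labels)  # [-100, 7592, 2088, -100]
```

Keeping the `labels = -100` convention means the result can still be fed to the usual masked-LM loss, while the masked positions themselves are fully under your control.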

Refer to: https://colab.research.google.com/github/huggingface/blog/blob/master/notebooks/01_how_to_train.ipynb#scrollTo=M1oqh0F6W3ad
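For question 2, a hedged sketch of what I have in mind (names are illustrative): once you know which positions were masked, you can gather the model's per-token outputs at exactly those positions and feed them to any custom loss. With `transformers`, the per-token outputs would come from something like `model(...).last_hidden_state`; here plain Python lists stand in for tensors:

```python
def gather_masked_outputs(sequence_output, labels):
    """Pick out the per-token output vectors wherever labels != -100.

    `sequence_output` is a list of per-token vectors (seq_len x hidden);
    `labels` follows the MLM convention of -100 for non-masked tokens.
    Returns the selected vectors and their original token ids.
    """
    picked = [(vec, lab) for vec, lab in zip(sequence_output, labels)
              if lab != -100]
    vectors = [v for v, _ in picked]
    targets = [t for _, t in picked]
    return vectors, targets


# Toy example: 4 tokens with 2-dim "hidden states"; positions 1 and 2
# were masked, so labels hold the original token ids there.
outputs = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
labels = [-100, 7592, 2088, -100]
vecs, targets = gather_masked_outputs(outputs, labels)
print(len(vecs))  # 2
print(targets)    # [7592, 2088]
```

In a real training loop the same selection is usually done with a boolean mask on the tensor, e.g. `last_hidden_state[labels != -100]`, and the selected vectors then go into the custom loss in place of the standard MLM head.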

Mask Code:
