Hello,
I’m using a custom tokenizer with a RoBERTa model, and I want to pre-train it on the masked language modeling (MLM) objective.
Is there a way to mask a span of tokens instead of randomly selecting which to mask?
I know there is a whole-word-masking collator (DataCollatorForWholeWordMask), but it is incompatible with my tokenizer, and I also want more flexibility in selecting the sequence length.
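In case it helps to show what I mean, here is a rough sketch of the kind of collator I have in mind — the class name, `span_length`, and `mask_ratio` are just placeholders I made up, not anything from the library:

```python
import random
import torch
from transformers import PreTrainedTokenizerBase


class SpanMaskingCollator:
    """Hypothetical sketch: mask contiguous spans of tokens instead of
    independently sampled positions."""

    def __init__(self, tokenizer: PreTrainedTokenizerBase,
                 span_length: int = 3, mask_ratio: float = 0.15):
        self.tokenizer = tokenizer
        self.span_length = span_length   # length of each masked span (made-up default)
        self.mask_ratio = mask_ratio     # rough fraction of tokens to mask per sequence

    def __call__(self, examples):
        # `examples` are dicts with "input_ids" coming straight from the tokenizer
        batch = self.tokenizer.pad(examples, return_tensors="pt")
        input_ids = batch["input_ids"]
        labels = input_ids.clone()

        # Never mask special tokens (<s>, </s>, padding, ...)
        special = torch.tensor(
            [self.tokenizer.get_special_tokens_mask(
                ids.tolist(), already_has_special_tokens=True)
             for ids in input_ids],
            dtype=torch.bool,
        )

        masked = torch.zeros_like(input_ids, dtype=torch.bool)
        for row in range(input_ids.size(0)):
            valid = (~special[row]).nonzero(as_tuple=True)[0].tolist()
            if not valid:
                continue
            budget = max(1, int(len(valid) * self.mask_ratio))
            # Keep dropping spans at random start positions until the budget is met
            while masked[row].sum().item() < budget:
                start = random.choice(valid)
                end = min(start + self.span_length, input_ids.size(1))
                masked[row, start:end] = ~special[row, start:end]

        labels[~masked] = -100                         # compute loss only on masked positions
        input_ids[masked] = self.tokenizer.mask_token_id
        batch["input_ids"] = input_ids
        batch["labels"] = labels
        return batch
```

The idea would be to pass an instance of this as `data_collator` to the Trainer. Is something like this a reasonable approach, or is there built-in support I’m missing?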
Thanks!