Have specific tokens in ELECTRA/BERT not backpropagate

I'm running a two-class token classification (NER) task with a transformer, and I would like specific words in a segment not to backpropagate. For example, take the sentence "the dog ran", and say I don't want "ran" to contribute to backpropagation. If I could use one-hot encoded label vectors with a suitable loss function, I could simply set the label vector corresponding to "ran" to [0, 0]. But PyTorch's CrossEntropyLoss expects class indices rather than one-hot vectors. According to one of the software engineers at PyTorch, I could create an additional class index, assign the desired words to that index, and have it ignored in the loss calculation by specifying ignore_index in the criterion. But I don't believe there is an ignore_index parameter for the transformer modules.
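For context, here is a minimal standalone sketch of the ignore_index approach the engineer suggested, computed outside the transformer rather than through its built-in loss. The tensor shapes, the random logits, and the -100 sentinel value are my own assumptions for illustration, not part of any model's API:

```python
import torch
import torch.nn as nn

# Hypothetical logits for the 3 tokens of "the dog ran" over 2 classes,
# shaped (batch, seq_len, num_classes) as a token-classification head
# would produce them; random values stand in for real model output.
torch.manual_seed(0)
logits = torch.randn(1, 3, 2, requires_grad=True)

# Class indices for "the" and "dog"; -100 is the sentinel marking "ran"
# so the loss skips it and it receives no gradient.
IGNORE_INDEX = -100
labels = torch.tensor([[0, 1, IGNORE_INDEX]])

criterion = nn.CrossEntropyLoss(ignore_index=IGNORE_INDEX)
loss = criterion(logits.view(-1, 2), labels.view(-1))
loss.backward()

# The gradient at the ignored position ("ran") is exactly zero.
print(logits.grad[0, 2])
```

The same idea applies when the model returns raw logits: skip the model's internal loss, flatten the logits and labels, and pass them through a criterion configured with ignore_index.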

So is there another way to achieve this?