Masked language modeling loss

This thread might help you.