ELECTRA: Accounting for mask tokens that are correctly predicted by MLM

Hi @rsvarma !

FYI, my ELECTRA reimplementation has successfully replicated the results in the paper.

:computer: Code: https://github.com/richarddwang/electra_pytorch
:newspaper: Post: ELECTRA training reimplementation and discussion
