Pre-trained DeBERTa


I wanted to use DeBERTa, but the masked-token predictions in its unmasking preview seem very poor.
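For reference, a minimal sketch of how the unmasking behaviour can be reproduced locally (the `microsoft/deberta-base` checkpoint and the Hugging Face `fill-mask` pipeline are assumptions about the setup, not part of the original post):

```python
from transformers import pipeline

# Load a pre-trained DeBERTa checkpoint into a fill-mask pipeline.
# (Checkpoint name is an assumption; other DeBERTa variants can be swapped in.)
unmasker = pipeline("fill-mask", model="microsoft/deberta-base")

# Ask the model to predict the masked token.
predictions = unmasker("Paris is the [MASK] of France.")

# Each prediction carries a candidate token and its score.
for p in predictions:
    print(p["token_str"], round(p["score"], 4))
```

By default the pipeline returns the top five candidates; inspecting `token_str` and `score` makes it easy to see how plausible (or not) the predictions are.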

I looked at the source code and cannot find where the absolute position information is added.

Can someone explain why the model performs so badly at the MLM preview? Maybe I overlooked the addition of the absolute positions in the source code. An explanation of the implementation would be really helpful as well!

Thank you!