Finetune fill-mask network

Hello,
I’m trying to fine-tune an xlm-roberta-base model for a multilingual fill-mask (masked language modeling) task.

Is there any code (for example in this repo) that I can refer to?

Is there a better network than xlm-roberta-base for fine-tuning on a fill-mask task (for example, a T5 model)?

Thanks.