I wanted to run some experiments with additional pretraining on a multilingual RoBERTa. I prefer using Flax because I already have a working RoBERTa setup.
I noticed there are no Flax model classes for xlm-roberta. However, the documentation also says that its implementation is similar to RoBERTa's. Can I pretrain xlm-roberta in Flax using the RoBERTa model classes?