PreTrain RoBERTa from scratch in Hindi

Hey @mlkorra, @skylord, @Mrinal, please join this Discord server, Flax-HuggingFace-Community-Week. You should be added directly to the #roberta-pretraining-hindi channel.

Added you guys!

Hi, I’d like to join this team as well, if there’s space available. I’m a newbie and will probably need some guidance/directions.

I’m very interested in joining this team.
Could you please add me?

@patrickvonplaten @valhalla I am working on something similar. Would it be possible to have a look at the approach used to train the WordPiece/SentencePiece tokenizer for Hindi?
Or did you start with the multilingual BERT tokenizer?