Currently, there is no ELECTRA or ELECTRA Large model that was trained from scratch for Portuguese on the hub: Hugging Face – The AI community building the future.. For this project, the goal is to create a ELECTRA model for just the Portuguese language.
Model
A randomly initialized ELECTRA model
Available training scripts
A masked language modeling script for Flax is available here. It can be used pretty much without any required code changes.
(Optional) Desired project outcome
The desired project output is a strong ELECTRA model in Portuguese.
ELECTRA actually uses a special pretraining script, which would need to be written during the sprint - if you are very motivated I’m willing to confirm this “single-person” project and give you a TPU VM. Please leave a comment on the official googe sheet in this case: Confirmed teams for Flax/JAX community week - Google Sheets
Hi Patrick, I am also interested in training Electra for Portuguese. I don’t know if this is currently under development or already has been done, if not, I would like to implement this pretraining script and train the model for Portuguese.