How to pretrain ELECTRA on a custom dataset?

I have a 1GB raw text dataset in a niche domain. I want to train an ELECTRA model, but I couldn’t find any tutorials or examples for doing so. Can anyone help? I tried the simpletransformers package, but it currently has memory issues: after a few epochs my Colab session crashes.

pinging @lysandre

Do you want to fine-tune or pre-train an ELECTRA model? If you want to fine-tune it, you can leverage the examples/run_language_modeling.py script.
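For reference, an invocation of that script might look roughly like the following. This is a sketch, not a verified command: it assumes the flags the legacy run_language_modeling.py example exposed (`--model_type`, `--mlm`, `--train_data_file`, etc.), and the dataset path and output directory are placeholders you would replace with your own.

```shell
# Sketch: fine-tune the small ELECTRA generator with the MLM objective.
# Paths and hyperparameters below are placeholders, not tested values.
python examples/run_language_modeling.py \
    --model_type electra \
    --model_name_or_path google/electra-small-generator \
    --train_data_file path/to/your_corpus.txt \
    --do_train \
    --mlm \
    --line_by_line \
    --output_dir ./electra-mlm-finetuned
```

Note that `--mlm` implies the generator checkpoint is the natural starting point here, since the discriminator was never trained with a masked-LM head.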

If you want to pre-train it, your best bet is to use the original implementation (in TF1) and then convert it to our library using our conversion script which is here.
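As a rough sketch of that workflow, assuming the conversion script shipped with the library (`convert_electra_original_tf_checkpoint_to_pytorch.py`) and placeholder paths for your TF1 checkpoint and config:

```shell
# Sketch: convert an original TF1 ELECTRA checkpoint to the library's
# PyTorch format. All paths below are placeholders for your own files.
python src/transformers/models/electra/convert_electra_original_tf_checkpoint_to_pytorch.py \
    --tf_checkpoint_path path/to/electra_tf_checkpoint \
    --config_file path/to/config.json \
    --pytorch_dump_path ./electra-pytorch \
    --discriminator_or_generator discriminator
```

Run it once with `discriminator` and once with `generator` if you need both halves of the model.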

Here’s a PR for pre-training ELECTRA with our library, but it isn’t working yet and we don’t currently have the bandwidth to get back to it.

@lysandre I am planning to pretrain the model, but I would also like to give the examples/run_language_modeling.py script a shot. How can I fine-tune the base ELECTRA model on the MLM task with my data? The script is a bit ambiguous.

I’m not sure fine-tuning ELECTRA with MLM is a good idea, since the main idea behind ELECTRA was to train it as a discriminator rather than a generator, precisely to overcome the shortcomings of MLMs.
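To make that distinction concrete, here is a toy sketch of ELECTRA’s replaced-token-detection objective. The `generator_fill` callable is a hypothetical stand-in for the small generator model; the point is that the discriminator receives a label for every position (original vs. replaced), rather than predicting only the masked tokens as in MLM.

```python
import random

def make_rtd_example(tokens, generator_fill, mask_prob=0.15, seed=0):
    """Build a toy replaced-token-detection (RTD) training example.

    MLM trains a generator to predict masked-out tokens; ELECTRA instead
    trains a discriminator to classify EVERY position as original (0) or
    replaced (1). `generator_fill(i, tokens)` is a hypothetical helper
    standing in for the generator: it proposes a replacement for
    position i.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = [0] * len(tokens)          # 0 = original, 1 = replaced
    for i in range(len(tokens)):
        if rng.random() < mask_prob:
            proposal = generator_fill(i, tokens)
            if proposal != tokens[i]:   # only count true replacements
                corrupted[i] = proposal
                labels[i] = 1
    return corrupted, labels

# Example: a trivial "generator" that always proposes the same token.
tokens = ["the", "cat", "sat", "on", "the", "mat"]
corrupted, labels = make_rtd_example(tokens, lambda i, t: "dog", mask_prob=0.5)
```

This is why fine-tuning the discriminator checkpoint with an MLM head departs from its pretraining setup, while the generator checkpoint was actually trained with MLM.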

Any thoughts, @lysandre?
