Retraining an ELECTRA model with different embeddings from scratch

Hello all,

I have implemented an ELECTRA model with different (ELMo-based) embeddings. I based my code on the BERT version from GitHub - helboukkouri/character-bert: Main repository for "CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters". However, I have an issue with training. If I train plain ELECTRA from scratch, everything works correctly and the loss decreases.
However, when I try to train the new version, the loss gets stuck and oscillates (the same happened with the original BERT as well). I have tried training the ELMo part separately with a different network, and it seems to work as it should.
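
For context, the embedding swap is conceptually something like the sketch below (simplified PyTorch with illustrative names, not my exact code): the word-piece embedding lookup is replaced by an ELMo-style character-CNN module, while the position/type embeddings and the rest of ELECTRA stay unchanged.

```python
import torch.nn as nn

class CharacterEmbeddings(nn.Module):
    """ELMo-style character-CNN embeddings (illustrative stand-in for the
    CharacterBERT module): one vector per token, built from its characters."""
    def __init__(self, num_chars, char_dim, hidden_size):
        super().__init__()
        self.char_embed = nn.Embedding(num_chars, char_dim)
        self.cnn = nn.Conv1d(char_dim, hidden_size, kernel_size=3, padding=1)

    def forward(self, char_ids):
        # char_ids: (batch, seq_len, max_chars_per_token)
        b, s, c = char_ids.shape
        x = self.char_embed(char_ids.view(b * s, c))   # (b*s, c, char_dim)
        x = self.cnn(x.transpose(1, 2))                # (b*s, hidden, c)
        x = x.max(dim=-1).values                       # max-pool over characters
        return x.view(b, s, -1)                        # (b, s, hidden)
```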
Any ideas on how to debug this, or how to find out what is wrong?
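
The only diagnostic I can think of so far is logging per-parameter gradient norms after the backward pass, to check whether gradients actually reach the new embedding module (again just a sketch):

```python
def log_grad_norms(model):
    # Print the gradient norm of every parameter after loss.backward();
    # zero norms on the embedding module would suggest a broken graph,
    # huge norms an exploding-gradient / learning-rate problem.
    for name, param in model.named_parameters():
        if param.grad is not None:
            print(f"{name}: {param.grad.norm().item():.3e}")
```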

Thank you all