ELECTRA training reimplementation and discussion

RichardWang · September 25, 2020, 12:06am

I don’t know whether gumbel-softmax can be for text generation or not, but there is the paper.
As for implementation, create an dist = torch.distributions.gumbel.Gumbel(0.,1.) and add gumbel noise to the output logits logits = T5(...)[0] and new_logits = logits + self.gumbel_dist.sample(logits.shape). You could also see my code.

Topic		Replies	Views
How pretrain ELECTRA on custom dataset? Beginners	5	4145	September 6, 2020
ELECTRA: Accounting for mask tokens that are correctly predicted by MLM 🤗Transformers	9	1283	May 15, 2021
ELECTRA Paper Doubts Research	0	219	September 8, 2023
Retrain Electra model with different embedings from scratch Beginners	0	317	April 1, 2022
Electra-base returns always same output 🤗Transformers	0	226	February 5, 2023

ELECTRA training reimplementation and discussion

Related topics