Causal LM on GLUE dataset

ndvb · September 2, 2023, 2:38pm

I am able to benchmark masked LMs such as BERT, DeBERTa, and RoBERTa on the GLUE dataset using the run_glue.py script provided in transformers.
However, when I change the model to a causal LM, training does not progress well: the training loss immediately goes to zero.
I suspect the loss function is different for causal LMs. Can a causal LM be benchmarked on GLUE this way?
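
For reference, here is a minimal sketch of how a causal LM can be given a sequence-classification head in transformers, so the loss is a cross-entropy over the class labels rather than a next-token loss. It assumes GPT-2 as the causal model and uses a tiny, randomly initialised config (all sizes, token ids, and inputs below are made up for illustration), so it is not a reproduction of the setup above or a diagnosis of the zero loss. One detail worth noting is that this head pools the last non-padding token, so `pad_token_id` has to be set in the config when batching.

```python
import torch
from transformers import GPT2Config, GPT2ForSequenceClassification

torch.manual_seed(0)

# Tiny, randomly initialised GPT-2 so the sketch runs without downloading weights.
# All sizes and token ids here are illustrative assumptions.
config = GPT2Config(
    vocab_size=100,
    n_positions=32,
    n_embd=16,
    n_layer=1,
    n_head=2,
    bos_token_id=1,
    eos_token_id=2,
    pad_token_id=0,   # needed so the head can locate the last non-pad token per row
    num_labels=2,     # e.g. a binary GLUE task
)
model = GPT2ForSequenceClassification(config)
model.eval()

# Two right-padded toy sequences and their class labels.
input_ids = torch.tensor([[5, 6, 7, 8], [9, 10, 0, 0]])
attention_mask = (input_ids != config.pad_token_id).long()
labels = torch.tensor([1, 0])

with torch.no_grad():
    out = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)

logits_shape = tuple(out.logits.shape)  # one row of class logits per sequence
loss_value = out.loss.item()            # cross-entropy over the class labels
print(logits_shape, loss_value)
```

With a pretrained checkpoint, the equivalent would presumably be `AutoModelForSequenceClassification.from_pretrained(...)` plus setting a pad token on both tokenizer and model config, since GPT-2 style tokenizers do not define one by default.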
