I am trying to fine-tune an XLM-RoBERTa model for masked language modeling on a dataset of 9000 lemmatized sentences. I am using the XLMRobertaForMaskedLM
class from the Hugging Face Transformers library and training with a batch size of 8. However, even after training for 3 epochs, the model performs poorly even on the training data. As a sanity check, I tried training the model on a single sentence for 50 epochs: the loss decreases to around 10^-10, yet the model still doesn't predict the masked tokens correctly. I'm a beginner, so please help me understand why the model cannot learn to predict even a single sentence correctly after 50 training epochs.
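For reference, here is roughly what my single-sentence test looks like. The sentence, the mask position, and the learning rate below are placeholders for illustration, not my exact values, and I am loading the standard xlm-roberta-base checkpoint:

```python
import torch
from transformers import XLMRobertaTokenizer, XLMRobertaForMaskedLM

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
model = XLMRobertaForMaskedLM.from_pretrained("xlm-roberta-base")

sentence = "The quick brown fox jumps over the lazy dog."  # placeholder sentence
inputs = tokenizer(sentence, return_tensors="pt")

# Keep the original token ids as labels, then mask one position in the input.
labels = inputs["input_ids"].clone()
masked_input_ids = inputs["input_ids"].clone()
mask_pos = 4  # arbitrary position inside the sentence, for illustration
masked_input_ids[0, mask_pos] = tokenizer.mask_token_id

# Set labels to -100 everywhere except the masked position, so the loss is
# computed only on the token the model has to fill in.
labels[masked_input_ids != tokenizer.mask_token_id] = -100

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(50):
    outputs = model(
        input_ids=masked_input_ids,
        attention_mask=inputs["attention_mask"],
        labels=labels,
    )
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# After training, check what the model predicts at the masked position.
model.eval()
with torch.no_grad():
    logits = model(
        input_ids=masked_input_ids,
        attention_mask=inputs["attention_mask"],
    ).logits
predicted_id = logits[0, mask_pos].argmax(-1)
print(tokenizer.decode(predicted_id))
```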