I am working on Arabic question generation using arabert_base and the mMARCO dataset. I am following the BERT2BERT for CNN/DailyMail notebook and the training notebook from the Arabic Empathetic Chatbot repo.
The problem is that all the metrics (ROUGE, BLEU, METEOR) come out as zero, and the generated output is [CLS] [CLS] [CLS] [CLS] [CLS] [CLS] [CLS] [CLS] [CLS], i.e. the CLS token repeated until the sequence reaches the maximum length. I am training on a small subset first to check that the model works before running the full training.
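For context, this is roughly the model setup I copied from the BERT2BERT notebook, adapted to AraBERT. The checkpoint name and the config values below are placeholders (my notebook may use slightly different ones):

```python
from transformers import AutoTokenizer, EncoderDecoderModel

# assumption: stand-in for the arabert_base checkpoint I actually use
checkpoint = "aubmindlab/bert-base-arabert"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# tie two AraBERT checkpoints into an encoder-decoder model
model = EncoderDecoderModel.from_encoder_decoder_pretrained(checkpoint, checkpoint)

# special-token / generation settings, as in the BERT2BERT notebook
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.eos_token_id = tokenizer.sep_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.config.vocab_size = model.config.encoder.vocab_size
model.config.max_length = 64
model.config.no_repeat_ngram_size = 3
model.config.early_stopping = True
model.config.num_beams = 4
```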
I want to know whether the small training set (10,000 samples) is responsible for the problem, or the preprocessing method (rough sketch below), or something else.
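This is the general shape of the preprocessing I am mirroring from those notebooks; the column names and max lengths are placeholders for my mMARCO fields, not copied verbatim from my notebook:

```python
from transformers import AutoTokenizer

# assumption: same stand-in checkpoint as above
tokenizer = AutoTokenizer.from_pretrained("aubmindlab/bert-base-arabert")

encoder_max_length = 128
decoder_max_length = 64

def process_batch(batch):
    # tokenize the source passages for the encoder
    inputs = tokenizer(batch["passage"], padding="max_length",
                       truncation=True, max_length=encoder_max_length)
    # tokenize the target questions for the decoder
    outputs = tokenizer(batch["question"], padding="max_length",
                        truncation=True, max_length=decoder_max_length)

    batch["input_ids"] = inputs.input_ids
    batch["attention_mask"] = inputs.attention_mask
    # pad tokens in the labels are replaced with -100 so the loss ignores them
    batch["labels"] = [
        [-100 if token == tokenizer.pad_token_id else token for token in labels]
        for labels in outputs.input_ids
    ]
    return batch

# applied with datasets.Dataset.map(process_batch, batched=True, remove_columns=...)
```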
Please check my notebook.