Arabic-to-French word embeddings using skip-gram: need new ideas for the data part

We were going to do the same thing as in the paper linked below, but for Arabic to French, using a skip-gram model. See the slides linked below for details.

But today the doctor said he prefers that each team add something of its own to the project.

Some suggestions I noted from the doctor:

1- External evaluation: find a new idea for external evaluation

2- More preprocessing steps
Make sure each column contains its own language in every row.
In other words, the Arabic column has Arabic sentences
and the French column has French sentences.
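A rough sketch of how the column check could look, using only the Python standard library. The function names, the 0.8 threshold, and the toy pairs are my own assumptions, not something from the paper or the doctor. It only checks the script (Arabic vs. Latin letters), so it would catch swapped or mixed rows but it cannot tell French from English; that would need a real language-identification tool.

```python
import unicodedata


def script_ratio(text, script):
    """Return the fraction of letters in `text` whose Unicode name starts with `script`."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    hits = sum(1 for ch in letters if unicodedata.name(ch, "").startswith(script))
    return hits / len(letters)


def filter_pairs(pairs, threshold=0.8):
    """Keep only (arabic, french) pairs where each side is mostly in the expected script.

    `threshold` is an assumed cutoff; tune it on your own data.
    """
    kept = []
    for ar, fr in pairs:
        ar_ok = script_ratio(ar, "ARABIC") >= threshold
        fr_ok = script_ratio(fr, "LATIN") >= threshold
        if ar_ok and fr_ok:
            kept.append((ar, fr))
    return kept


# Toy rows (made up for illustration): one clean, one swapped, one mixed.
pairs = [
    ("مرحبا بالعالم", "Bonjour le monde"),
    ("Bonjour", "مرحبا"),
    ("hello مرحبا", "Salut"),
]
clean = filter_pairs(pairs)
print(len(clean), "of", len(pairs), "pairs kept")
```

Rows that fail the check could also be logged instead of dropped, so you can see whether the columns are just swapped and fix them.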
3- Data augmentation
You have 2 million items of Arabic-to-French data.
Train a model from Arabic to French.

Then translate from French to Arabic using your model; then you will have, let's say, 20 million words.

Then train your model on the 20 million, with Arabic as the source and French as the target,
where most of the Arabic data is new while the target is the human French data.

The paper that did this is called Back translation.
The tool is called "back 7" or something; I forgot.
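The augmentation steps above could be sketched roughly like this. It is only the data-building part, not a real translation model: `fr_to_ar` is a hypothetical stand-in for whatever French-to-Arabic translator you end up with, and the toy lexicon is made up. I am assuming, as back-translation is usually described, that the synthetic Arabic comes from a model running in the French-to-Arabic direction and that the French side stays human-written.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (arabic_source, french_target)


def back_translate(monolingual_fr: List[str],
                   fr_to_ar: Callable[[str], str]) -> List[Pair]:
    """Create synthetic (arabic, french) pairs from human French sentences."""
    synthetic = []
    for fr in monolingual_fr:
        ar = fr_to_ar(fr)
        if ar.strip():  # skip sentences the translator could not handle
            synthetic.append((ar, fr))
    return synthetic


def build_training_set(parallel: List[Pair],
                       monolingual_fr: List[str],
                       fr_to_ar: Callable[[str], str]) -> List[Pair]:
    """Combine the human parallel data with the back-translated pairs."""
    return list(parallel) + back_translate(monolingual_fr, fr_to_ar)


# Hypothetical stand-in translator: a tiny word lookup, for illustration only.
TOY_LEXICON = {"bonjour": "مرحبا", "monde": "عالم"}


def toy_fr_to_ar(sentence: str) -> str:
    words = [TOY_LEXICON[w] for w in sentence.lower().split() if w in TOY_LEXICON]
    return " ".join(words)


parallel = [("كتاب", "livre")]
monolingual_fr = ["Bonjour monde", "xyz"]
training = build_training_set(parallel, monolingual_fr, toy_fr_to_ar)
print(len(training), "training pairs")
```

In a real setup `toy_fr_to_ar` would be replaced by the trained model, and it may be worth tagging or down-weighting the synthetic pairs so they can be told apart from the human ones.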

He also said most of the ideas should be in the data, so from what I understand he is leaning toward us adding ideas in the data part.

Paper link: ArbEngVec : Arabic-English Cross-Lingual Word Embedding Model
Slides link: Our project slides

If you can give us new ideas, we would very much appreciate it.
Thank you,
