How to train a translation model WITHOUT VALIDATION data?

sancelot · November 8, 2023, 8:12am

Hi,

I trained an M2M100 model on a technical/mechanical corpus in 6 languages, built from existing translated texts.
The training data should let the model learn the new vocabulary of this technical domain.
Unfortunately, I currently have no evaluation data.

I will only get evaluation data once the trained model is used to translate new texts: a human will read the machine-translated texts and indicate whether each translation is correct. (This feedback could then serve as my validation data.)

So I have some questions:
1/ How can I evaluate the BLEU score in this context?
2/ It is hard to know whether I have trained my model enough (how should I adjust n_epochs?).
3/ What do you advise for my use case?
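
To make the questions concrete, here is a minimal sketch of one possible workaround. It assumes that a small part of the existing parallel corpus can be held out as provisional validation data until human feedback is available. The helper names (`split_parallel_corpus`, `best_epoch`), the 10% split and the per-epoch scores are all hypothetical, for illustration only. They are not part of M2M100 or of any library.

```python
import random


def split_parallel_corpus(pairs, valid_fraction=0.1, seed=13):
    """Shuffle (source, target) pairs and hold out a fraction for validation.

    Hypothetical helper: the 10% default is an assumption, not a recommendation.
    Returns (train_pairs, valid_pairs).
    """
    if not 0.0 < valid_fraction < 1.0:
        raise ValueError("valid_fraction must be strictly between 0 and 1")
    shuffled = list(pairs)
    # A fixed seed keeps the split reproducible between training runs.
    random.Random(seed).shuffle(shuffled)
    n_valid = max(1, int(len(shuffled) * valid_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]


def best_epoch(valid_scores, patience=2):
    """Pick the epoch (1-based) with the best held-out score.

    valid_scores holds one score per epoch, higher is better (for example
    BLEU measured on the held-out pairs). Scanning stops once `patience`
    consecutive epochs have failed to beat the best score seen so far.
    """
    best_idx, best = 0, float("-inf")
    for i, score in enumerate(valid_scores):
        if score > best:
            best_idx, best = i, score
        elif i - best_idx >= patience:
            break
    return best_idx + 1


# Toy data standing in for the real technical corpus.
pairs = [(f"source sentence {i}", f"target sentence {i}") for i in range(20)]
train_pairs, valid_pairs = split_parallel_corpus(pairs)

# Made-up held-out scores, one per epoch, only to show the stopping rule.
scores = [10.0, 14.0, 17.0, 16.5, 16.8, 18.0]
stop_at = best_epoch(scores, patience=2)
```

With a held-out set like `valid_pairs`, BLEU could be computed after each epoch with a library such as sacrebleu, and the epoch count chosen from where that score stops improving. The human judgments collected later could then replace or extend this provisional set.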
