Finetuning T5 on a translation task

I am finetuning T5 on a translation task. I am using Flax.
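For context, my setup looks roughly like this (a minimal sketch; the checkpoint name, the sequence length, and the example sentence pair are placeholders, not my actual data):

```python
from transformers import AutoTokenizer, FlaxT5ForConditionalGeneration

# Placeholder checkpoint; I am finetuning a standard T5 checkpoint with Flax
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = FlaxT5ForConditionalGeneration.from_pretrained("t5-small")

# One made-up source/target pair; the two dialects are fairly similar
source = "translate dialect A to dialect B: a sentence written in dialect A"
target = "a sentence written in dialect B"

inputs = tokenizer(source, return_tensors="np", padding="max_length",
                   max_length=64, truncation=True)
labels = tokenizer(target, return_tensors="np", padding="max_length",
                   max_length=64, truncation=True).input_ids
```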

The translation task is between two fairly similar dialects.

Watching how the quality of the translation progresses, I see that it starts by predicting “” (an essentially empty output). The loss is then calculated between the prediction and the target, and after several iterations the output slowly becomes something fairly similar to the source, and then keeps improving, finally ending up with a translation. Even though I end up with something qualitatively decent (and a BLEU score above 80), the process still seems slow and “unstable”.
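For reference, the per-step loss I am computing is, as far as I understand it, the usual teacher-forced cross-entropy between the decoder's predictions and the target tokens. A minimal sketch, assuming the `model`, `tokenizer`, `inputs` and `labels` from above (dropout and the optimizer step omitted for brevity):

```python
import numpy as np
import optax


def shift_tokens_right(label_ids, pad_token_id, decoder_start_token_id):
    # Teacher forcing: the decoder is fed the target shifted one position right
    decoder_input_ids = np.zeros_like(label_ids)
    decoder_input_ids[:, 1:] = label_ids[:, :-1]
    decoder_input_ids[:, 0] = decoder_start_token_id
    return decoder_input_ids


def loss_fn(params):
    decoder_input_ids = shift_tokens_right(
        labels, model.config.pad_token_id, model.config.decoder_start_token_id
    )
    logits = model(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        decoder_input_ids=decoder_input_ids,
        params=params,
    ).logits

    # Cross-entropy between the predicted token distributions and the target
    # tokens, averaged over the non-padding positions
    token_loss = optax.softmax_cross_entropy_with_integer_labels(logits, labels)
    padding_mask = labels != model.config.pad_token_id
    return (token_loss * padding_mask).sum() / padding_mask.sum()


loss = loss_fn(model.params)
```

In the actual training loop this goes through `jax.value_and_grad` and an optax optimizer, but the loss itself is just this prediction-vs-target cross-entropy.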

Since this is a seq2seq model, I guess it already has a fairly good way of doing source -> encode/decode -> source, and is then effectively calculating the loss between the [encoded-decoded] source and the target, instead of calculating the loss directly between the source and the target.
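To make that guess a bit more concrete: if I just generate from the model early in finetuning, the output looks much more like the source than like the target, which is what makes me think the pretrained encode/decode path is essentially copying the source at first. A small check along these lines (again assuming the names from the sketch above):

```python
# Greedy generation from the current model; early in training the decoded
# string resembles the source sentence rather than the target
generated = model.generate(inputs["input_ids"], max_length=64)
print(tokenizer.decode(generated.sequences[0], skip_special_tokens=True))
```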

I am fairly new to these models. Does this make sense?