For multi-task learning, T5 uses temperature-scaled mixing. Does this use 100% of the examples from every task, and will some examples end up duplicated?
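For reference, a minimal sketch of how temperature-scaled mixing rates can be computed, following the description in the T5 paper: each task's sampling probability is proportional to `min(e_m, K)**(1/T)`, where `e_m` is the task's example count, `K` is a size cap, and `T` is the temperature. The function name and the example counts below are illustrative, not from T5's actual codebase.

```python
def mixing_rates(example_counts, temperature=2.0, cap=2**21):
    """Temperature-scaled mixing rates (sketch of the T5 scheme).

    Each task m gets probability proportional to min(e_m, cap)**(1/T).
    """
    scaled = [min(n, cap) ** (1.0 / temperature) for n in example_counts]
    total = sum(scaled)
    return [s / total for s in scaled]

# Hypothetical task sizes: 1M, 10k, and 500 examples.
rates = mixing_rates([1_000_000, 10_000, 500], temperature=2.0)
```

Because batches are sampled from these probabilities rather than by iterating each dataset once, small tasks are revisited (so their examples repeat) while very large tasks may not be fully covered in a given number of steps.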