The great part is that T5 performs really well both with and without a task prefix. Here's what I observed in my experiments:
- With a task prefix, it converged slightly faster when the task was similar to one T5 had already seen, say summarization
- Without a prefix, it performed equally well but took slightly longer to converge
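
To make the two setups concrete, here is a minimal sketch of how the inputs differ. The helper name `add_task_prefix` and the sample text are my own for illustration, not part of any library; `summarize` is the prefix T5 conventionally uses for summarization. The resulting strings would then go to the tokenizer as usual.

```python
from typing import List, Optional


def add_task_prefix(texts: List[str], prefix: Optional[str] = None) -> List[str]:
    """Prepend a T5-style task prefix (e.g. "summarize") to each input text.

    Hypothetical helper: with prefix=None the texts are returned unchanged,
    which corresponds to the no-prefix setup.
    """
    if not prefix:
        return list(texts)
    return [f"{prefix}: {text}" for text in texts]


docs = ["The quick brown fox jumps over the lazy dog."]

with_prefix = add_task_prefix(docs, "summarize")  # input for the prefix run
without_prefix = add_task_prefix(docs)            # input for the no-prefix run
```

Everything else in the fine-tuning setup stays the same between the two runs, so any difference in convergence can be attributed to the prefix.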