T5 Finetuning Tips

IMAYK · November 10, 2022, 6:20pm

I’m finetuning T5-large for text2sql with a batch size of 2 and gradient accumulation steps set to 600. I’m training on an RTX A6000.
The progress bar currently shows ~1700/it. Is this normal? If not, how should I proceed?
I’m using the finetuning code from here and changed only the data pre-processing steps.
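
For context, here is a rough sketch of the arithmetic behind those settings. It rests on two assumptions that the post does not confirm: that the ~1700 reading is in seconds per iteration, and that one iteration on the progress bar is one optimizer step (i.e. 600 accumulated micro-batches). The helper names are made up for illustration.

```python
def effective_batch_size(per_device_batch_size, grad_accum_steps, num_devices=1):
    """Number of examples that contribute to a single optimizer update."""
    return per_device_batch_size * grad_accum_steps * num_devices


def seconds_per_micro_batch(seconds_per_optimizer_step, grad_accum_steps):
    """Average time of one forward/backward pass, assuming the reported
    time covers a full optimizer step (all accumulated micro-batches)."""
    return seconds_per_optimizer_step / grad_accum_steps


# Values from the post; the unit (seconds) is an assumption.
eff = effective_batch_size(per_device_batch_size=2, grad_accum_steps=600)
per_micro = seconds_per_micro_batch(seconds_per_optimizer_step=1700.0,
                                    grad_accum_steps=600)

print(f"effective batch size: {eff}")
print(f"seconds per micro-batch of 2: {per_micro:.2f}")
```

If those assumptions hold, each optimizer step processes 1200 examples, so a long time per iteration mostly reflects the 600 accumulation steps rather than a single slow forward/backward pass. If the progress bar counts micro-batches instead, the reading would mean something quite different.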
