I’m fine-tuning T5-large for text-to-SQL with a batch size of 2 and gradient accumulation steps set to 600. I’m training on a single RTX A6000.
Currently, it is showing ~1700/it. Is this normal? If not, how should I proceed?
I’m using the finetuning code from here and made changes to the data pre-processing steps only.
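To put the numbers above in context, here is a rough sketch of the arithmetic. It assumes two things the post does not state: that "~1700/it" means about 1700 seconds per iteration, and that one iteration on the progress bar is one optimizer step (that is, 600 accumulated micro-batches). The function names are hypothetical helpers, not part of the fine-tuning code.

```python
def effective_batch_size(per_device_batch, accumulation_steps, num_devices=1):
    """Number of samples that contribute to a single optimizer step."""
    return per_device_batch * accumulation_steps * num_devices


def seconds_per_micro_batch(seconds_per_iteration, accumulation_steps):
    """Average time for one forward/backward pass, assuming one
    iteration covers `accumulation_steps` micro-batches."""
    return seconds_per_iteration / accumulation_steps


# Values from the post; the "seconds" unit for 1700 is an assumption.
samples_per_step = effective_batch_size(per_device_batch=2, accumulation_steps=600)
micro_batch_time = seconds_per_micro_batch(seconds_per_iteration=1700, accumulation_steps=600)

print(f"samples per optimizer step: {samples_per_step}")
print(f"seconds per micro-batch:    {micro_batch_time:.2f}")
```

If those assumptions hold, the per-iteration figure is mostly a consequence of the very large accumulation count, and the per-micro-batch time is the more meaningful number to compare against other setups.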