I am trying to fine-tune t5-large for summarization on XSum. I followed all the recommended steps:
- max_input_length=512,
- max_target_length=64,
- batch_size=2,
- Adafactor instead of AdamW,
- XLA_USE_BF16=1,
- initialized the model, tokenizer, and dataset in the global scope instead of inside `_mp_fn` (roughly as in the sketch below).
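
For context, here is a minimal sketch of my setup, assuming the standard torch_xla multiprocessing pattern. The Adafactor hyperparameters (lr=1e-3, relative_step=False, etc.) and the preprocessing details follow the T5 Finetuning tips thread rather than being copied verbatim from my notebook:

```python
import os

# Must be set before torch_xla is imported so the flag takes effect.
os.environ["XLA_USE_BF16"] = "1"

import torch
import torch_xla.core.xla_model as xm
import torch_xla.distributed.parallel_loader as pl
import torch_xla.distributed.xla_multiprocessing as xmp
from datasets import load_dataset
from transformers import T5ForConditionalGeneration, T5TokenizerFast
from transformers.optimization import Adafactor

MAX_INPUT_LENGTH = 512
MAX_TARGET_LENGTH = 64
BATCH_SIZE = 2

# Model, tokenizer, and dataset live in the global scope so that
# xmp.spawn's forked worker processes share them (copy-on-write)
# instead of each core building its own copy inside _mp_fn.
tokenizer = T5TokenizerFast.from_pretrained("t5-large")
model = T5ForConditionalGeneration.from_pretrained("t5-large")
raw_dataset = load_dataset("xsum", split="train")

def preprocess(example):
    # T5 expects a task prefix; pad to fixed lengths so XLA sees
    # static shapes and does not recompile every step.
    inputs = tokenizer(
        "summarize: " + example["document"],
        max_length=MAX_INPUT_LENGTH,
        truncation=True,
        padding="max_length",
    )
    targets = tokenizer(
        example["summary"],
        max_length=MAX_TARGET_LENGTH,
        truncation=True,
        padding="max_length",
    )
    inputs["labels"] = targets["input_ids"]
    return inputs

dataset = raw_dataset.map(preprocess, remove_columns=raw_dataset.column_names)
dataset.set_format(type="torch")

def _mp_fn(index):
    device = xm.xla_device()
    model.to(device)
    # Adafactor settings as recommended in the T5 Finetuning tips thread.
    optimizer = Adafactor(
        model.parameters(),
        lr=1e-3,
        scale_parameter=False,
        relative_step=False,
        warmup_init=False,
    )
    # Shard the data across the 8 TPU cores.
    sampler = torch.utils.data.distributed.DistributedSampler(
        dataset,
        num_replicas=xm.xrt_world_size(),
        rank=xm.get_ordinal(),
        shuffle=True,
    )
    loader = torch.utils.data.DataLoader(
        dataset, batch_size=BATCH_SIZE, sampler=sampler, drop_last=True
    )
    device_loader = pl.MpDeviceLoader(loader, device)

    model.train()
    for batch in device_loader:
        optimizer.zero_grad()
        outputs = model(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
            labels=batch["labels"],
        )
        outputs.loss.backward()
        xm.optimizer_step(optimizer)  # all-reduce grads, then step

if __name__ == "__main__":
    xmp.spawn(_mp_fn, nprocs=8)
```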
I want to know what I am doing wrong. I am running this on Kaggle, and according to the T5 Finetuning tips thread, 16 GB of RAM should be sufficient for this setup.
Please guide me.