OOM Error when trying to save t5-large using save_strategy="epoch"

prikmm · August 5, 2021, 8:21am

I am trying to finetune t5-large for summarization on xsum. I followed all the steps like:

max_input_length=512,
max_target_length=64,
batch_size=2,
AdaFactor instead of AdamW,
XLA_USE_BF16=true,
initialized all models and datasets in the global scope instead of _mp_fn.

I want to know what I am doing wrong. I am using Kaggle and 16gb of RAM should be sufficient for this according to T5 Finetuning tips.
Please guide me

@valhalla @sgugger @sshleifer

Topic		Replies	Views
T5 model for summarization far from SOTA results Models	0	1344	July 2, 2021
Gradient overflow when fine-tune t5 on CNN/DM dataset Beginners	5	1684	September 3, 2020
How many steps or epochs to finetune T5-small/base/large on XSum? 🤗Transformers	0	1402	August 7, 2021
Finetuning T5 for Summarisation - Poor results Intermediate	1	532	April 28, 2024
Finetuning T5 model on Xsum dataset Beginners	0	326	September 22, 2021

OOM Error when trying to save t5-large using save_strategy="epoch"

Related topics