Running mT5 on multiple GPUs

Hi everyone,
I am trying to fine-tune Google's mT5-XXL model on the MTNT French-English data. I am using two 32 GB GPUs and the accelerate package for fine-tuning. However, I am getting a "CUDA out of memory" error on both GPUs. It appears that the script first fills GPU 0, then tries running on GPU 1, and therefore raises the memory error on both GPUs.
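For context, here is a back-of-the-envelope estimate I did (assuming roughly 13 billion parameters for mT5-XXL and plain fp32 Adam training; these numbers are approximations, not measured values), which suggests the model may simply not fit on my hardware:

```python
# Rough memory estimate for full fp32 Adam fine-tuning of mT5-XXL.
# ~13e9 parameters is an approximation for mT5-XXL; adjust if needed.
PARAMS = 13e9

# Per parameter: 4 B weights + 4 B gradients + 8 B Adam moment buffers.
bytes_per_param = 4 + 4 + 8
total_gb = PARAMS * bytes_per_param / 1e9

print(f"Estimated training memory: ~{total_gb:.0f} GB")
# Two 32 GB GPUs give 64 GB in total, well below this estimate,
# so an out-of-memory error would be expected without
# sharding, offloading, or mixed-precision tricks.
```

If this estimate is in the right ballpark, do I need something like DeepSpeed/ZeRO sharding or CPU offloading rather than plain data-parallel accelerate?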
Thank you.