Wav2vec2-xls-r-2b out of memory issues on A100 (40 GB)

I am trying to fine-tune the wav2vec2-xls-r-2b model on a Common Voice dataset, but I get a CUDA out-of-memory error. Lowering the batch size (to 2 or 4) gives the same error. I have 8 A100 GPUs, and restricting training to 3 or 4 of them gives the same error as well. I previously fine-tuned the xlsr-53, xls-r-300M and xls-r-1B models with batch size 64 on the same dataset without any out-of-memory issues. Here is the error:

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 39.59 GiB total capacity; 35.68 GiB already allocated; 6.19 MiB free; 37.51 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
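
For context, here is a rough back-of-the-envelope estimate of the memory that does not depend on batch size. It is only a sketch under my own assumptions, none of which are measured: round parameter counts taken from the model names, fp32 weights and gradients, and an Adam-style optimizer that keeps two fp32 states per parameter. The function name `static_training_memory_gib` is just a helper I made up for this post.

```python
GIB = 1024 ** 3  # bytes per GiB, the unit used in the CUDA error message


def static_training_memory_gib(num_params,
                               bytes_per_weight=4,      # assumption: fp32 weights
                               bytes_per_grad=4,        # assumption: fp32 gradients
                               bytes_optimizer_state=8  # assumption: two fp32 Adam moments
                               ):
    """Estimate memory for weights + gradients + optimizer state.

    Activations are excluded, so this is a lower bound that stays the
    same no matter how small the batch size is.
    """
    bytes_per_param = bytes_per_weight + bytes_per_grad + bytes_optimizer_state
    return num_params * bytes_per_param / GIB


# Assumed round parameter counts read off the model names, not measured.
assumed_sizes = [("xls-r-300m", 300e6), ("xls-r-1b", 1e9), ("xls-r-2b", 2e9)]

for name, num_params in assumed_sizes:
    estimate = static_training_memory_gib(num_params)
    print(f"{name}: ~{estimate:.1f} GiB before activations")
```

Under these assumptions the 2B model needs roughly 30 GiB per GPU before any activations are stored, which is in the same range as the 35.68 GiB reported as allocated above. If the GPUs are used in plain data-parallel mode, each one holds a full copy of the model, which might explain why neither a smaller batch size nor a different number of GPUs changes anything.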

How can I solve these memory issues? Any help would be really appreciated.