CUDA out of memory

Hi

I'm fine-tuning xlm-roberta-large following this tutorial, and during training on Colab I hit a CUDA out of memory error:

RuntimeError: CUDA out of memory. Tried to allocate 978.00 MiB (GPU 0; 14.76 GiB total capacity; 12.62 GiB already allocated; 919.75 MiB free; 12.83 GiB reserved in total by PyTorch

And this is with batch_size = 1.

I tried xlm-roberta-base instead; training lasts longer but eventually ends with the same error. The bert-base-uncased model from the tutorial works fine, but my data is multilingual!

I want to understand whether this is simply a hard limit of Colab or something I'm doing wrong. Is it possible to fine-tune xlm-roberta-large in Colab?

Thanks!

Hi @Constantin, it’s possible that you’ve been allocated one of the K80 GPUs on Colab, which probably doesn’t have enough memory to handle xlm-roberta-large.

You can “cheat” your way to a better GPU (either a Tesla T4 or P100) by selecting Runtime > Factory reset runtime in the settings.

You can check what kind of GPU your notebook is running by executing the following in a code cell:

!nvidia-smi
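
Or, if you prefer to check from Python, here's a minimal sketch assuming torch is already installed in the Colab runtime:

import torch

# Report which GPU was allocated and how much memory it has
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, total memory: {props.total_memory / 1024**3:.1f} GiB")
else:
    print("No GPU allocated to this runtime")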

You could also try Kaggle. It’s very similar and gives you a P100 (I think they give you more memory as well).

Tried using Kaggle’s P100, but I am getting the same error, i.e.

OutOfMemoryError: CUDA out of memory. Tried to allocate 384.00 MiB (GPU 0; 15.90 GiB total capacity; 14.70 GiB already allocated; 245.75 MiB free; 14.78 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
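
For reference, here is a sketch of how the fragmentation workaround mentioned in the error could be applied. Note that in this trace reserved (14.78 GiB) and allocated (14.70 GiB) memory are nearly equal, so fragmentation may not be the real problem; the split size of 128 MiB below is just an arbitrary example, not a value from this thread. The environment variable has to be set before CUDA is initialized:

import os

# Must be set before CUDA is initialized (ideally before importing torch)
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # the allocator picks up the config on first CUDA use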

Any advice on using another model for the sequence classification task rather than XLM?