How can I prevent the Hugging Face Trainer from running out of memory?

I use the Hugging Face Trainer to train a BERT model for sequence classification on a CPU (no CUDA).
I run my code in Jupyter.

I also use a Hugging Face Dataset, and as far as I understand, by default it reads the CSV file via a memory-mapped file, so the dataset itself shouldn't be the problem.
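For context, "memory-mapped" means the OS pages the file's bytes in on demand instead of copying the whole file into the Python process's heap up front, which is why the dataset shouldn't grow resident memory much. A minimal stdlib `mmap` sketch of that idea (the file name and contents here are made up for illustration):

```python
import mmap
import os
import tempfile

# Write a tiny CSV-like file just so we have something to map (illustrative only).
path = os.path.join(tempfile.mkdtemp(), "train.csv")
with open(path, "w") as f:
    f.write("text,label\nhello world,0\ngoodbye,1\n")

with open(path, "rb") as f:
    # ACCESS_READ maps the file read-only; pages are loaded lazily by the OS,
    # so even a huge file would not be copied into RAM all at once.
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    header = mm.readline().rstrip().decode()
    mm.close()

print(header)  # -> text,label
```

This mirrors (in simplified form) what Arrow-backed datasets do under the hood, which is why memory growth during training usually points at the training loop itself rather than the data loading.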

During training, Jupyter shows steadily increasing memory usage until the kernel eventually dies.

How can I prevent the kernel from dying while training?