Train large models on large datasets in parts

Hi

It is not possible to train the large XLM-RoBERTa model on a large dataset for token classification with run_ner.py because of GPU memory limits.

But what if I split my dataset into parts and train on them one after another? Would that work with the run_ner.py script? I suspect the result would be worse, though. Something like the sketch below is what I have in mind.
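
For example, a minimal sketch of what I mean, assuming the datasets-based run_ner.py from the transformers examples (argument names may differ in older versions) and placeholder file names like train.json / dev.json:

```python
import subprocess
from pathlib import Path

# Split the full training file (JSON lines, one example per line, e.g. with
# "tokens" and "ner_tags" fields) into several parts. File names and the
# number of parts here are just placeholders.
full_train = Path("train.json").read_text().splitlines()
n_parts = 4
chunk_size = (len(full_train) + n_parts - 1) // n_parts

for i in range(n_parts):
    part = full_train[i * chunk_size:(i + 1) * chunk_size]
    Path(f"train_part{i}.json").write_text("\n".join(part))

# Train on each part in turn, starting every run from the previous run's
# output directory so the model keeps the weights it has already learned.
model = "xlm-roberta-large"
for i in range(n_parts):
    output_dir = f"out_part{i}"
    subprocess.run(
        [
            "python", "run_ner.py",
            "--model_name_or_path", model,
            "--train_file", f"train_part{i}.json",
            "--validation_file", "dev.json",
            "--output_dir", output_dir,
            "--do_train",
            "--do_eval",
        ],
        check=True,
    )
    model = output_dir  # the next part continues from this checkpoint
```

As far as I understand, passing the previous output_dir as --model_name_or_path is what makes the runs cumulative; otherwise each part would start again from the pretrained checkpoint.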