Training large models on large datasets in parts


It is not possible to train a large xlm-roberta model on a large dataset for token classification, because the GPU runs out of memory.

But what if I split my dataset into parts and train on them one after another? Would that work with the training script? My worry is that the result will be worse than training on the whole dataset at once.
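To make the idea concrete, here is a toy sketch of what I mean by "training in parts" (a plain numpy logistic regression stands in for xlm-roberta, and all names here are made up for illustration): the model weights carry over between shards, so only one shard needs to be in memory at a time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset standing in for a large corpus: 1000 samples, 20 features.
X = rng.normal(size=(1000, 20))
true_w = rng.normal(size=20)
y = (X @ true_w > 0).astype(float)

w = np.zeros(20)  # model state, reused across shards
lr = 0.1

def train_on_shard(w, X_shard, y_shard, epochs=5):
    """A few logistic-regression gradient steps on a single shard."""
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X_shard @ w)))   # predicted probabilities
        grad = X_shard.T @ (p - y_shard) / len(y_shard)
        w = w - lr * grad
    return w

# Split the dataset into 10 shards and train on them sequentially,
# so only one shard has to fit in memory at any moment.
for X_shard, y_shard in zip(np.array_split(X, 10), np.array_split(y, 10)):
    w = train_on_shard(w, X_shard, y_shard)

acc = ((X @ w > 0) == (y > 0.5)).mean()
print(f"accuracy after sequential shard training: {acc:.2f}")
```

With a real Trainer setup the pattern would be the same: load one shard, continue training from the current checkpoint, save, and move to the next shard.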