RoBERTa model for Sinhala language

Hi,
I am trying to build a RoBERTa model for the Sinhala language.

My final training dataset is as follows:

Number of words: 64,129,561

Number of sentences: 5,134,347

File size: 938,019 KB (roughly 916 MB)

I have already created the BERT tokenizer using my training dataset. (The tokenizer files are 1644 KB + 1299 KB in size.)
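For context, the tokenizer training looked roughly like this (a minimal sketch following the reference tutorial; the corpus path, output directory, and vocab_size are placeholders, not my exact values):

```python
import os
from tokenizers import ByteLevelBPETokenizer

# Train a byte-level BPE tokenizer on the raw Sinhala corpus.
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["sinhala_corpus.txt"],  # placeholder path to the training text
    vocab_size=52_000,             # the tutorial's default vocabulary size
    min_frequency=2,
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)

# save_model() writes two files (vocab.json and merges.txt),
# which is where the two file sizes above come from.
os.makedirs("sinhala_tokenizer", exist_ok=True)
tokenizer.save_model("sinhala_tokenizer")
```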

Now I'm trying to train the model using Google Colab.

Since the dataset is large, I divided it into 10 subsets (roughly as sketched below).
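The split itself was done with something like this (a sketch; file names are placeholders):

```python
# Split the corpus into 10 roughly equal line-based subsets, streaming
# so the whole ~1 GB file never has to sit in memory at once.
num_subsets = 10
total_lines = 5_134_347                    # sentence count from above
lines_per_file = total_lines // num_subsets + 1

with open("sinhala_corpus.txt", encoding="utf-8") as src:
    for i in range(num_subsets):
        with open(f"subset_{i}.txt", "w", encoding="utf-8") as out:
            for _ in range(lines_per_file):
                line = src.readline()
                if not line:
                    break
                out.write(line)
```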

But it seems the training time for one epoch is really long, around 270 hours (although the estimated remaining time decreases quickly; within 20 minutes it dropped from 290 h to 250 h).

Sometimes the program crashes after 10 or 20 minutes.

I created the model by referring to the following link. (I am using the exact same code, which in the reference executes in about 3 h for 1 epoch.)
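In case it helps, this is the shape of the training code I am running (a sketch of the tutorial's setup; the config sizes and training arguments are the tutorial's defaults, and subset_0.txt stands in for one of my 10 subsets):

```python
from transformers import (
    RobertaConfig,
    RobertaForMaskedLM,
    RobertaTokenizerFast,
    LineByLineTextDataset,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Load the trained tokenizer from the directory holding vocab.json + merges.txt.
tokenizer = RobertaTokenizerFast.from_pretrained("sinhala_tokenizer", max_len=512)

# Small RoBERTa configuration, same sizes as the tutorial.
config = RobertaConfig(
    vocab_size=52_000,
    max_position_embeddings=514,
    num_attention_heads=12,
    num_hidden_layers=6,
    type_vocab_size=1,
)
model = RobertaForMaskedLM(config=config)

# Note: this reads and tokenizes the entire file into RAM up front.
dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="subset_0.txt",  # one of the 10 subsets
    block_size=128,
)

# Masked-language-modeling collator: masks 15% of tokens per batch.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

training_args = TrainingArguments(
    output_dir="./sinhala-roberta",
    num_train_epochs=1,
    per_device_train_batch_size=64,
    save_steps=10_000,
    save_total_limit=2,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=dataset,
)
trainer.train()
```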

Does this happen because the tokenizer I created is too big?

Is there a better way to do this?

Reference code: