RAM issues during training preprocessing


I'm trying to fine-tune the Wav2Vec2 model on my own dataset (100K audio clips) merged with the French Common Voice dataset (89 GB in total), but preprocessing is killed at 44%. If I remove the longest clips (keeping only those <= 3 s) it works, but what I don't understand is this: I have 128 GB of RAM, shouldn't that be enough?

Can I load and preprocess the data on the fly as batches are built, instead of preprocessing the whole dataset before launching the training phase?
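To make the question concrete, here is a minimal sketch of the lazy pattern I have in mind, in plain Python (the `preprocess` function and file names are placeholders, not my actual pipeline): each example is loaded and preprocessed only when its batch is assembled, so only one batch's worth of audio lives in RAM at a time.

```python
def preprocess(path):
    # Placeholder: in reality this would load the audio file at `path`
    # and return its feature array; here it just returns a dummy value.
    return [0.0]

def lazy_examples(paths):
    # Yield one preprocessed example at a time instead of
    # materializing the whole dataset up front.
    for path in paths:
        yield {"path": path, "input_values": preprocess(path)}

def batches(examples, batch_size):
    # Group a lazy stream of examples into fixed-size batches;
    # the final batch may be smaller.
    batch = []
    for ex in examples:
        batch.append(ex)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

paths = [f"clip_{i}.wav" for i in range(5)]
for batch in batches(lazy_examples(paths), batch_size=2):
    print(len(batch))
```

Is there a supported way to get this behavior from the standard training setup, rather than rolling it by hand like this?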

Thanks in advance for your help.