I want to train a model using over 40G of text data, about 40 million lines. Both load and map took longer when using map. Preprocessing_num_workers is set to 60.
I want to train a model using over 40G of text data, about 40 million lines. Both load and map took longer when using map. Preprocessing_num_workers is set to 60.