I’m using Dataset.from_generator to build large datasets.
Assuming the builder writes incrementally to disk as the dataset is constructed, is there a way to automatically resume in case of an error that disrupts construction?
Hi! It’s not currently possible to resume Dataset.from_generator after a failure.
As a workaround, you can split the work across multiple Dataset objects, so that if one crashes, the others can continue.
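To make the workaround concrete, here is a rough sketch of how it could look, not an official recipe. It assumes your source data can be split into independent shards. The names `build_shards`, `build_one_with_datasets`, the `.done` marker files and the toy generator are all made up for illustration; only `Dataset.from_generator` and `save_to_disk` come from the library.

```python
import os
import shutil
from typing import Callable, Iterable, List


def build_shards(
    shard_ids: Iterable[int],
    out_dir: str,
    build_one: Callable[[int, str], None],
) -> List[str]:
    """Build one dataset directory per shard, skipping shards finished earlier.

    A shard counts as finished only if its marker file exists, so a shard
    that crashed halfway is deleted and rebuilt on the next run.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for shard_id in shard_ids:
        path = os.path.join(out_dir, f"shard_{shard_id}")
        marker = path + ".done"  # hypothetical convention, not a library feature
        if not os.path.exists(marker):
            # Remove leftovers from a previous failed attempt, then rebuild.
            shutil.rmtree(path, ignore_errors=True)
            build_one(shard_id, path)
            with open(marker, "w"):
                pass  # an empty file is enough to record success
        paths.append(path)
    return paths


def build_one_with_datasets(shard_id: int, path: str) -> None:
    """Example builder: one Dataset per shard, saved to its own directory."""
    from datasets import Dataset  # imported here so the helper above stays generic

    def gen(shard_id):
        # Placeholder generator: replace with the real logic for this shard.
        for i in range(3):
            yield {"shard": shard_id, "value": i}

    ds = Dataset.from_generator(gen, gen_kwargs={"shard_id": shard_id})
    ds.save_to_disk(path)
```

You would call `build_shards(range(num_shards), "my_dataset_shards", build_one_with_datasets)` and simply rerun the same call after a crash: finished shards are skipped and only the failed and remaining ones are built. Once everything is done, the shards can be loaded with `load_from_disk` and merged with `concatenate_datasets`.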