Huggingface datasets - reset/restart state

How can I “restart” a dataset, or reset its enumerator state, so I can reuse it?

Datasets have no iteration state: you can start iterating over a dataset, stop partway, and iterate again, and the new loop will start from the beginning.


Update from 2025 for streaming datasets (IterableDataset): you can checkpoint and resume the iteration with state_dict() and load_state_dict(); see the docs here
