Loading just part of a dataset

Some datasets are huge, which makes it impractical to load all of them from the HF Hub with load_dataset() when you are just debugging code. In that case you only need to load part of the dataset, say the first 10k rows. But how?

I know it is possible to load part of a dataset into memory with “slice splitting”, but it appears that this first downloads the whole dataset if it is not already cached.
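For reference, this is roughly what I mean by slice splitting. It is only a sketch: `head_split` and `load_head_sliced` are helper names I made up for illustration, and the dataset name is whatever you pass in. As far as I can tell, the slice only limits what ends up loaded, not what gets downloaded.

```python
def head_split(split: str, n_rows: int) -> str:
    """Build a split string such as 'train[:10000]' (hypothetical helper)."""
    return f"{split}[:{n_rows}]"


def load_head_sliced(name: str, split: str = "train", n_rows: int = 10_000):
    """Load the first n_rows of a split using the slice syntax (hypothetical helper)."""
    # Imported lazily so head_split() works without the library installed.
    from datasets import load_dataset  # pip install datasets

    # Without streaming, the split appears to be downloaded and prepared
    # in full when it is not cached; the slice is applied afterwards.
    return load_dataset(name, split=head_split(split, n_rows))
```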

You can stream the dataset by passing streaming=True to load_dataset(). Nothing is downloaded up front; rows are fetched as you iterate, so you can start using the dataset right away.
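Something like the following should work as a minimal sketch. The helper names `first_rows` and `load_head_streamed` are made up for illustration, and the dataset name and split are whatever you pass in; it assumes the dataset supports streaming.

```python
from itertools import islice


def first_rows(rows, n_rows: int) -> list:
    """Collect the first n_rows items from any iterable (hypothetical helper)."""
    return list(islice(rows, n_rows))


def load_head_streamed(name: str, split: str = "train", n_rows: int = 10_000) -> list:
    """Stream a split and keep only its first n_rows examples (hypothetical helper)."""
    # Imported lazily so first_rows() works without the library installed.
    from datasets import load_dataset  # pip install datasets

    # streaming=True returns an iterable dataset; examples are fetched
    # lazily while iterating instead of downloading the whole split first.
    stream = load_dataset(name, split=split, streaming=True)
    return first_rows(stream, n_rows)
```

Each returned item is a plain dict for one example. The streamed dataset also has a take(n) method if you prefer to stay lazy instead of building a list.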

Thank you very much, that’s it. Great!