How to slice an already loaded Dataset?

Using Datasets version ‘2.6.1’, how do I take a Dataset object, slice it (for example, keep only the first 100 examples), and get a Dataset back?

dd = load_dataset(...) returns a DatasetDict.
dd['train'] is a Dataset.

If I do dd['train'][:100], I get a dict, not a Dataset object anymore. Also, I can’t create a new Dataset with datasets.Dataset(dd['train'][:100]).

I have seen some examples in the forum of a method called take.

But Dataset.take does not seem to exist anymore.

I guess that was for an older version of Datasets. So how do I slice an already loaded Dataset?

Got it. The method is now select.

Yes, select is the correct answer.

Btw, Dataset.take was never part of the API, but we may add it eventually to be consistent with IterableDataset, which is what you get back in streaming mode.
