Use load_dataset to load a sample of the dataset

Hello :wave:,

I would really like to load a sample of the dataset rather than the whole thing at first. Can I do this with the Hugging Face library? The full dataset is 23 GB, so I don't want to download all of it. I'd rather download just a sample and start working on it right away before moving on to the whole dataset.

Any ideas?

As far as I know, this is something that is actively being worked on.

@lhoestq might be able to provide more info :slight_smile:
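
For anyone who wants to try this in the meantime, here is a minimal sketch of one possible approach. It assumes a `datasets` release in which `load_dataset` accepts `streaming=True`, so examples are fetched lazily instead of downloading every file up front. The helper names `take_sample` and `load_sample` are made up for illustration and are not part of the library.

```python
from itertools import islice


def take_sample(examples, n):
    """Return the first n items of any iterable without consuming the rest."""
    return list(islice(examples, n))


def load_sample(dataset_name, n=1000, split="train"):
    """Stream a dataset from the Hub and keep only the first n examples.

    Assumption: the installed `datasets` version supports streaming=True.
    """
    # Imported here so take_sample works even without `datasets` installed.
    from datasets import load_dataset

    # streaming=True yields examples lazily rather than downloading all files.
    streamed = load_dataset(dataset_name, split=split, streaming=True)
    return take_sample(streamed, n)


if __name__ == "__main__":
    # Offline demonstration: a plain generator stands in for a streamed dataset.
    fake_stream = ({"id": i} for i in range(10**6))
    print(take_sample(fake_stream, 3))
```

Something like `sample = load_sample("your_dataset_name", n=500)` should then give a small list of examples to experiment with, where `"your_dataset_name"` is a placeholder for the actual dataset identifier. Note that the first `n` examples are not a random sample, so they may not be representative if the data is ordered.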


Thanks for the info. Such a time saver.

This changes everything. Can't wait, @lhoestq!