Dataset preview not working: "The split does not contain any rows."

Hi.

I created a dataset repository for one version of the CUB multimodal dataset, but the data preview doesn’t work. The dataset is downloaded and loaded correctly.

Any help would be great!

It seems that streaming mode is not working for your dataset, i.e. ds = load_dataset(dataset, split='train') works, but ds = load_dataset(dataset, split='train', streaming=True) does not.

The dataset viewer uses streaming mode to get the first 100 rows (more on this in the docs: Preview a dataset).
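
To illustrate what "streaming the first 100 rows" means, here is a minimal sketch. It is not the viewer's actual code, which isn't shown in this thread. The names first_rows, preview and fake_split are made up for the example; only load_dataset(..., streaming=True) and itertools.islice are real APIs. The local generator stands in for a streamed split so the snippet runs without network access.

```python
from itertools import islice


def first_rows(rows, n=100):
    """Return up to n rows from any iterable without consuming the rest."""
    return list(islice(rows, n))


def preview(dataset_name, split="train", n=100):
    """Hypothetical helper: fetch the first n rows of a Hub dataset lazily."""
    # Imported here so first_rows() works even without `datasets` installed.
    from datasets import load_dataset

    # With streaming=True the split is iterable and rows are fetched lazily
    # while iterating, so the loading script must support streaming.
    ds = load_dataset(dataset_name, split=split, streaming=True)
    return first_rows(ds, n)


def fake_split():
    """Stand-in for a streamed split: yields rows one at a time."""
    for i in range(1000):
        yield {"id": i}


rows = first_rows(fake_split())
print(len(rows), rows[0], rows[-1])
```

If preview("your/dataset") raises an error or returns an empty list while the non-streaming load works, the loading script is probably the part that needs to be adapted for streaming.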

Maybe @albertvillanova or @lhoestq could give more hints on how to make the dataset streamable?


As @severo points out, the current implementation does not support streaming mode.

We have opened an issue in the Community tab of the corresponding dataset repository. Please follow the thread there: alkzar90/CC6204-Hackaton-Cub-Dataset · Issue with dataset viewer


After the PR, streaming mode now works.

from datasets import load_dataset

ds = load_dataset("alkzar90/CC6204-Hackaton-Cub-Dataset",
                  streaming=True)

Thanks for the support. Every time, I learn a bit more about using the datasets library with a custom script!
