Creating a tfds.load-compatible dataset

Hi everyone,

What I’m actually trying to do is quite simple but seems poorly documented.
All I want is to create an image dataset that I will be able to import through tfds, using tfds.load().

I first tried the simplest scenario, which is importing the existing cats_vs_dogs dataset
and loading it with:

import tensorflow_datasets as tfds

train_ds, validation_ds, test_ds = tfds.load(
    "huggingface:microsoft__cats_vs_dogs",
    split=["train[:40%]", "train[40%:50%]", "train[50%:60%]"],
    as_supervised=True,  # Include labels
)

And it worked, as expected.

I then tried to copy this dataset to my personal space with a simple piece of code:

from datasets import load_dataset

dataset = load_dataset("microsoft/cats_vs_dogs")
dataset.push_to_hub("my_username/cats_vs_dogs")

This resulted in a Parquet dataset, as I expected.

To my surprise, this didn’t work:

train_ds, validation_ds, test_ds = tfds.load(
    "huggingface:my_username__cats_vs_dogs",
    split=["train[:40%]", "train[40%:50%]", "train[50%:60%]"],
    as_supervised=True,  # Include labels
)

I end up with an error such as: dataset doesn’t support supervised learning
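
From what I understand of the tfds documentation, as_supervised=True returns (input, label) tuples based on the dataset's supervised_keys, while the default returns a dictionary of features. A fallback I'm considering is to load without as_supervised and build the tuples myself. Here is a minimal sketch, assuming the features are named "image" and "labels" (I have not confirmed this for my copy); to_supervised is a hypothetical helper of mine, not part of tfds:

```python
from typing import Any, Mapping, Tuple


def to_supervised(
    example: Mapping[str, Any],
    input_key: str = "image",   # assumed feature name, may differ
    label_key: str = "labels",  # assumed feature name, may differ
) -> Tuple[Any, Any]:
    """Turn a feature dictionary into an (input, label) pair."""
    return example[input_key], example[label_key]


# Plain-Python stand-in for one dataset example, for illustration only.
sample = {"image": [[0, 0], [255, 255]], "labels": 1}
image, label = to_supervised(sample)
print(label)
```

If the dataset loads without as_supervised=True, I would expect to be able to pass the same function to ds.map(to_supervised), but that would only be a workaround and not an explanation of why my copy behaves differently.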

I also tried to create a dataset from ImageFolder, but I got a different error from Pillow, roughly “Impossible to parse ByteIO image”, when importing the dataset with tfds.

What am I missing here?
Any help is appreciated.