Hi everyone,
What I’m actually trying to do is quite simple but seems poorly documented.
All I want is to create an image dataset that I will be able to import through tfds, using tfds.load().
I first tried the simplest scenario, which is loading the existing cats_vs_dogs dataset with:
import tensorflow_datasets as tfds

train_ds, validation_ds, test_ds = tfds.load(
    "huggingface:microsoft__cats_vs_dogs",
    split=["train[:40%]", "train[40%:50%]", "train[50%:60%]"],
    as_supervised=True,  # Include labels
)
And it worked, as expected.
I then tried to copy this dataset to my personal space with a simple piece of code:
from datasets import load_dataset
dataset = load_dataset("microsoft/cats_vs_dogs")
dataset.push_to_hub("my_username/cats_vs_dogs")
This resulted in a Parquet dataset, as I expected.
To my surprise, this didn’t work:
train_ds, validation_ds, test_ds = tfds.load(
    "huggingface:my_username__cats_vs_dogs",
    split=["train[:40%]", "train[40%:50%]", "train[50%:60%]"],
    as_supervised=True,  # Include labels
)
I end up with an error along the lines of: dataset doesn’t support supervised learning …
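For context, as_supervised=True asks tfds for (input, label) tuples built from the dataset's supervised_keys, so one thing I could try is loading plain feature dicts and building the tuples myself. Here is a minimal sketch of that idea, not something I have verified against my dataset: the helper names to_supervised and load_supervised are made up for illustration, and the feature names "image" and "labels" are an assumption about how the copied dataset is laid out.

```python
def to_supervised(example, image_key="image", label_key="labels"):
    """Turn one feature dict into an (input, label) tuple.

    The default key names are an assumption about the dataset's features;
    adjust them to whatever the dataset actually exposes.
    """
    return example[image_key], example[label_key]


def load_supervised(name, split):
    """Hypothetical helper: load without as_supervised, then map to tuples."""
    # Imported lazily so to_supervised can be used without TFDS installed.
    import tensorflow_datasets as tfds

    # With as_supervised=False each element is a dict of features.
    # Passing a list of splits returns a list of datasets.
    datasets = tfds.load(name, split=split, as_supervised=False)
    return [ds.map(to_supervised) for ds in datasets]
```

It would be called with the same name and split list as above, for example load_supervised("huggingface:my_username__cats_vs_dogs", ["train[:40%]", "train[40%:50%]", "train[50%:60%]"]).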
I also tried to create a dataset from ImageFolder, but then I got a different error, this time from Pillow, roughly “Impossible to parse ByteIO image”, when loading the dataset with tfds.
What am I missing here?
Any help is appreciated.