Uploading an image dataset to the Hugging Face Hub

Hi, I am trying to create an image dataset (training only) and upload it to the Hugging Face Hub. The data has two columns: 1) the image, and 2) the description text, i.e. the label. Essentially I'm trying to upload something similar to this.

I am using the ImageFolder approach and have my data folder structured as such:

metadata.jsonl
data/train/image_1.png
data/train/image_2.png
data/train/image_3.png
data/train/image_4.png
...

In the metadata.jsonl file I've added the labels for the images as mentioned here:

{"file_name": "image_1.png", "text": "some description about image 1"}
{"file_name": "image_2.png", "text": "some description about image 2"}
{"file_name": "image_3.png", "text": "some description about image 3"}
{"file_name": "image_4.png", "text": "some description about image 4"}
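
For reference, a metadata file in this shape can be generated and sanity-checked with a short script. This is only a minimal sketch using the standard library: the names `write_metadata`, `count_rows`, and the `describe` callback are hypothetical helpers for illustration, not part of my upload script or of the datasets library, and it assumes all images are .png files in a single folder.

```python
import json
from pathlib import Path


def write_metadata(image_dir, metadata_path, describe):
    """Write one JSON object per PNG in image_dir to metadata_path.

    `describe` is a hypothetical callback mapping a file name to its text.
    Returns the number of rows written.
    """
    image_dir = Path(image_dir)
    # Sort so the row order is stable between runs.
    names = sorted(p.name for p in image_dir.glob("*.png"))
    with open(metadata_path, "w", encoding="utf-8") as f:
        for name in names:
            row = {"file_name": name, "text": describe(name)}
            f.write(json.dumps(row) + "\n")
    return len(names)


def count_rows(metadata_path):
    """Count non-empty lines in a JSON Lines file, parsing each to validate it."""
    with open(metadata_path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip() and json.loads(line) is not None)
```

Comparing the value returned by `write_metadata` with `count_rows` on the same file gives a quick check that every local image has exactly one metadata row before anything is pushed.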

My script to upload is simple, and looks something like this:

from datasets import load_dataset
dataset = load_dataset("imagefolder", data_dir="data", split="train")
dataset.push_to_hub("ejcho623/undraw-raw")

When I run the script, strangely enough, it seems to push only 1 image. I see a data/train-xxxx.parquet file and a dataset_infos.json file generated in my repo, but judging by the size, it has clearly not uploaded the full dataset from my local directory (1000+ images). Here is the output of the command:

ejcho@ejs-macbook-pro undraw-raw % python3 push_images.py
Using custom data configuration default-e70837628f6a2c62
Found cached dataset imagefolder (/Users/ejcho/.cache/huggingface/datasets/imagefolder/default-e70837628f6a2c62/0.0.0/37fbb85cc714a338bea574ac6c7d0b5be5aff46c1862c1989b20e0771199e93f)
100%|██████████| 1/1 [00:00<00:00, 68.35ba/s]
Pushing dataset shards to the dataset hub: 100%|██████████| 1/1 [00:02<00:00, 2.27s/it]

Would anyone be able to help me figure out what's going on?

Thanks