How to assign category names to a dataset created with ImageFolder?

I created an image dataset (for object identification) following the ImageFolder approach. The resulting dataset only contains the class ids; their names don't appear anywhere.

The documentation on load_dataset doesn’t explain well how I would add this (or any other extra metadata).

Hi! You can define labels in the load_dataset call as follows:

import datasets

class_names = [...]
features = datasets.Features(
    {
        "image": datasets.Image(),
        "objects": {
            "bbox": datasets.Sequence(datasets.Value("float32"), length=4), 
            "categories": datasets.ClassLabel(names=class_names)
        }
    }
)
ds = datasets.load_dataset(..., features=features)

Hi! Thanks for the input! I found a small mistake there: bbox and categories are actually arrays (one entry per object in the image), so the code should look like this:

features = datasets.Features(
    {
        "image": datasets.Image(),
        "objects": {
            "bbox": datasets.Sequence(datasets.Sequence(datasets.Value("float32"), length=4)),
            "categories": datasets.Sequence(datasets.ClassLabel(names=class_names))
        }
    }
)

Actually, this brings up another question. The code above creates features that look like this:

{'image_id': Value(dtype='int64', id=None),
 'image': Image(decode=True, id=None),
 'height': Value(dtype='int32', id=None),
 'width': Value(dtype='int32', id=None),
 'objects': {'bbox': Sequence(feature=Sequence(feature=Value(dtype='float32', id=None), length=4, id=None), length=-1, id=None),
  'categories': Sequence(feature=ClassLabel(names=['Thing0'], id=None), length=-1, id=None)}}

This reflects the metadata.jsonl structure (I also included image_id, height, and width).

Now, when looking at the popular cppe-5 dataset, the structure looks a bit different:

from datasets import load_dataset

cppe5 = load_dataset("cppe-5")
cppe5["train"].features

shows

{'image_id': Value(dtype='int64', id=None),
 'image': Image(decode=True, id=None),
 'width': Value(dtype='int32', id=None),
 'height': Value(dtype='int32', id=None),
 'objects': Sequence(feature={'id': Value(dtype='int64', id=None), 'area': Value(dtype='int64', id=None), 'bbox': Sequence(feature=Value(dtype='float32', id=None), length=4, id=None), 'category': ClassLabel(names=['Coverall', 'Face_Shield', 'Gloves', 'Goggles', 'Mask'], id=None)}, length=-1, id=None)}

So “objects” itself is a sequence, instead of bbox being a Sequence(Sequence(…)).

Is that how it should look? Does it make a difference? And how would I achieve the same structure as in the cppe-5 dataset?

Those two ways of defining features are actually equivalent!

Indeed, for legacy reasons inherited from TensorFlow Datasets, a Sequence(Value/ClassLabel/...) is a plain list, but a Sequence({...}) is a dictionary of lists.

So these two are equivalent:

'objects': {
  'bbox': Sequence(Sequence(Value('float32'), length=4)),
  'categories': Sequence(ClassLabel(names=['Thing0']))
}

and

'objects': Sequence({
  'bbox': Sequence(Value('float32'), length=4),
  'category': ClassLabel(names=['Thing0'])
})

Please use the first way though, since the second one is confusing IMO.
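
To make the "dictionary of lists" point concrete, here is a plain-Python sketch (no datasets calls) of how per-object records line up with the columnar layout. The helper name to_dict_of_lists and the sample boxes are hypothetical, for illustration only:

```python
def to_dict_of_lists(records):
    """Turn a list of per-object dicts into a dict of per-field lists."""
    keys = records[0].keys() if records else []
    return {key: [record[key] for record in records] for key in keys}


# Two made-up objects in one image, written one dict per object.
objects_as_records = [
    {"bbox": [10.0, 20.0, 30.0, 40.0], "category": 0},
    {"bbox": [50.0, 60.0, 70.0, 80.0], "category": 0},
]

# Same information, written one list per field.
objects_as_columns = to_dict_of_lists(objects_as_records)
print(objects_as_columns)
```

The per-field form is what the first definition spells out explicitly, which is probably why it is the less surprising of the two.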
