Actually, this brings up another question. The code above creates features that look like this:
{'image_id': Value(dtype='int64', id=None),
'image': Image(decode=True, id=None),
'height': Value(dtype='int32', id=None),
'width': Value(dtype='int32', id=None),
'objects': {'bbox': Sequence(feature=Sequence(feature=Value(dtype='float32', id=None), length=4, id=None), length=-1, id=None),
'categories': Sequence(feature=ClassLabel(names=['Thing0'], id=None), length=-1, id=None)}}
This reflects the metadata.jsonl structure (I also included image_id, height, and width).
Now, when looking at the popular cppe-5 dataset, the structure looks a bit different:
from datasets import load_dataset
cppe5 = load_dataset("cppe-5")
cppe5["train"].features
shows
{'image_id': Value(dtype='int64', id=None),
'image': Image(decode=True, id=None),
'width': Value(dtype='int32', id=None),
'height': Value(dtype='int32', id=None),
'objects': Sequence(feature={'id': Value(dtype='int64', id=None), 'area': Value(dtype='int64', id=None), 'bbox': Sequence(feature=Value(dtype='float32', id=None), length=4, id=None), 'category': ClassLabel(names=['Coverall', 'Face_Shield', 'Gloves', 'Goggles', 'Mask'], id=None)}, length=-1, id=None)}
So “objects” itself is a Sequence, instead of bbox being a Sequence(Sequence(…)).
Is that how it should look? Does it make a difference? And how would I achieve the same structure as in the cppe-5 dataset?
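To make the difference between the two layouts concrete, here is a minimal plain-Python sketch (no datasets import). It transposes a list of per-object records into a dict of per-field lists. As far as I understand from the datasets documentation, a Sequence wrapping a dict of features is stored as a dict of lists, so the two schemas may end up closer in practice than their printed features suggest. The helper name records_to_columns and the annotation values are hypothetical, chosen only for illustration.

```python
# Plain-Python illustration only; records_to_columns and the sample values
# are hypothetical and not part of the datasets library.

def records_to_columns(records):
    """Transpose a list of per-object dicts into a dict of per-field lists."""
    if not records:
        return {}
    keys = list(records[0].keys())
    return {key: [record[key] for record in records] for key in keys}


# One image's annotations written per object (made-up values).
objects_as_records = [
    {"id": 0, "area": 1200, "bbox": [10.0, 20.0, 30.0, 40.0], "category": 4},
    {"id": 1, "area": 500, "bbox": [5.0, 5.0, 25.0, 20.0], "category": 2},
]

# The same annotations grouped per field: one list per key,
# with bbox becoming a list of 4-element lists.
objects_as_columns = records_to_columns(objects_as_records)
print(objects_as_columns)
```

In the per-field form, bbox is a list of 4-element lists, which matches the Sequence(Sequence(…)) shape in my features above; whether a loaded example from cppe-5 exposes exactly this shape is something I have not verified.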