Create dataset with metadata containing the strings of the categories of bboxes

Hi!

I am trying to create a dataset based on an ImageFolder and a metadata.jsonl file. From here I know that the metadata must look like this:

{"file_name": "0001.png", "objects": {"bbox": [[302.0, 109.0, 73.0, 52.0]], "categories": [0]}}

When using ImageFolder for classification, the label is usually inferred automatically from the image's source folder and converted to a number. The label strings and the string-to-number mapping are also stored in the dataset. How does this work with bboxes? Where would I put the "categories" as human-readable strings so that they are automatically included in the dataset when I load it?