HF Dataset to COCO format dataset


Before I roll my own, figured I’d ask… maybe I just didn’t find it…

Let’s say I have an object detection dataset on the HF Hub that follows the DatasetDict format, like the fashionpedia dataset.

Are there any dataset functions that will convert its entries to the COCO format?

I saw the discussion (topic: 34894) about YOLO → DETR/COCO, but it would be nice to keep the data in HF format and then transform the entries to YOLO, COCO, or “other” as needed.


No, we don’t have a function for converting to the COCO format in the API.
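
Rolling your own is not much code, though. Here is a minimal sketch in plain Python, not an official API. It assumes rows shaped roughly like fashionpedia's (`image_id`, `width`, `height`, and an `objects` dict with parallel `category` and `bbox` lists) and that boxes are stored as Pascal VOC corners `[x1, y1, x2, y2]`; check both against your own dataset. The function name `to_coco` and the `file_name` pattern are made up for illustration.

```python
import json


def to_coco(examples, category_names):
    """Build a COCO-style dict from rows with an assumed fashionpedia-like schema."""
    images, annotations = [], []
    next_ann_id = 1  # COCO annotation ids must be unique across the whole file
    for ex in examples:
        images.append({
            "id": ex["image_id"],
            "width": ex["width"],
            "height": ex["height"],
            # Hypothetical naming scheme; use whatever you save the images as.
            "file_name": f"{ex['image_id']}.jpg",
        })
        objs = ex["objects"]
        for (x1, y1, x2, y2), cat in zip(objs["bbox"], objs["category"]):
            # Assumed input: [x1, y1, x2, y2]. COCO wants [x, y, width, height].
            w, h = x2 - x1, y2 - y1
            annotations.append({
                "id": next_ann_id,
                "image_id": ex["image_id"],
                "category_id": cat,
                "bbox": [x1, y1, w, h],
                "area": w * h,  # box area, since there are no masks here
                "iscrowd": 0,
            })
            next_ann_id += 1
    categories = [{"id": i, "name": n} for i, n in enumerate(category_names)]
    return {"images": images, "annotations": annotations, "categories": categories}


# Tiny hand-made row standing in for a dataset split.
sample = [{
    "image_id": 1,
    "width": 100,
    "height": 80,
    "objects": {"category": [0, 1], "bbox": [[10, 20, 50, 60], [0, 0, 30, 40]]},
}]
coco = to_coco(sample, ["shirt", "pants"])
coco_json = json.dumps(coco)  # string ready to write to an annotations file
```

With a real dataset you would pass a split (for example `ds["train"]`) instead of `sample`, and take the category names from the dataset's class-label feature. Dropping the image column first avoids decoding every image just to read the boxes.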

👍 Thanks @mariosasko

Did you end up writing it, and if so, can you share it?

Hi @roy650.

Take a look at the CocordiaisDataset class in this file.

How to use it is shown in here, somewhere in the first couple of cells.

I haven’t used these in a while, so I don’t remember if there was anything un-intuitive.

I can give you access to the dataset too if that would help.

I gave you access to the dataset here.

Thanks @thiagohersan. I appreciate it.
The code looks pretty much like what I need, barring minimal changes for my HF structure.
The dataset is still inaccessible even though I got an email saying access was granted, but don’t worry about it; I don’t need it.
Thanks again!