opened 07:48AM - 21 Jun 21 UTC
dataset request
vision
## Adding a Dataset
- **Name:** COCO
- **Description:** COCO is a large-scale …object detection, segmentation, and captioning dataset.
- **Paper + website:** https://cocodataset.org/#home
- **Data:** https://cocodataset.org/#download
- **Motivation:** It would be great to have COCO available in HuggingFace datasets, as we are moving beyond just text. COCO covers multiple modalities (images + text) and provides a huge number of images annotated with objects, segmentation masks, keypoints, etc., on which models like DETR (which I recently added to HuggingFace Transformers) are trained. Currently, one needs to download everything from the website and place it in a local folder, but it would be much easier if we could access it directly through the datasets API.
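To illustrate the manual workflow described in the motivation, here is a minimal sketch of reading a locally downloaded COCO annotation file with only the standard library. It assumes the usual COCO JSON layout with top-level `images`, `annotations`, and `categories` keys; the helper name `load_coco_annotations` and the output record shape are hypothetical, chosen for illustration, and are not part of any existing library.

```python
import json
from collections import defaultdict


def load_coco_annotations(annotation_file):
    """Group COCO-style object annotations by image (hypothetical helper)."""
    with open(annotation_file, "r", encoding="utf-8") as f:
        coco = json.load(f)

    # Map category ids to human-readable names.
    categories = {c["id"]: c["name"] for c in coco["categories"]}

    # Collect all annotations that belong to each image id.
    by_image = defaultdict(list)
    for ann in coco["annotations"]:
        by_image[ann["image_id"]].append(
            {"bbox": ann["bbox"], "category": categories[ann["category_id"]]}
        )

    # One record per image; images without annotations get an empty list.
    return [
        {"file_name": img["file_name"], "objects": by_image[img["id"]]}
        for img in coco["images"]
    ]
```

With the data unpacked locally, this could be called as `load_coco_annotations("annotations/instances_val2017.json")`, where the path is an assumption about where the files were placed. Every user currently has to write this kind of glue code themselves, which is what direct access through the datasets API would avoid.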
Instructions to add a new dataset can be found [here](https://github.com/huggingface/datasets/blob/master/ADD_NEW_DATASET.md).