I wonder if it is possible to get the “task_categories” entry of a dataset via the Python library. For example, the dataset “bing_coronavirus_query_set” is an “intent classification” set.
This entry is associated with almost every dataset in the Git repo but I couldn’t find how to read it out once we load a dataset using the datasets library.
Hi! You can use huggingface_hub.list_datasets to retrieve the task_ids/task_categories tags of a dataset. In particular, bing_coronavirus_query_set's tags can be fetched as follows:
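import huggingface_hub
# Depending on the installed version, list_datasets returns a list or an iterator, so avoid [0].
results = huggingface_hub.list_datasets(search="bing_coronavirus_query_set", full=True)
info = next(iter(results))  # first dataset matching the search string
# The card metadata attribute is `cardData` in older releases and `card_data` in newer ones.
card_data = getattr(info, "card_data", None) or getattr(info, "cardData", None)
print(card_data["task_ids"])
print(card_data["task_categories"])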
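One caveat: as far as I know, not every dataset fills in both keys, and the card metadata can be missing altogether, in which case the indexing above would fail. Here is a minimal sketch of a defensive helper, assuming the metadata object supports a dict-style .get. The name extract_task_tags and the sample dictionary are made up for illustration and are not part of huggingface_hub:

```python
from typing import Any, Dict, List, Mapping, Optional


def extract_task_tags(card_data: Optional[Mapping[str, Any]]) -> Dict[str, List[str]]:
    """Return the task_ids and task_categories tags as lists, tolerating missing data."""
    tags: Dict[str, List[str]] = {}
    for key in ("task_ids", "task_categories"):
        # A missing card or a missing key both end up as an empty list.
        value = card_data.get(key) if card_data is not None else None
        if value is None:
            tags[key] = []
        elif isinstance(value, str):
            # Treat a single string tag as a one-element list.
            tags[key] = [value]
        else:
            tags[key] = list(value)
    return tags


# Hypothetical metadata for illustration only; not fetched from the Hub.
sample_card_data = {
    "task_categories": ["text-classification"],
    "task_ids": "intent-classification",
}
print(extract_task_tags(sample_card_data))
print(extract_task_tags(None))
```

You would pass it the card_data object from the snippet above instead of the sample dictionary.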