Hi Everyone,
I created several configs, one per language, for this dataset: EuropeanParliament/cellar_eurovoc
configs:
- config_name: default
  data_files: cellar.jsonl.gz
- config_name: en-GB
  data_files: cellar_en.jsonl.gz
- config_name: fr-FR
  data_files: cellar_fr.jsonl.gz
They appear correctly (more or less) on the website, but not in code:
from datasets import get_dataset_config_names
get_dataset_config_names("EuropeanParliament/cellar_eurovoc")
I still get only one config.
Any idea?
Hello @severo, it looks like we worked on the same team at IRISA. Small world!
Cheers
Hi! The “dataset configs in the YAML” feature requires the latest release of datasets (>=2.14.0), so updating the installation with pip install -U datasets should fix the problem.
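For anyone who wants to confirm the installed release before retrying, here is a minimal sketch of a version check. The helper names parse_release and supports_yaml_configs are made up for illustration and are not part of datasets; the only requirement taken from the reply above is the 2.14.0 minimum.

```python
import re
from importlib import metadata

# Minimum release named in the reply above for YAML-defined configs.
MIN_VERSION = (2, 14, 0)


def parse_release(version: str) -> tuple:
    """Turn a string such as '2.14.4' into (2, 14, 4).

    Hypothetical helper: only the leading digits of the first three
    dot-separated pieces are kept, so '2.14.0.dev0' becomes (2, 14, 0).
    """
    numbers = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    # Pad short versions such as '2.14' with zeros.
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def supports_yaml_configs(version: str) -> bool:
    """Return True when the version meets the 2.14.0 minimum."""
    return parse_release(version) >= MIN_VERSION


if __name__ == "__main__":
    try:
        installed = metadata.version("datasets")
    except metadata.PackageNotFoundError:
        print("datasets is not installed in this environment")
    else:
        print(installed, "OK" if supports_yaml_configs(installed) else "too old")
```

If the check reports a release that is too old, pip install -U datasets as suggested above is the fix.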
Thanks, I updated it, but I may have a cache issue now:
import datasets
datasets.__version__
'2.14.4'
and
from datasets import get_dataset_config_names
get_dataset_config_names("EuropeanParliament/cellar_eurovoc")
gives the following error (if you want to test):
AttributeError: 'NoneType' object has no attribute 'BUILDER_CONFIG_CLASS'
Try passing download_mode="force_redownload" to redownload the files or, if this doesn’t help, cache_dir="path/to/new/cache_dir" to use a different directory for caching.
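To make that suggestion concrete, here is a rough sketch of how the two arguments could be passed to get_dataset_config_names. The wrapper names retry_kwargs and list_configs_fresh are hypothetical, and the sketch assumes cache_dir is accepted as an extra keyword argument, as the reply suggests; it has not been checked against every datasets release.

```python
def retry_kwargs(cache_dir=None):
    """Build the keyword arguments suggested above.

    Hypothetical helper: always forces a redownload, and adds a fresh
    cache directory only when one is given.
    """
    kwargs = {"download_mode": "force_redownload"}
    if cache_dir is not None:
        kwargs["cache_dir"] = cache_dir
    return kwargs


def list_configs_fresh(repo_id, cache_dir=None):
    """List config names while bypassing possibly stale cached files."""
    # Imported here so the helper above can be used without datasets installed.
    from datasets import get_dataset_config_names

    return get_dataset_config_names(repo_id, **retry_kwargs(cache_dir))


if __name__ == "__main__":
    # First attempt: redownload into the default cache.
    print(list_configs_fresh("EuropeanParliament/cellar_eurovoc"))
    # If that still fails, point at a new, empty cache directory instead:
    # print(list_configs_fresh("EuropeanParliament/cellar_eurovoc",
    #                          cache_dir="path/to/new/cache_dir"))
```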