Datasets-cli test failed when generating metadata due to the use of Array2D

jherng · January 6, 2024, 6:33am

Yes, I can confirm that it works this way when there is just 1 configuration! First, run the script to generate the metadata in the README, then remove the !!python/tuple from the README and upload the README to the Hugging Face Hub. After that, loading the dataset with datasets.load_dataset() just works.

However, in my current case I have 4 configurations (i.e., video, i3d_rgb, c3d_rgb, swin_rgb). Running the following line causes the program to stop halfway due to this issue (it can’t be circumvented by removing !!python/tuple in the middle):

datasets-cli test jherng/xd-violence --save_info --all_configs

My current workaround is to run datasets-cli test separately for each configuration, as follows:

datasets-cli test jherng/xd-violence --save_info --name video
datasets-cli test jherng/xd-violence --save_info --name i3d_rgb
datasets-cli test jherng/xd-violence --save_info --name c3d_rgb
datasets-cli test jherng/xd-violence --save_info --name swin_rgb
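
If you'd rather not type the per-configuration commands by hand, here is a minimal Python sketch that builds and runs them in a loop. The helper names (build_commands, run_all) are made up for illustration; only the datasets-cli arguments come from the commands above. It assumes datasets-cli is on your PATH and that a run ending in an error returns a non-zero exit code, which the loop deliberately ignores.

```python
import subprocess

# Repository and configuration names taken from this post.
REPO = "jherng/xd-violence"
CONFIGS = ["video", "i3d_rgb", "c3d_rgb", "swin_rgb"]


def build_commands(repo, configs):
    """Return one `datasets-cli test` argument list per configuration."""
    return [
        ["datasets-cli", "test", repo, "--save_info", "--name", name]
        for name in configs
    ]


def run_all(commands):
    """Run each command in turn and collect the exit codes.

    check=False keeps the loop going even when a run exits with an error,
    which is expected here because each run ends with the ConstructorError.
    """
    return [subprocess.run(cmd, check=False).returncode for cmd in commands]
```

Calling run_all(build_commands(REPO, CONFIGS)) would then run the four commands one after another. You still need to collect the metadata generated by each run before the next one, since this sketch does nothing to merge the results.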

Each of them still ends with the yaml.constructor.ConstructorError, but that’s fine, since the metadata files are generated anyway. I copy and paste each of the metadata files and combine them into one, as in this (of course, I removed all the !!python/tuple in the file).
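
Removing the !!python/tuple tags by hand gets tedious with several configurations, so here is a small sketch of how it could be scripted. The function name strip_python_tuple_tags is hypothetical, and the sample text is only my guess at how the tag shows up next to an Array2D shape in the generated metadata; check it against your own README before relying on it.

```python
import re

# Matches the tag plus any spaces or tabs directly in front of it,
# so that "shape: !!python/tuple" becomes "shape:".
_TUPLE_TAG = re.compile(r"[ \t]*!!python/tuple\b")


def strip_python_tuple_tags(text):
    """Return the text with every !!python/tuple tag removed."""
    return _TUPLE_TAG.sub("", text)


# Illustrative sample only (an assumed layout, not copied from a real README).
sample = "shape: !!python/tuple\n- 16\n- 2048\n"
cleaned = strip_python_tuple_tags(sample)
```

With the tag gone, the shape is left as a plain YAML list, which is what lets the README load without the constructor error. To apply it to a file, read the README text, pass it through the function, and write the result back.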

Hope this helps!
