Yes, I can confirm that it works this way when there is only 1 configuration! First, run the script to generate the metadata in the README, then remove the !!python/tuple tags from the README and upload it to the Hugging Face Hub. After that, loading the dataset with datasets.load_dataset() will just work.
However, in my current case I have 4 configurations (i.e., video, i3d_rgb, c3d_rgb, swin_rgb). Running the following line causes the program to stop halfway because of this issue (it can't be circumvented by removing !!python/tuple in the middle of the run):
datasets-cli test jherng/xd-violence --save_info --all_configs
My current workaround is to run datasets-cli test separately for each configuration, as follows:
datasets-cli test jherng/xd-violence --save_info --name video
datasets-cli test jherng/xd-violence --save_info --name i3d_rgb
datasets-cli test jherng/xd-violence --save_info --name c3d_rgb
datasets-cli test jherng/xd-violence --save_info --name swin_rgb
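If you would rather not type the four commands by hand, the per-configuration runs above can be scripted. This is only a rough sketch: build_command and run_all are hypothetical helper names, and I am assuming datasets-cli is on your PATH. I have not checked what exit code the CLI returns when it hits the ConstructorError, so the sketch simply does not stop on a non-zero exit.

```python
import subprocess

# Config names as listed in this post; replace with your own.
CONFIGS = ["video", "i3d_rgb", "c3d_rgb", "swin_rgb"]


def build_command(repo_id: str, config: str) -> list[str]:
    """Return the argv for one per-config `datasets-cli test` run."""
    return ["datasets-cli", "test", repo_id, "--save_info", "--name", config]


def run_all(repo_id: str, configs: list[str] = CONFIGS) -> dict[str, int]:
    """Run each config separately and collect the exit codes.

    check=False means a failing run (e.g. one that ends with the
    ConstructorError) does not prevent the remaining configs from running.
    """
    return {
        config: subprocess.run(build_command(repo_id, config), check=False).returncode
        for config in configs
    }
```

Calling run_all("jherng/xd-violence") would then run the four commands one after another and return a dict mapping each config name to its exit code.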
Each of them will still end with the same yaml.constructor.ConstructorError, but that's fine, at least the metadata files are generated anyway. I copy the metadata from each file and combine them into one, as in this (of course, I removed all the !!python/tuple tags from the file).
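Removing the tags by hand gets tedious with several configurations, so here is a minimal sketch of doing it in Python. strip_python_tuple_tags is a hypothetical helper name, and the sample YAML in the usage note is made up for illustration; I am assuming the tag only ever appears as the literal text !!python/tuple, so check the result before uploading.

```python
import re

# Match the tag together with any spaces/tabs directly before it,
# so that "shape: !!python/tuple" becomes "shape:" with no trailing space.
_TUPLE_TAG = re.compile(r"[ \t]*!!python/tuple")


def strip_python_tuple_tags(text: str) -> str:
    """Return `text` with every `!!python/tuple` tag removed."""
    return _TUPLE_TAG.sub("", text)
```

For example, strip_python_tuple_tags("shape: !!python/tuple\n- 3\n- 224\n") gives "shape:\n- 3\n- 224\n". To apply it to a file, read the README with pathlib.Path("README.md").read_text(), pass the text through the function, and write it back with write_text().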
Hope this helps!