I would like to use S3 as storage for a dataset with several configurations. A good example is the Pascal VOC dataset: fuliucansheng/pascal_voc at main
The underlying data is the same for all configurations; the main difference is how the _generate_examples function is executed. Local caching seems to work well and reuses the images across configs.
Now I would like to be able to specify a dataset plus a configuration and load it directly from S3. What is the best way to do that?
So far I have tried load_from_disk:
dataset = load_from_disk("s3://xxx/xxx", storage_options=storage_options)
but that apparently only works for a single configuration.
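
One workaround that might fit here, sketched under assumptions rather than as a confirmed best practice: save each configuration to its own S3 prefix with save_to_disk, then pick the prefix at load time. The helper names (config_uri, save_config, load_config), the bucket path, and the config name in the usage note are hypothetical. Note that this probably stores the shared image data once per configuration on S3, so it trades storage for simplicity.

```python
from typing import Any, Dict, Optional


def config_uri(base_uri: str, config_name: str) -> str:
    """Build a per-configuration prefix such as <base_uri>/<config_name>."""
    return f"{base_uri.rstrip('/')}/{config_name}"


def save_config(
    dataset_path: str,
    config_name: str,
    base_uri: str,
    storage_options: Optional[Dict[str, Any]] = None,
) -> None:
    """Build one configuration locally and write it under its own prefix."""
    # Imported lazily so the path helper can be used without `datasets` installed.
    from datasets import load_dataset

    # Depending on the `datasets` version, script-based datasets may need
    # extra arguments here (assumption: defaults are enough in your setup).
    ds = load_dataset(dataset_path, config_name)
    ds.save_to_disk(config_uri(base_uri, config_name), storage_options=storage_options)


def load_config(
    config_name: str,
    base_uri: str,
    storage_options: Optional[Dict[str, Any]] = None,
):
    """Load a previously saved configuration back from its prefix."""
    from datasets import load_from_disk

    return load_from_disk(config_uri(base_uri, config_name), storage_options=storage_options)
```

Usage would then look like save_config("fuliucansheng/pascal_voc", "<config>", "s3://xxx/xxx", storage_options) once per configuration, followed by load_config("<config>", "s3://xxx/xxx", storage_options) wherever the data is needed.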