tf.data.Dataset.zip (https://www.tensorflow.org/api_docs/python/tf/data/Dataset#zip) is very handy when you want to combine multiple datasets horizontally into a single one. Is there similar functionality for HF datasets?
Yes, you can concatenate datasets horizontally with ds = datasets.concatenate_datasets(list_of_datasets, axis=1).
Thanks @mariosasko, that brings me a step closer to my goal. With tf.data.Dataset.zip one can do things like tf.data.Dataset.zip({'level1': {'level2_1': ds_1, 'level2_2': ds_2}}) to join the datasets in a nested manner. Is something like this possible with HF datasets?
This requires using .map on the combined dataset to restructure its flat columns into nested ones.