Efficient way to concatenate DatasetDict objects

Is there an efficient way to combine DatasetDict objects? The current concatenate_datasets function works on Dataset objects only. Thanks!

No, but if all of them have the same set of keys, this should work:

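import datasets
from datasets import DatasetDict
# dd_to_concat is a list of DatasetDict objects that all share the same keys
dd = DatasetDict()
for key in dd_to_concat[0]:  # iterate over the keys (splits), not over the list itself
    dd[key] = datasets.concatenate_datasets([ddd[key] for ddd in dd_to_concat])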
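
If you are not sure that every object really has the same keys, it may be worth checking before concatenating. Below is a minimal sketch of that idea, not part of the datasets API: concat_by_key is a hypothetical helper name, and the concatenation function is passed in as an argument so the example runs with the standard library alone (plain lists stand in for Dataset objects).

```python
from typing import Callable, Dict, List, Mapping, Sequence, TypeVar

T = TypeVar("T")


def concat_by_key(
    dicts: Sequence[Mapping[str, T]],
    concat_fn: Callable[[List[T]], T],
) -> Dict[str, T]:
    """Concatenate the values stored under each key across several mappings.

    Hypothetical helper: raises ValueError if the mappings do not all
    share exactly the same set of keys.
    """
    if not dicts:
        raise ValueError("need at least one mapping to concatenate")
    keys = list(dicts[0])  # keep the key order of the first mapping
    for i, d in enumerate(dicts[1:], start=1):
        if set(d) != set(keys):
            raise ValueError(
                f"mapping {i} has keys {sorted(d)}, expected {sorted(keys)}"
            )
    # One concat_fn call per key, over that key's value from every mapping.
    return {key: concat_fn([d[key] for d in dicts]) for key in keys}


# Stand-in data: plain lists play the role of Dataset objects here.
first = {"train": [1, 2], "test": [3]}
second = {"train": [4], "test": [5, 6]}
merged = concat_by_key(
    [first, second],
    lambda parts: [x for part in parts for x in part],
)
print(merged)  # {'train': [1, 2, 4], 'test': [3, 5, 6]}
```

With the real library, the equivalent call would presumably be DatasetDict(concat_by_key(dd_to_concat, datasets.concatenate_datasets)), since DatasetDict is a dict subclass and concatenate_datasets accepts a list of Dataset objects.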

PS: Our goal is to eventually remove the DatasetDict API, as explained in "Reduce friction in tabular dataset workflow by eliminating having splits when dataset is loaded" (Issue #5189 in huggingface/datasets on GitHub), so it’s unlikely that we will add support for DatasetDict objects to concatenate_datasets.