Iterable of batches from IterableDataset

According to the documentation, I should be able to call the batch method on an IterableDataset. However, the following code raises AttributeError: 'IterableDataset' object has no attribute 'batch'.

from datasets import load_dataset

dataset = load_dataset("parquet", data_files="part-*-b4b8fd5e-a0a7-45e2-9b70-7c526ae44202-c000.zstd.parquet", streaming=True)

dataset['train'].batch(batch_size=32)

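I had the same issue. Upgrading the datasets package (pip install --upgrade datasets) worked for me; presumably the installed version was older than the one the documentation describes.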
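
If upgrading is not an option, a small helper can group any iterable into batches, and that should include a streaming dataset split. This is only a sketch: iter_batches is a hypothetical name of my own, not part of the datasets library, and it yields plain lists of examples, which may differ from the batch layout the built-in method returns.

```python
from itertools import islice


def iter_batches(examples, batch_size):
    """Yield lists of up to batch_size items from any iterable.

    Hypothetical helper, not part of the datasets library.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(examples)
    while True:
        # islice pulls at most batch_size items without loading the rest.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # the underlying iterable is exhausted
        yield batch
```

With the dataset from the question, the loop would be: for batch in iter_batches(dataset['train'], 32). The last batch can be shorter than 32 when the number of examples is not a multiple of the batch size.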
