Index retrieval speed varies considerably with dataset size

Hi! For long arrays, it is faster to read the data as NumPy arrays than as Python objects. Datasets are stored in the Arrow format, which allows zero-copy reads into NumPy.

To make your query run faster, you can switch the output format to NumPy with dummy_ds = dummy_ds.with_format("numpy").