The “cache-9aaxxxxx” file should indeed be the right one.
Dataset.from_file should work. What takes time is reading the metadata of all the record batches (i.e. the chunks that make up the Arrow file); it doesn’t load the actual dataset content into memory.
Alternatively, you can use IterableDataset.from_file, which doesn’t read the metadata, but we haven’t implemented save_to_disk for IterableDataset.