I am using load_from_disk to load a dataset I stored using .save_to_disk.
This works fine the first time, and I proceed to apply a .map operation (and train a model).
The .map operation stores cache files in the dataset's directory. When I then try to load the dataset a second time using the same call to load_from_disk, these files result in ValueError: Couldn't cast, followed by DatasetGenerationError: An error occurred while generating the dataset.
If I remove the cache files, it works fine again – the first time.
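For reference, a minimal sketch of the cache removal step. It assumes the cache files follow the cache-*.arrow naming that .map typically produces; the helper name remove_cache_files and the example path are hypothetical, not part of the datasets library.

```python
from pathlib import Path


def remove_cache_files(dataset_dir, pattern="cache-*.arrow", dry_run=False):
    """Delete files matching `pattern` anywhere under `dataset_dir`.

    Assumption: .map cache files are named like cache-<fingerprint>.arrow.
    Returns the list of matching paths (deleted unless dry_run is True).
    """
    root = Path(dataset_dir)
    # Search recursively so that per-split subdirectories are covered too.
    matches = sorted(p for p in root.rglob(pattern) if p.is_file())
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


# Example usage (hypothetical path):
# remove_cache_files("path/to/saved_dataset", dry_run=True)  # list only
# remove_cache_files("path/to/saved_dataset")                # delete
```

Calling it with dry_run=True first shows which files would be deleted. The datasets library also provides Dataset.cleanup_cache_files(), which may be a cleaner option when the dataset can still be loaded.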
What am I doing wrong here?