Once you have processed your data, you can use save_to_disk to save it in the directory of your choice. When this is done, you can completely delete your cache and reload your processed dataset with load_from_disk, for example (see the documentation).
Also note that, if you really need to, you can load any Arrow file directly with Dataset.from_file("path/to/any/arrow/file").