Is the datasets cache operating-system agnostic?

I’m using Windows with WSL. I run my Python code in WSL most of the time, but occasionally run it directly in Windows.

I would like to have my Hugging Face cache shared between the two OSes (and also avoid having to increase the size of my WSL virtual drive).

My plan is to add the following to my .bashrc file in WSL, pointing to a location ‘outside’ the WSL drive.

export HF_DATASETS_CACHE="/mnt/c/Users/david/.cache/huggingface/datasets"

That way, the cache written from within WSL goes to the same directory as the cache written when I run code from Windows.
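For illustration, here is a minimal Python sketch of the same idea, setting the variable from inside the script instead of .bashrc. It assumes the default WSL layout where the C: drive is mounted at /mnt/c; the helper names (windows_to_wsl, shared_cache_dir) are made up for this example and are not part of the datasets library. It only makes both environments point at one directory and says nothing about whether the cached files themselves are readable from the other OS.

```python
import os
import platform
from pathlib import PurePosixPath, PureWindowsPath

# Shared location on the Windows drive (the path from my setup; adjust as needed).
WINDOWS_CACHE = PureWindowsPath(r"C:\Users\david\.cache\huggingface\datasets")


def windows_to_wsl(path: PureWindowsPath) -> PurePosixPath:
    """Map C:\\a\\b to /mnt/c/a/b, assuming WSL's default drive mount point."""
    drive = path.drive.rstrip(":").lower()
    # parts[0] is the drive root ("C:\\"); the rest are the directory names.
    return PurePosixPath("/mnt", drive, *path.parts[1:])


def shared_cache_dir(system: str = platform.system()) -> str:
    """Return the cache path in the form the current OS understands."""
    if system == "Windows":
        return str(WINDOWS_CACHE)
    return str(windows_to_wsl(WINDOWS_CACHE))


# Set this before importing datasets, since the library reads the
# variable when it is imported. An existing value is left untouched.
os.environ.setdefault("HF_DATASETS_CACHE", shared_cache_dir())
```

With this at the top of a script, the same file could be run from either side and resolve to the same folder on disk.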

My question is: when Hugging Face saves things to the cache, does it assume that the cache will be read by the same operating system that wrote it? Or does it guarantee OS-agnostic, portable files?

Linking our discussion here in case it may help other people: Cache is not transportable · Issue #5585 · huggingface/datasets · GitHub
