I have a server equipped with GPUs but without internet access. I would like to run some experiments there, and for that I need to download datasets on my local machine and move the downloaded files to the server.
What is the correct procedure to do that? I just copied the .cache/huggingface/datasets directory hoping it would work, but the library still tries to access the internet. I think this may be related to the fact that some metadata (especially a lock file) in there seems to be tied to the user on my local machine, which is different from the user on the server.
I tried to explicitly pass download_mode="reuse_cache_if_exists", and I also tried to pass data_dir directly, but in neither case did I manage to load the cached dataset from disk. An example even just with the mnist dataset would be welcome!
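
A minimal sketch of one workflow that avoids copying the cache directory at all, assuming the datasets library's load_dataset, save_to_disk and load_from_disk functions and its HF_DATASETS_OFFLINE environment variable. The directory name mnist_on_disk and the three helper names are hypothetical, chosen only for illustration:

```python
import os

# Hypothetical directory name; this is what gets copied to the server by hand.
EXPORT_DIR = "mnist_on_disk"


def export_dataset(name="mnist", target_dir=EXPORT_DIR):
    """Run on the machine WITH internet: download, then write a standalone copy."""
    # Imported lazily so that this file can be loaded without the library present.
    from datasets import load_dataset

    dataset = load_dataset(name)       # downloads and prepares all splits
    dataset.save_to_disk(target_dir)   # writes the data files plus metadata
    return target_dir


def enable_offline_mode():
    """Ask the library not to attempt network access.

    Assumption: the variable is read when datasets is imported,
    so this is called before the import.
    """
    os.environ["HF_DATASETS_OFFLINE"] = "1"


def load_offline(source_dir=EXPORT_DIR):
    """Run on the server WITHOUT internet, after copying the directory over."""
    enable_offline_mode()
    from datasets import load_from_disk

    return load_from_disk(source_dir)
```

The intended use would be to call export_dataset() locally, copy the resulting mnist_on_disk directory to the server (for example with scp or rsync), and then call load_offline() there. Since the saved directory is read directly from the given path rather than looked up in the user's cache, this should sidestep the user-specific cache metadata mentioned above.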