HF cache no space left on device

Hello everyone,

I’m creating an HF dataset in an Azure VM by first reading a file with from_generator, doing some pre-processing, and then using save_to_disk. I’m saving to disk on a blobfuse mount.

In the middle of the save_to_disk operation I was running out of disk space, and I think it is related to HF cache files being stored in the default “local” location instead of on the mount.

Now I’m setting the HF_HOME environment variable to a directory on the mount, right before importing datasets. However, now I’m getting an OSError: no space left on device, even though there is plenty of space and I am pointing the cache to the mount.

The top of my code looks like this:

import os
os.environ["HF_HOME"] = "/mnt/outputs/hf_cache/"
from datasets import …

I am not creating the hf_cache directory myself, though it seems HF creates it automatically.

Any idea what might be happening?

The code fails right at the beginning, just when I call from_generator and before I do anything else.
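For reference, here is a minimal sketch of the failing call (the generator, column name, and output path are placeholders, not my actual pre-processing code):

import os
os.environ["HF_HOME"] = "/mnt/outputs/hf_cache/"  # set before importing datasets

from datasets import Dataset

def gen():
    # placeholder generator; the real one reads a file and pre-processes it
    yield {"text": "example"}

ds = Dataset.from_generator(gen)  # raises OSError: Not enough disk space
ds.save_to_disk("/mnt/outputs/my_dataset")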

Thank you very much!


Hmm… @lhoestq

I’ve dug into the HF code based on the error message I receive below, and I think I have an idea of what’s going on:

File "/opt/conda/envs/ptca/lib/python3.10/site-packages/datasets/io/generator.py", line 49, in read
    self.builder.download_and_prepare(
File "/opt/conda/envs/ptca/lib/python3.10/site-packages/datasets/builder.py", line 875, in download_and_prepare
    raise OSError(
OSError: Not enough disk space. Needed: Unknown size (download: Unknown size, generated: Unknown size, post-processed: Unknown size)

When I use disk_usage from shutil to print the free space of the cache dir located on the mount, it reports 0 GB. The download_and_prepare function inside the builder then reads this value and concludes there is no space left on the device.
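For example, this is roughly the check I ran (the path is a placeholder for my actual cache dir on the mount):

import shutil

cache_dir = "/mnt/outputs/hf_cache"  # directory on the blobfuse mount
usage = shutil.disk_usage(cache_dir)
print(f"total: {usage.total / 1e9:.1f} GB, free: {usage.free / 1e9:.1f} GB")
# reports 0.0 GB free on the blobfuse mount, even though the storage account has plenty of space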

Is it possible that this is a characteristic of the blobfuse mount?


In a cloud environment, there are a lot of problems that can occur, such as the cache being placed in a directory without permissions…

However, if it was working until a short time ago, maybe it can be restored by downgrading the library?

BTW, in the Hugging Face-related libraries, loading and saving behave differently depending on the library version. The following is an example of pinning to a fairly old version (from October last year).

pip install huggingface_hub==0.25.2

datasets uses shutil.disk_usage() to know if there is enough space before writing a (potentially huge) dataset

(maybe if it says zero it can let the writing begin - it should fail anyway if it’s really zero under the hood? That part of datasets is open to contributions btw if you want to improve it)
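A simplified sketch of how such a pre-write check behaves (not the exact library code, and the helper name here is hypothetical), which shows why a mount that reports 0 free bytes trips it:

import shutil

def has_enough_disk_space(needed_bytes: int, directory: str = ".") -> bool:
    # hypothetical helper mirroring the idea: compare reported free space to the estimated size
    try:
        free = shutil.disk_usage(directory).free
    except OSError:
        return True  # if the filesystem can't be queried, let the write proceed and fail later
    return needed_bytes < free

# A blobfuse mount that reports 0 free bytes makes this return False,
# so download_and_prepare raises OSError before anything is written.
print(has_enough_disk_space(1, "/mnt/outputs/hf_cache"))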
