I’m creating an HF dataset on an Azure VM by first reading a file with from_generator, doing some preprocessing, and then calling save_to_disk. I’m saving to disk on a blobfuse mount.
In the middle of the save_to_disk operation I was running out of disk space, and I think it is related to HF cache files being stored in the default “local” location instead of on the mount.
Now I’m setting the HF_HOME environment variable to a directory on the mount, right before importing datasets. However, I’m now getting an OSError: no space left on device, even though there is plenty of space and I am pointing the cache to the mount.
The top of my code looks like this:
import os
os.environ["HF_HOME"] = "/mnt/outputs/hf_cache/"
from datasets import …
I am not creating the hf_cache directory myself, though it seems HF creates it automatically.
Any idea what might be happening?
The code fails right at the beginning, just when I call from_generator and before I do anything else.
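For context, here is a minimal sketch of the call that fails (the generator and the output path below are placeholders for my actual file-reading and preprocessing logic, not the real code):

from datasets import Dataset

def gen():
    # placeholder: the real generator reads a file and does the preprocessing
    yield {"text": "example"}

ds = Dataset.from_generator(gen)            # raises the OSError shown below
ds.save_to_disk("/mnt/outputs/my_dataset")  # never reached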
I’ve dug into HF code based on the error message I receive below, and I think I have an idea of what’s going on:
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/datasets/io/generator.py", line 49, in read
    self.builder.download_and_prepare(
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/datasets/builder.py", line 875, in download_and_prepare
    raise OSError(
OSError: Not enough disk space. Needed: Unknown size (download: Unknown size, generated: Unknown size, post-processed: Unknown size)
When I use disk_usage from shutil to print the disk space of the cache dir located on the mount, it reports 0 GB. The download_and_prepare function inside the builder then reads this value and concludes there is no space left on the device.
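Concretely, this is the kind of check I mean (using the cache path on the mount):

import shutil

usage = shutil.disk_usage("/mnt/outputs/hf_cache/")
print(usage.total / 1024**3, usage.free / 1024**3)
# on the blobfuse mount this reports 0 GB, even though there is plenty of space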
Is it possible that this is a characteristic of the blobfuse mount?
BTW, in the Hugging Face-related libraries, loading and saving behave differently depending on the library version. One workaround is pinning the libraries to a much older release (for example, one from October last year).
datasets uses shutil.disk_usage() to check whether there is enough space before writing a (potentially huge) dataset.
(Maybe if it reports zero, the check could let the writing begin anyway - it should fail regardless if the free space really is zero under the hood? That part of datasets is open to contributions, btw, if you want to improve it.)
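As an illustration only, here is a sketch of what such a change could look like; this is not the actual datasets internals, and the function name and behavior are assumptions:

import shutil

def has_sufficient_disk_space(needed_bytes: int, directory: str = ".") -> bool:
    # Sketch of a pre-write check that tolerates mounts which report 0 free
    # space (as blobfuse appears to do): treat 0 or an unreadable value as
    # "unknown" and let the write proceed, so a genuine lack of space simply
    # surfaces later as a normal write error.
    try:
        free = shutil.disk_usage(directory).free
    except OSError:
        return True  # cannot determine usage; don't block the write
    if free == 0:
        return True  # likely an unreliable report (e.g. a fuse mount); don't block
    return needed_bytes <= free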