Downloading larger models with xet fails on macOS

Dear all,

I’m facing a rather weird issue when downloading larger models (> 14B parameters, or more than 35-36 GB) from the Hub. The download stops and fails after some time with the following error message:

RuntimeError: Data processing error: CAS service error : IO Error: Too many open files (os error 24)

Before the error occurs, xet-core throws a lot of warnings that I can’t really decipher. It looks a bit as if it’s running in some kind of debug mode. I therefore assume that the error is caused by xet-core.

I have tried downloading multiple models (Qwen2.5-72B, Llama-3.3-70B, Mistral-small-24B, Qwen3-8B and Phi4). The error always occurs after downloading around 35-36 GB of data.

I’m on macOS 15.5 using Python 3.12, transformers 4.51.3, hf-xet 1.1.3 and huggingface-hub 0.32.4. I tried to download the models from a notebook cell, using both mlx-lm and transformers from_pretrained(), and via the huggingface-cli.

Has someone else encountered similar issues? Is it possible to avoid using xet and instead use LFS?

Best

This may be a known issue, and it seems to have been fixed:

This is fixed in the hf-xet 1.1.2 release. Please verify working on your end and close, thanks!
pip install "huggingface_hub[hf_xet]"

Thank you for your answer. I’ve tried both hf_xet 1.1.2 and 1.1.3 and encountered the same issue and behaviour as described above.

Hi @h4rz3rk4s3 - Xet team member here checking in :hugs: - sorry you’re encountering this! @John6666 is correct, the Too many open files (os error 24) you’re encountering should’ve been addressed in a recent set of releases.

Could you tell me a bit more about your setup? Specifically:

  • Network speed
  • Disk setup
  • Any HF_XET* environment variables (you can list these by running env | grep "HF_XET")
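
If you are working from a notebook rather than a shell, the same information can be gathered in Python. The snippet below is only a rough sketch: the function name xet_diagnostics is made up for illustration, and reporting the per-process open-file limit is an assumption on my part that it is relevant to "Too many open files (os error 24)". It relies on the standard-library resource module, which is available on macOS and Linux but not on Windows.

```python
import os
import resource  # Unix-only standard-library module (macOS, Linux)


def xet_diagnostics() -> dict:
    """Collect HF_XET* environment variables and the open-file limits.

    Hypothetical helper for illustration; not part of hf-xet or huggingface_hub.
    """
    # Soft and hard limits on open file descriptors for this process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Rough equivalent of `env | grep "HF_XET"`, matching on variable names only.
    hf_xet_env = {k: v for k, v in os.environ.items() if "HF_XET" in k}
    return {"nofile_soft": soft, "nofile_hard": hard, "hf_xet_env": hf_xet_env}


print(xet_diagnostics())
```

Run it in the same kernel or process that performs the download, since the limits and variables can differ between your terminal and the notebook.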

To answer your question:

Is it possible to avoid using xet and instead use LFS?

There is an environment variable you can set to disable hf-xet and instead download a Xet-backed file using the LFS bridge - see HF_HUB_DISABLE_XET. While this isn’t exactly the same as using LFS, it may help in this specific instance.
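
As a minimal sketch of how that could look from Python: the helper names disable_xet and download_without_xet are hypothetical, and the repo id in the usage comment is only an example. As far as I know, huggingface_hub reads HF_HUB_DISABLE_XET when it is first imported, so the variable has to be set before the import; in a notebook where the library is already loaded, you would likely need to restart the kernel first.

```python
import os


def disable_xet() -> None:
    """Set HF_HUB_DISABLE_XET for the current process (hypothetical helper)."""
    os.environ["HF_HUB_DISABLE_XET"] = "1"


def download_without_xet(repo_id: str) -> str:
    """Download a repo snapshot with hf-xet disabled and return the local path."""
    disable_xet()
    # Import only after the variable is set; this assumes huggingface_hub
    # has not been imported earlier in the same process.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


# Example usage (downloads several GB, so it is left commented out):
# local_path = download_without_xet("Qwen/Qwen3-8B")
```

From a shell, setting HF_HUB_DISABLE_XET=1 in the environment before launching huggingface-cli or Python should have the same effect.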
