CAS service error when downloading gated models on Databricks even with HF_HUB_DISABLE_XET=1

I’m unable to download gated models (e.g., mistralai/Mistral-7B-Instruct-v0.2) using huggingface_hub from within a Databricks cluster. Despite setting HF_HUB_DISABLE_XET=1 and removing any hf-xet or hf_transfer packages, the library continues attempting to contact cas-bridge.xethub.hf.co, which results in a repeated “RuntimeError: Data processing error: CAS service error : ReqwestMiddleware Error: Request failed after 5 retries”

  • :white_check_mark: Confirmed token works by downloading model on a local machine
  • :white_check_mark: Set all environment variables (HF_HUB_DISABLE_XET, HF_HUB_ENABLE_HF_TRANSFER)
  • :white_check_mark: Downgraded huggingface_hub to versions like 0.21.4, 0.23.0, and 0.30.2
  • :white_check_mark: Verified that hf-xet is not installed (pip list, !find ~/.cache -name 'xet')
  • :white_check_mark: Confirmed the error is triggered before any fallback happens
  • :white_check_mark: Manually tried using hf_hub_download as well — same issue
  • :white_check_mark: Upgraded hf-xet to latest version - still the same error
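
The checks in the list above can be gathered into one notebook cell, so the report reflects the interpreter that actually runs the download. This is a minimal sketch using only the standard library. The helper name `hub_environment_report` is hypothetical, and the package and variable names are simply the ones mentioned in this post.

```python
import os
from importlib import metadata

# Names taken from this thread; extend as needed.
PACKAGES = ("huggingface_hub", "hf-xet", "hf_transfer")
ENV_VARS = ("HF_HUB_DISABLE_XET", "HF_HUB_ENABLE_HF_TRANSFER")


def hub_environment_report(packages=PACKAGES, env_vars=ENV_VARS, environ=None):
    """Return installed versions (None if absent) and env var values (None if unset)."""
    environ = os.environ if environ is None else environ
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    settings = {name: environ.get(name) for name in env_vars}
    return {"packages": versions, "env": settings}


if __name__ == "__main__":
    report = hub_environment_report()
    for section, values in report.items():
        for key, value in values.items():
            print(f"{section}: {key} = {value!r}")
```

Running this in the same cluster session as the failing download shows whether a package that was uninstalled earlier has been pulled back in by a later install step.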

It is unclear whether the cause is the same, but similar errors seem to have been reported.

That is correct; it is exactly the same error reported by GohioAC here.

Hi @manjusavanth thanks for the report - Xet team member here.

This does seem related to a few issues we’ve encountered recently, although you should be able to fall back to HTTP download through HF_HUB_DISABLE_XET=1.

How are you downloading mistralai/Mistral-7B-Instruct-v0.2? Is it through the huggingface-cli or one of the core Python functions (e.g., snapshot_download)?

Could you tell me anything more about the Databricks environment?

Hi @jsulz, I have tried setting HF_HUB_DISABLE_XET=1, but it does not work for me.

Below is the complete code:
%pip uninstall -y hf-xet huggingface_hub
%pip install huggingface-hub
%pip install hf_xet==v1.1.6rc2
%pip install vllm==0.8.5
import os
from huggingface_hub import login
login(token="token_id")

from vllm import *
! python -m vllm.entrypoints.openai.api_server --model mistralai/Magistral-Small-2506 --dtype float16 --tensor-parallel-size 4 --port 8003 --max_model_len 15000 --tokenizer-mode "mistral"

On Databricks, I have run the code on clusters with V100 and T4 GPUs. These clusters were spun up specifically for this ML job and have no pre-installed Python packages.

Thanks for those details @manjusavanth

Based on what I see here, you uninstall hf-xet but then reinstall it on line three (%pip install hf_xet==v1.1.6rc2). Regardless, the HF_HUB_DISABLE_XET flag, when turned on, should work. The issue with the flag may be related to this issue on the huggingface_hub repo. I would suggest posting about your experiences there as well.

As for the runtime error you are encountering, I believe that is related to a known issue we are seeing with the vllm library. You should be able to get around that by falling back to HTTP download with HF_HUB_DISABLE_XET (which appears to not work for you at the moment) or by uninstalling hf-xet. If the HF_HUB_DISABLE_XET flag is not working for you, I would suggest running pip uninstall -y hf-xet after installing huggingface-hub and not reinstalling it.

I’ll follow up here once the hf-xet issue with vllm is addressed, and let me know if you have any questions.

@manjusavanth we believe we’ve addressed the root cause of the CAS service error you were seeing. You can pip install a release candidate for testing, for example:

pip install hf-xet==1.1.6rc5

Hi @jsulz, I have tried pip install hf-xet==1.1.6rc5, and it gives the same error as before. I changed nothing else apart from that line.

Thanks for testing @manjusavanth! We’ll keep investigating.

To make sure you’re unblocked and can download mistralai/Mistral-7B-Instruct-v0.2, did you see my earlier comment about how you are loading in hf-xet?

I would review your code to ensure that hf-xet is not installed and/or that your environment recognizes HF_HUB_DISABLE_XET. If, for whatever reason, HF_HUB_DISABLE_XET isn’t working for you, I would add your reproduction steps to the GitHub issue.

Hi @jsulz, I did try installing huggingface-hub first and then uninstalling hf-xet. I also set the HF_HUB_DISABLE_XET flag to 1, but I continue to receive the same error.

I also checked for the presence of xet after uninstalling. There is no xet, but the CAS error continues.

import os
import glob
xet_bin = glob.glob(os.path.expanduser("~/.cache/huggingface/hub/extensions/**/xet"), recursive=True)
print("XET binaries found:", xet_bin)

XET binaries found:

I believe the issue with HF_HUB_DISABLE_XET may be related to this issue: “HF_HUB_DISABLE_XET not disabling XET-based downloads” (Issue #3266, huggingface/huggingface_hub on GitHub).

Can you confirm that you set the environment variable before you load the huggingface_hub library?
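
For reference, huggingface_hub reads its environment flags when the package is first imported, so the variable has to be set before that import. In a notebook this usually means restarting the Python process first. Below is a minimal sketch, assuming a fresh interpreter. `download_without_xet` and `HUB_IMPORTED_TOO_EARLY` are hypothetical names, and the `constants.HF_HUB_DISABLE_XET` attribute is read defensively because it may not exist in older releases.

```python
import os
import sys

# Record whether the library was already loaded before the flag is set.
HUB_IMPORTED_TOO_EARLY = "huggingface_hub" in sys.modules

# Set the flag first; huggingface_hub evaluates it at import time.
os.environ["HF_HUB_DISABLE_XET"] = "1"


def download_without_xet(repo_id, token=None):
    """Download a repo snapshot; the import is deferred until the flag is set."""
    if HUB_IMPORTED_TOO_EARLY:
        print("huggingface_hub was imported before the flag was set; restart Python.")
    from huggingface_hub import constants, snapshot_download

    # Assumption: recent releases expose the parsed flag here; older ones may not.
    print("Xet disabled:", getattr(constants, "HF_HUB_DISABLE_XET", "unknown"))
    return snapshot_download(repo_id=repo_id, token=token)
```

Calling `download_without_xet("mistralai/Mistral-7B-Instruct-v0.2", token=...)` in the first cell after a restart would show whether the flag was picked up before any download starts.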

Hi @jsulz, I have tried setting the HF_HUB_DISABLE_XET flag both before and after importing the huggingface_hub library. Nothing changes, and I get the same CAS error. This issue has become a pain, as I have not been able to download the model for the last 20 days. I am not sure whether vLLM is adding to the issue.

This turned out to be an IP whitelisting issue. After getting the following whitelisted, the model download worked with Xet.

transfer.xethub.hf.co
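
For anyone hitting the same error on a locked-down cluster, a quick reachability check can confirm a network block before any package changes are made. This is a minimal sketch using only the standard library. The helper `can_connect` is hypothetical, and the host list contains only the two hostnames mentioned in this thread, so a given network may need others as well.

```python
import socket

# Hosts named in this thread; a given network may need others as well.
XET_HOSTS = ("transfer.xethub.hf.co", "cas-bridge.xethub.hf.co")


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    for host in XET_HOSTS:
        status = "reachable" if can_connect(host) else "blocked or unreachable"
        print(f"{host}: {status}")
```

A successful TCP connection does not rule out filtering at the proxy or TLS layer, but a failure here points to the firewall and not to huggingface_hub or hf-xet.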

@manjusavanth ah, I’m sorry, that should’ve been the first thing I asked :person_facepalming:

Glad you resolved this and sorry for the runaround.

Thank you for your time and guidance.
