How to get a list of all Huggingface download redirections to whitelist?

I work inside a secure corporate VPN network, so I’m unable to download Huggingface models using from_pretrained commands. However, I can request the security team to whitelist certain URLs needed for my use-case.

The security team has already whitelisted the ‘huggingface.co’ and ‘cdn-lfs.huggingface.co’ URLs. I can now download files from the repo, but the from_pretrained loading functions still don’t work.

I think the requests are getting blocked when they are redirected internally. So, is there a way to find all the intermediate (hop) URLs that I need to request for whitelisting to make the load functions work?

Thanks in advance.

hi @ayadav

Can you give more details, like error logs, etc?

Is there any update on this?

Having the same issue. Is there a listing of URLs that we can whitelist? Also if there are any planned changes to URLs is there a roadmap so we can stay on top of it?

I’ll try to supply error logs next time I encounter it, but it has come up multiple times for me as well. When we try to call <model>.from_pretrained("repo") in our Databricks environment, we get an SSL error about not having the proper certificate. We’ve also gotten a max_retries error, but I can’t say for certain whether that was caused by the missing whitelist entries. There are ways around this, but if HF published a domain list that we could use to properly configure our environments, that would be very useful!

Hi! Any updates on this? Or any alternatives to follow in the meantime? I am about to try downloading a model, going offline, and then pushing it up to Databricks. If you have a better idea, or have tried this before, I’d like to hear it.

I have the same issue, with downloads being served from a different CDN hostname.
This is after our IT team added
http://huggingface.co/ and
http://cdn-lfs.huggingface.co/ to the whitelist.

For example, downloading meta-llama/Llama-2-13b-chat works.
But it fails when the CDN becomes cdn-lfs-us-1.huggingface.co or another region.
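
To make the gap in the post above concrete, here is a minimal Python sketch that checks a list of hostnames against a whitelist. The helper name uncovered_hosts and the wildcard pattern "*.huggingface.co" are illustrative assumptions, not an official Hugging Face domain list, and whether your firewall accepts glob-style wildcards is something only your security team can confirm.

```python
from fnmatch import fnmatchcase


def uncovered_hosts(hosts, allowed_patterns):
    """Return the hosts that match none of the whitelist patterns.

    Patterns use shell-style globs, so "*.huggingface.co" matches any
    subdomain but NOT the bare "huggingface.co" itself.
    """
    return [
        host
        for host in hosts
        if not any(fnmatchcase(host.lower(), pattern.lower()) for pattern in allowed_patterns)
    ]


# Hostnames mentioned in this thread.
seen = ["huggingface.co", "cdn-lfs.huggingface.co", "cdn-lfs-us-1.huggingface.co"]

# The exact-match whitelist described above misses the regional CDN host.
exact_only = ["huggingface.co", "cdn-lfs.huggingface.co"]
print(uncovered_hosts(seen, exact_only))

# Hypothetical wildcard entry (an assumption; your firewall may not support it).
with_wildcard = ["huggingface.co", "*.huggingface.co"]
print(uncovered_hosts(seen, with_wildcard))
```

With exact entries only, the regional host is reported as uncovered; with the hypothetical wildcard entry, nothing in this list is. It says nothing about hosts that have not shown up in this thread.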

Update? Same issue here. I’ve gotten around by using my home network to connect to the hf repo and download to my workstation cache. Then I reconnect to VPN into the corporate network and copy from my workstation to the server cache. This is painfully slow.

FWIW, a curl -IL test shows redirection (302 responses) from the repo when I am connected to the corporate network (the download fails). On my home network, however, there are no redirects (the download succeeds). Is there an issue with general redirection handling?
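
For anyone who wants to turn the curl -IL test into a list of hostnames to hand to a security team, here is a rough Python sketch of the same idea: follow redirects one hop at a time and record each host. The function name trace_redirect_hosts and its head parameter are made up for illustration. It assumes the third-party requests package is installed, and it only reports the hosts seen for the one URL you give it, from the network you run it on, so it is not a complete or permanent list.

```python
from urllib.parse import urljoin, urlparse

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirect_hosts(url, head=None, max_hops=10):
    """Follow redirects hop by hop and return the hostnames visited, in order.

    `head` is any callable that takes a URL and returns an object with
    `status_code` and `headers`; by default it issues a HEAD request with
    `requests` and does NOT follow redirects automatically.
    """
    if head is None:
        import requests  # third-party; imported lazily

        def head(u):
            return requests.head(u, allow_redirects=False, timeout=10)

    hosts = []
    for _ in range(max_hops):
        host = urlparse(url).hostname
        if host not in hosts:
            hosts.append(host)
        response = head(url)
        location = response.headers.get("Location")
        if response.status_code not in REDIRECT_CODES or not location:
            break  # final response (or a block page) reached
        url = urljoin(url, location)  # Location may be a relative path
    return hosts
```

Usage would look like trace_redirect_hosts("https://huggingface.co/<repo>/resolve/main/<file>"), with <repo> and <file> replaced by a file you actually need. Running it on both the home and corporate networks and comparing the results should show which hop is being blocked or rewritten.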