I work inside a secure corporate VPN, so I’m unable to download Hugging Face models using
from_pretrained calls. However, I can ask the security team to whitelist the URLs needed for my use case.
The security team has already whitelisted the ‘huggingface.co’ and ‘cdn-lfs.huggingface.co’ URLs. I can now download the files from the repo, but the loading functions
(from_pretrained) still don’t work.
I think the requests are getting blocked when they are redirected internally. So, is there a way to find every URL in the redirect chain (each hop), so that I can request whitelisting for them and make the load functions work?
Thanks in advance.
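One way to see the hops is to follow the redirects by hand and record every hostname along the way. The sketch below uses only the Python standard library, not anything from the Hugging Face packages. The names trace_redirects and hosts_to_whitelist are hypothetical helpers, and the URL you pass in is up to you (for example, the direct download link of a file in the repo). It sends HEAD requests, which some servers may answer differently from GET, so treat the result as a starting point rather than a complete list.

```python
import sys
import urllib.error
import urllib.request
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so each hop can be recorded."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def trace_redirects(url, max_hops=10):
    """Return (chain, error): the URLs visited and the failure reason, if any."""
    opener = urllib.request.build_opener(_NoRedirect)
    chain = [url]
    error = None
    for _ in range(max_hops):
        request = urllib.request.Request(chain[-1], method="HEAD")
        try:
            with opener.open(request, timeout=30) as response:
                location = response.headers.get("Location")
        except urllib.error.HTTPError as err:
            # A 3xx lands here because redirects are not followed automatically.
            location = err.headers.get("Location") if err.code in REDIRECT_CODES else None
        except urllib.error.URLError as err:
            # Connection refused, TLS failure, proxy block: chain[-1] is the suspect.
            error = err.reason
            break
        if not location:
            break
        chain.append(urljoin(chain[-1], location))
    return chain, error


def hosts_to_whitelist(chain):
    """Return the unique hostnames in the chain, in first-seen order."""
    hosts = []
    for hop in chain:
        host = urlsplit(hop).hostname
        if host and host not in hosts:
            hosts.append(host)
    return hosts


if __name__ == "__main__":
    hops, failure = trace_redirects(sys.argv[1])
    for host in hosts_to_whitelist(hops):
        print(host)
    if failure is not None:
        print("failed at", hops[-1], "reason:", failure)
```

Run it from inside the VPN with a file URL as the only argument. If a hop is blocked, the last URL printed is the one to raise with the security team.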
Can you give more details, such as error logs?
Is there any update on this?
Having the same issue. Is there a listing of URLs that we can whitelist? Also if there are any planned changes to URLs is there a roadmap so we can stay on top of it?
I’ll try to supply error logs next time I encounter it, but it has come up multiple times for me as well. When we try to call
<model>.from_pretrained("repo") in our Databricks environment, we get an SSL error about not having the proper certificate. We’ve also gotten a
max_retries error, but I can’t say for certain whether that was due to the underlying whitelist issue. There are ways around this, but if HF published a domain list that we could use to properly configure our environments, that would be very useful!
Hi! Any updates on this? Or any alternatives to follow in the meantime? I am about to try downloading a model on a machine with access, going offline, and then pushing it up to Databricks. If you have a better idea, or have tried this before, I’d like to hear it.
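For anyone trying the same route, the download-then-go-offline approach can be sketched roughly as below. This assumes the huggingface_hub and transformers packages are installed on both machines. The function names, the repo id and the directory paths are placeholders, and the step that copies the folder into Databricks is left out because it depends on the workspace.

```python
import os


def download_snapshot(repo_id, target_dir):
    """Run on a machine with internet access: fetch every file in the repo."""
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=target_dir)


def load_offline(model_dir):
    """Run inside the restricted environment after copying the folder over."""
    # Set before transformers is imported so the libraries skip network calls.
    os.environ["HF_HUB_OFFLINE"] = "1"
    from transformers import AutoModel, AutoTokenizer

    # local_files_only=True makes loading fail fast instead of retrying online.
    tokenizer = AutoTokenizer.from_pretrained(model_dir, local_files_only=True)
    model = AutoModel.from_pretrained(model_dir, local_files_only=True)
    return tokenizer, model


if __name__ == "__main__":
    # Placeholder values: replace with your own repo id and paths.
    local_path = download_snapshot("your-org/your-model", "./model_snapshot")
    print("Downloaded to", local_path)
```

Loading from a local directory means from_pretrained never has to resolve a hub URL, so the whitelist stops mattering for that model.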
I have the same issue, with downloads coming from a different CDN hostname.
Our IT team added
http://cdn-lfs.huggingface.co/ to the whitelist.
Downloads served from that host now work.
But they fail when the CDN host becomes cdn-lfs-us-1.huggingface.co or another regional host.
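To illustrate why an exact-match list breaks here, the small check below compares two allowlists against the hostnames mentioned in this thread. It is only a sketch: is_allowed is a hypothetical helper, whether your firewall accepts wildcard entries is an assumption, and nothing in this thread confirms that every download host stays under huggingface.co.

```python
from fnmatch import fnmatch


def is_allowed(host, allowlist):
    """Return True if the host matches any exact or wildcard allowlist entry."""
    return any(fnmatch(host.lower(), pattern.lower()) for pattern in allowlist)


# Exact entries, as in the posts above.
exact = ["huggingface.co", "cdn-lfs.huggingface.co"]
# A wildcard entry covers regional CDN subdomains too.
wildcard = ["huggingface.co", "*.huggingface.co"]

for host in ["cdn-lfs.huggingface.co", "cdn-lfs-us-1.huggingface.co"]:
    print(host, is_allowed(host, exact), is_allowed(host, wildcard))
# cdn-lfs.huggingface.co True True
# cdn-lfs-us-1.huggingface.co False True
```

If wildcards are an option, asking for the wildcard entry avoids a new request each time a regional hostname appears.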