How to update requests.Session with proxies and verify parameter to be used by all huggingface libraries (proxy with TLS interception)

Hi there,

I need to override the requests.Session so that I can use HuggingFace (models, datasets) in a company setup with a proxy and TLS interception. Since this does not work with the CA certificate, I need to disable TLS verification.

It seems that this session is used in all cases, for example when downloading a dataset.

Following the documentation, I tried this:

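import requests
from huggingface_hub import configure_http_backend, get_session

def backend_factory() -> requests.Session:
    session = requests.Session()
    # Route both HTTP and HTTPS traffic through the company proxy
    session.proxies = {"http": "proxy_server:port", "https": "proxy_server:port"}
    # Disable TLS certificate verification (the proxy intercepts TLS)
    session.verify = False
    return session

# Register the factory so huggingface_hub builds its sessions with it
configure_http_backend(backend_factory=backend_factory)
session = get_session()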

When I call get_session(), I can see that session.proxies and session.verify are set properly, but downloads still fail.

I am using the following key libraries:

python            3.10.12
huggingface_hub   0.16.4
datasets          2.13.0
transformers      4.30.2
requests          2.31.0

Is there something wrong with the way I am trying to override the session parameters? I am not sure whether the full requests.Session is passed along (in which case it should work) or whether only session.proxies is used (it looks like the latter from a quick look at the code).

Any suggestions on how to get this working?
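
To narrow this down, it may help to check what requests itself would apply for a given URL, independently of the HuggingFace libraries. The sketch below is only a diagnostic under assumptions: PROXY_URL is a placeholder, build_session and effective_settings are hypothetical helper names, and trust_env is switched off because environment variables such as HTTPS_PROXY or REQUESTS_CA_BUNDLE can otherwise take precedence over the values set on the session. It makes no network request.

```python
import requests

# Placeholder proxy address (assumption); replace with the real host and port.
PROXY_URL = "http://proxy_server:8080"


def build_session() -> requests.Session:
    """Build a session with a proxy and TLS verification disabled."""
    session = requests.Session()
    session.proxies = {"http": PROXY_URL, "https": PROXY_URL}
    session.verify = False
    # Ignore proxy and CA bundle environment variables so they cannot
    # override the session-level values set above.
    session.trust_env = False
    return session


def effective_settings(session: requests.Session, url: str) -> dict:
    """Return the proxies/verify values requests would use for `url`.

    The arguments mimic a request made without per-call overrides:
    empty proxies, and None for stream, verify and cert.
    """
    return session.merge_environment_settings(url, {}, None, None, None)


if __name__ == "__main__":
    settings = effective_settings(build_session(), "https://huggingface.co")
    print("proxies:", dict(settings["proxies"]))
    print("verify:", settings["verify"])
```

If the values printed here are the expected ones but downloads still fail, the problem is more likely in a code path that does not use this session at all than in the session settings themselves.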


After some investigation, it seems that there are a few issues that prevent the HuggingFace libraries from working with a proxy server that uses TLS interception and CA certificates. This impacts the transformers and datasets libraries. I will open a bug report.


Any updates on this? I have a similar issue in a similar working environment.
$ git config --local http.sslVerify false
before fetching

Are there any updates on this? Dataset streaming is the last thing I need to get my training started.