How to update requests.Session with proxies and verify parameter to be used by all huggingface libraries (proxy with TLS interception)

Hi there,

I need to override the requests.Session so that I can use HuggingFace (models, datasets) in a company setup with a proxy and TLS interception. As this is not working with the CA certificate, I need to deactivate certificate verification.

It seems that https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/hf_api.py is used in all cases, for example when downloading a dataset.

Following the documentation, I tried the following:

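import requests
from huggingface_hub import configure_http_backend, get_session

def backend_factory() -> requests.Session:
    session = requests.Session()
    # Placeholder proxy address; replace with the real host and port
    session.proxies = {"http": "proxy_server:port", "https": "proxy_server:port"}
    # Disable TLS certificate verification for this session
    session.verify = False
    return session

# Register the factory so huggingface_hub builds its sessions from it
configure_http_backend(backend_factory=backend_factory)

session = get_session()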

When I call get_session() I can see that session.proxies and session.verify are set properly, but requests still don’t work.

I am using the following key libraries:

python            3.10.12
huggingface_hub   0.16.4
datasets          2.13.0
transformers      4.30.2
requests          2.31.0

Is there something wrong in the way I am trying to override the session parameters? I am not sure whether the full requests.Session is passed (in which case it should work) or whether only its proxies are used (it seems to be the latter from a quick look at the code in hf_api.py).

Any suggestions on how to get this working?

Thanks

After some investigation, it seems that there are a few issues with the HuggingFace libraries when working with a proxy server with TLS and CA certificates. This impacts the transformers and datasets libraries. I will open a bug report.
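
In the meantime, one thing that may be worth trying is setting the proxy and CA bundle through environment variables instead of a custom session, since requests reads HTTP_PROXY, HTTPS_PROXY and REQUESTS_CA_BUNDLE by default. This is only a sketch and not a confirmed fix: it assumes the proxy's root CA can be exported to a PEM file, and the helper name proxy_environment, the port 8080 and the file path are placeholders I made up for illustration.

```python
import os


def proxy_environment(proxy_url, ca_bundle=None):
    """Build the environment variables that requests reads by default.

    proxy_environment is a hypothetical helper, not part of any library.
    """
    env = {
        "HTTP_PROXY": proxy_url,
        "HTTPS_PROXY": proxy_url,
    }
    if ca_bundle is not None:
        # Path to a PEM file containing the proxy's root CA certificate
        env["REQUESTS_CA_BUNDLE"] = ca_bundle
    return env


# Placeholder values; replace with the real proxy address and CA file
settings = proxy_environment("http://proxy_server:8080", "/path/to/company-ca.pem")

# Apply the settings before any download is started
os.environ.update(settings)
```

Note that this keeps certificate verification enabled and points it at the company CA rather than switching it off, so it only helps if the CA file is accepted.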


Any updates on this? I have the same issue in a similar working environment.
Run this before fetching:
$ git config --local http.sslVerify false