load_dataset(): how to skip "Starting new HTTPS connection (1): storage.googleapis.com:443"

Hi,

I’ve created a Datasets script that works only with local files. However, load_dataset() seems to start with

Starting new HTTPS connection (1): storage.googleapis.com:443

Currently, perhaps because of our VPN, I also get the message

WARNING:HF google storage unreachable. Downloading and preparing it from source

What is that connection for, and is it possible to disable it?

Kind regards,

Ramon.

Hi! The datasets library checks the HF Google storage for datasets that have already been processed, so that you don’t have to run the data processing over the raw files yourself, which saves you time (e.g. for the wikipedia dataset).

You can make the library work offline by setting the environment variable HF_DATASETS_OFFLINE=1
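If you would rather set the variable from Python than from the shell, a minimal sketch could look like the one below. It assumes the variable is read when the datasets library is imported, so it is set before the import. The helper name load_local and its script_path and data_dir arguments are hypothetical placeholders for your own loading script and local files.

```python
import os

# Set the flag before `datasets` is imported; as far as I know the library
# reads it at import time, so setting it afterwards may have no effect.
os.environ["HF_DATASETS_OFFLINE"] = "1"


def load_local(script_path, data_dir):
    """Load a dataset from a local loading script (hypothetical helper).

    `script_path` and `data_dir` stand in for your own script and data folder.
    """
    # Imported here so the environment variable above is already in place.
    from datasets import load_dataset

    return load_dataset(script_path, data_dir=data_dir)
```

Setting the variable in the shell for a single run (HF_DATASETS_OFFLINE=1 followed by your usual python command) should have the same effect without touching the code.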