Out of nowhere: requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out

Hi, I'm streaming the laion2b dataset using:

from datasets import load_dataset

self.dataset = load_dataset("laion/laion2b-en", streaming=True, split="train")

And I'm getting this error:

requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out.

That's not the interesting part. What's interesting is that it worked for two weeks straight, then out of nowhere the streaming stopped, and now I can't run at all (I get the error above).
My network manager says nothing changed in the configuration, proxy, or anything else. Did something change on the “datasets” package side?

The full trace is:

  File "/workspace/dir/dir_env/lib/python3.8/site-packages/datasets/load.py", line 1502, in load_dataset_builder
    dataset_module = dataset_module_factory(
  File "/workspace/dir/dir_env/lib/python3.8/site-packages/datasets/load.py", line 1219, in dataset_module_factory
    raise e1 from None
  File "/workspace/dir/dir_env/lib/python3.8/site-packages/datasets/load.py", line 1186, in dataset_module_factory
    raise e
  File "/workspace/dir/dir_env/lib/python3.8/site-packages/datasets/load.py", line 1160, in dataset_module_factory
    dataset_info = hf_api.dataset_info(
  File "/workspace/dir/dir_env/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
    return fn(*args, **kwargs)
  File "/workspace/dir/dir_env/lib/python3.8/site-packages/huggingface_hub/hf_api.py", line 1666, in dataset_info
    r = get_session().get(path, headers=headers, timeout=timeout, params=params)
  File "/workspace/dir/dir_env/lib/python3.8/site-packages/requests/sessions.py", line 600, in get
    return self.request("GET", url, **kwargs)
  File "/workspace/dir/dir_env/lib/python3.8/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/workspace/dir/dir_env/lib/python3.8/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/workspace/dir/dir_env/lib/python3.8/site-packages/requests/adapters.py", line 532, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=100.0)
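
In the meantime, wrapping the call in a retry loop with backoff keeps transient timeouts from killing the job. A minimal sketch (the load_dataset_with_retries helper here is my own, not part of datasets):

import time

from datasets import load_dataset
from requests.exceptions import ReadTimeout

def load_dataset_with_retries(*args, max_retries=5, base_delay=2.0, **kwargs):
    # Retry load_dataset on transient Hub read timeouts with exponential backoff.
    for attempt in range(max_retries):
        try:
            return load_dataset(*args, **kwargs)
        except ReadTimeout:
            if attempt == max_retries - 1:
                raise
            delay = base_delay * 2 ** attempt
            print(f"Read timed out, retrying in {delay:.0f}s ({attempt + 1}/{max_retries})")
            time.sleep(delay)

dataset = load_dataset_with_retries("laion/laion2b-en", streaming=True, split="train")

Of course, if the Hub is fully unreachable this only delays the failure, but it covers the intermittent case.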

I am facing the same issue with cerebras/SlimPajama-627B.
Python 3.9
datasets 2.12.0
huggingface 0.0.1
huggingface-hub 0.13.4

Facing the same issue with tiiuae/falcon-7b-instruct, working with huggingface 0.0.1 and boilerplate similar to what's mentioned above.

The models that I have already downloaded locally seem to work fine though.
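
For jobs that only need already-cached weights, you can skip the Hub calls entirely and load from the local cache. A minimal sketch (assuming the files were fully downloaded before the outage; older transformers versions may also need trust_remote_code=True for Falcon):

import os

# Resolve everything from the local cache and skip Hub HTTP calls.
# Must be set before transformers is imported.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoModelForCausalLM

# Passing local_files_only=True has the same effect on a per-call basis.
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b-instruct", local_files_only=True
)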

The Hub had a minor outage. Can you please try again and report if it works now?

Thanks for your help; unfortunately the problem still occurs.


transformers==4.30.2 works for me 🙂


We are seeing the same issue in some CI jobs that use Hugging Face models.

I have the same issue right now. I also have problems when I want to upload a file/folder/model/dataset. At other times the model and dataset load into the cache and cloning my Hub repo locally completes, but out of nowhere the job is killed and training does not start.

Is there a way to monitor when there is an outage, e.g. a webpage listing ongoing incidents, so that we don’t waste time looking for solutions?

You can check the status here: https://status.huggingface.co/
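
If you'd rather get pinged in your own logs than keep refreshing the page, a tiny poller that times a request against the Hub works too. A minimal sketch (the endpoint and the polling interval are arbitrary choices of mine):

import time

import requests

HUB_URL = "https://huggingface.co/api/datasets/laion/laion2b-en"

while True:
    start = time.monotonic()
    try:
        r = requests.get(HUB_URL, timeout=10)
        print(f"status={r.status_code} latency={time.monotonic() - start:.2f}s")
    except requests.exceptions.RequestException as e:
        print(f"Hub unreachable: {e}")
    time.sleep(60)  # poll once a minute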


This is useful, thank you @mariosasko!


Has anyone figured out how to resolve this issue when it appears? According to the Hugging Face status page, everything is currently operational.

I started having the problem today with a training job in SageMaker that worked last night. The only thing that has changed is the dataset I’m using for training, and that downloaded fine. It times out partway through downloading the safetensors files for meta-llama/Llama-2-7b-hf.

@mariosasko Any other insights or potential awareness of an outage that is not showing up on the status page yet? Thanks!

Using the following versions:

accelerate-0.21.0
bitsandbytes-0.40.2 
huggingface-hub-0.17.3 
optimum-1.13.2 
peft-0.4.0 
safetensors-0.3.3 
sagemaker-training-4.7.0 
transformers-4.33.3

Error Log:

ErrorMessage "TimeoutError: The read operation timed out
 
 During handling of the above exception, another exception occurred
 Traceback (most recent call last)
 File "/opt/conda/lib/python3.10/site-packages/requests/models.py", line 816, in generate
 yield from self.raw.stream(chunk_size, decode_content=True)
 File "/opt/conda/lib/python3.10/site-packages/urllib3/response.py", line 628, in stream
 data = self.read(amt=amt, decode_content=decode_content)
 File "/opt/conda/lib/python3.10/site-packages/urllib3/response.py", line 566, in read
 with self._error_catcher()
 File "/opt/conda/lib/python3.10/contextlib.py", line 153, in __exit__
 self.gen.throw(typ, value, traceback)
 File "/opt/conda/lib/python3.10/site-packages/urllib3/response.py", line 449, in _error_catcher
 raise ReadTimeoutError(self._pool, None, "Read timed out.")
 urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='cdn-lfs.huggingface.co', port=443): Read timed out.
 File "/opt/ml/code/run_clm.py", line 362, in <module>
 main()
 File "/opt/ml/code/run_clm.py", line 358, in main
 raise e
 File "/opt/ml/code/run_clm.py", line 347, in main
 training_function(args)
 File "/opt/ml/code/run_clm.py", line 229, in training_function
 model = AutoModelForCausalLM.from_pretrained(
 File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
 return model_class.from_pretrained(
 File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2869, in from_pretrained
 resolved_archive_file, sharded_metadata = get_checkpoint_shard_files(
 File "/opt/conda/lib/python3.10/site-packages/transformers/utils/hub.py", line 1040, in get_checkpoint_shard_files
 cached_filename = cached_file(
 File "/opt/conda/lib/python3.10/site-packages/transformers/utils/hub.py", line 429, in cached_file
 resolved_file = hf_hub_download(
 File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
 return fn(*args, **kwargs)
 File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1431, in hf_hub_download
 http_get(
 File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 551, in http_get
 for chunk in r.iter_content(chunk_size=10 * 1024 * 1024)
 File "/opt/conda/lib/python3.10/site-packages/requests/models.py", line 822, in generate
 raise ConnectionError(e)
 requests.exceptions.ConnectionError: HTTPSConnectionPool(host='cdn-lfs.huggingface.co', port=443): Read timed out.
 Downloading (…)of-00002.safetensors:  54%|█████▎    | 5.35G/9.98G [00:21<00:18, 252MB/s]"

One solution is to pass resume_download=True to from_pretrained (or wherever the error occurs) and then rerun the code. Suppose the download previously got to 10% progress; on rerun it will not start from the beginning, but will continue downloading from the 10% mark where it left off.
For example:

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5, resume_download=True)
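
Combining resume_download=True with a retry loop makes flaky connections much more tolerable, since every attempt continues from the bytes already on disk instead of restarting. A quick sketch (the retry policy here is my own choice, not library behavior):

import time

from requests.exceptions import ConnectionError as RequestsConnectionError, ReadTimeout
from transformers import AutoModelForSequenceClassification

for attempt in range(5):
    try:
        model = AutoModelForSequenceClassification.from_pretrained(
            "bert-base-cased", num_labels=5, resume_download=True
        )
        break
    except (RequestsConnectionError, ReadTimeout):
        if attempt == 4:
            raise
        # Partial files stay in the cache, so the next attempt resumes them.
        time.sleep(30 * (attempt + 1))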