Failed to resolve 'huggingface.co' when building HF Spaces

Several weeks ago my Space “SpeechDepression” and many of my other Spaces ran well, but recently a number of them stopped working. When rebuilding, they return “Runtime Error”. The details are below:

First, the Spaces often get stuck in this state:

===== Build Queued at 2024-03-21 01:46:42 / Commit SHA: 1d59a73 =====

--> FROM docker.io/library/python:3.10@sha256:f9307a98b4ca854bfeb342f7a9c8402557e869a190c4d78ae57157ae82ce8c0d
DONE 0.0s

The Spaces often get stuck in this state as well:

===== Application Startup at 2024-03-21 01:49:31 =====

"spaces" in pod "r-liusuthu-speechdepression-ihdmhxq2-d2c68-ksqx6" is waiting to start: ContainerCreating

Finally it returns a runtime error!

Runtime error
: Failed to resolve 'huggingface.co' ([Errno -3] Temporary failure in name resolution)"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/app/speech.py", line 28, in <module>
    client = Client("Liusuthu/TextDepression")
  File "/usr/local/lib/python3.10/site-packages/gradio_client/client.py", line 120, in __init__
    _src = self._space_name_to_src(src)
  File "/usr/local/lib/python3.10/site-packages/gradio_client/client.py", line 726, in _space_name_to_src
    return huggingface_hub.space_info(space, token=self.hf_token).host  # type: ignore
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 2345, in space_info
    r = get_session().get(path, headers=headers, timeout=timeout, params=params)
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 67, in send
    return super().send(request, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError('HTTPSConnectionPool(host=\'huggingface.co\', port=443): Max retries exceeded with url: /api/spaces/Liusuthu/TextDepression (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7f5c0ec6f850>: Failed to resolve \'huggingface.co\' ([Errno -3] Temporary failure in name resolution)"))'), '(Request ID: 81782cf9-9899-45c4-884a-733ac0618b73)')

By the way, I’m from mainland China, and I don’t know whether it’s because of our country’s Great Firewall. But everything really ran well several weeks ago. :cry:
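
One way to narrow this down: the traceback paths (`/home/user/app/...`) are inside the Space container, so the failing lookup most likely happens on the Space's side rather than on the network you browse from. A minimal sketch of a DNS check that could be run at the top of the app to confirm this, using only the standard library. The function name `can_resolve` and its `resolver` parameter are hypothetical, introduced here for illustration.

```python
import socket


def can_resolve(host, port=443, resolver=socket.getaddrinfo):
    """Return True if `host` resolves to at least one address, else False.

    `resolver` is injectable only so the check can be exercised without a
    network; by default it is the standard socket.getaddrinfo.
    """
    try:
        return len(resolver(host, port)) > 0
    except socket.gaierror as exc:
        # "[Errno -3] Temporary failure in name resolution" is raised as
        # socket.gaierror, so it ends up in this branch.
        print(f"DNS lookup for {host!r} failed: {exc}")
        return False
```

Calling `can_resolve("huggingface.co")` before the first Hub request would at least show in the Space logs whether name resolution is failing inside the container at startup.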

Same issue here. (I’m from Europe, so it’s not a firewall issue.)
Building a Space leads to a long waiting time, resulting in either an error or a Space whose UI works but whose buttons don’t load the models when clicking Submit, for example.

I think it is a bug on the huggingface.co Hub side, so perhaps we just need to wait a few hours for a fix from the Hugging Face staff.

Maybe.
Btw, can this discussion be seen by them? I have no idea how to get in touch with the developers :face_with_raised_eyebrow:

Hi @Liusuthu, @FashionStash, thanks for reporting. Have you tried restarting the Space(s)? Afterwards, do you continue to encounter the runtime error? Please let us know.

@michellehbn thanks for the reply!

Yes, of course I tried. Everything I’ve tried so far: restarting the Spaces, changing the Space hardware (basic and upgraded), changing the code files (to indirectly rebuild the Space), and using gradio deploy to deploy from my local device.

But unfortunately none of these worked.

Hi @Liusuthu, Thanks for trying and for the details. We just prodded a fix that should help. Please let us know if that resolves the issue or if you continue to run into an error. Thanks again!

Hi @michellehbn, thanks for your timely help!

Just now I tried restarting all my “Runtime Error” and “Sleeping” Spaces and, luckily, most of them run well now. But the waiting time still feels longer than before.

One of my Spaces, https://huggingface.co/spaces/Liusuthu/Copy-Facial-Expression-Recognition, is still not working. The error messages are below:

Error loading model: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /ElenaRyumina/face_emotion_recognition/resolve/main/FER_static_ResNet50_AffectNet.pt (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7f83e3b23520>: Failed to resolve 'huggingface.co' ([Errno -3] Temporary failure in name resolution)"))
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 531, in _check_seekable
    f.seek(f.tell())
AttributeError: 'NoneType' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/app/app.py", line 14, in <module>
    from app_utils import preprocess_image_and_predict, preprocess_video_and_predict, preprocess_video_and_rank
  File "/home/user/app/app_utils.py", line 17, in <module>
    from model import pth_model_static, pth_model_dynamic, cam, pth_processing
  File "/home/user/app/model.py", line 33, in <module>
    pth_model_static.load_state_dict(torch.load(path_static))
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 986, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 440, in _open_file_like
    return _open_buffer_reader(name_or_buffer)
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 425, in __init__
    _check_seekable(buffer)
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 534, in _check_seekable
    raise_err_msg(["seek", "tell"], e)
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 527, in raise_err_msg
    raise type(e)(msg)
AttributeError: 'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

I manually checked the model repo:
https://huggingface.co/ElenaRyumina/face_emotion_recognition/tree/main
Everything is still there (so it’s not because the files were deleted), so I think there are still some HTTPS connection problems. Do you have any idea about it?

Thanks again!! :smiley:
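
A note on reading the log above: the `AttributeError: 'NoneType' object has no attribute 'seek'` looks like a secondary symptom. The first line ("Error loading model: ... Failed to resolve 'huggingface.co'") suggests the download step failed and left `path_static` as `None`, and `torch.load(None)` then failed with the confusing message. A minimal sketch of a guard that makes this explicit, assuming the app's download code returns `None` on failure. `require_downloaded` is a hypothetical helper, not part of torch or huggingface_hub.

```python
def require_downloaded(path, what="model checkpoint"):
    """Fail fast with a readable message if a download step returned None.

    Assumption: the app's own download code swallows the network error and
    returns None, which is what the traceback above suggests.
    """
    if path is None:
        raise RuntimeError(
            f"{what} was not downloaded (the download step returned None); "
            "look at the earlier network error, not the torch.load traceback"
        )
    return path


# Hypothetical usage in model.py, based on the line shown in the traceback:
#   pth_model_static.load_state_dict(torch.load(require_downloaded(path_static)))
```

This does not fix the connection problem itself, but it points the log at the real cause.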

Hi @michellehbn

I have an update! Just now I opened my HF profile and was surprised to find that the Space
https://huggingface.co/spaces/Liusuthu/Copy-Facial-Expression-Recognition
has started working!! I didn’t restart it again, so I don’t know why it works now, haha.

But given my previous experience, I’m wondering whether any of my Spaces will fail unexpectedly again. What can I do to get a more stable Spaces environment?
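
One thing that might make a Space more tolerant of short outages like the `Temporary failure in name resolution` above is to retry the startup network calls a few times instead of crashing on the first failure. A minimal sketch with exponential backoff, using only the standard library. `retry_with_backoff` and its parameters are hypothetical names, not an API of gradio_client or huggingface_hub, and this only helps if the failure is transient.

```python
import time


def retry_with_backoff(fn, attempts=5, base_delay=2.0,
                       exceptions=(OSError,), sleep=time.sleep):
    """Call fn() until it succeeds or `attempts` tries have failed.

    Waits base_delay * 2**i seconds after the i-th failed attempt.
    OSError is the default because the requests exceptions seen in the
    traceback (e.g. requests.exceptions.ConnectionError) derive from it.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for i in range(attempts):
        try:
            return fn()
        except exceptions as exc:
            if i == attempts - 1:
                raise  # out of attempts: surface the original error
            delay = base_delay * (2 ** i)
            print(f"attempt {i + 1} failed ({exc!r}); retrying in {delay}s")
            sleep(delay)


# Hypothetical usage for the line from speech.py in the first traceback:
#   from gradio_client import Client
#   client = retry_with_backoff(lambda: Client("Liusuthu/TextDepression"))
```

With the defaults this waits 2, 4, 8 and 16 seconds between the five attempts, so a Space would spend up to about 30 seconds retrying before it gives up with the original error.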

Sorry, but today the Space
https://huggingface.co/spaces/Liusuthu/Copy-Facial-Expression-Recognition

went back to the “Runtime Error” state again… :cry:

Then I tried restarting it, and it was rebuilt successfully. I don’t know what happened… but this really bothers me a lot.