TimeoutError [10060]?

Dear @John6666 and other community members,

TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

During handling of the above exception, another exception occurred:

ConnectionError: (ProtocolError('Connection aborted.', TimeoutError(10060, 'A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond', None, 10060, None)), '(Request ID: 34f638f9-0518-48da-aae7-6dcd50daa757)')

I changed the token, reset the variables, and checked my internet connection. What happened?


What environment?

Miniconda. Does it have something to do with the environment?


Using Conda probably means you’re working in a local environment. Judging from the error message, it seems to be a Windows environment. I wonder if the connection from the local environment is not working properly.
Was it working properly until yesterday?

Last time I ran it, it ran into a CPU memory problem; then I ran it again and again and now I'm facing this. Should I clear some caches? Sorry for the misclick on the Solution button, I haven't slept.


Oh, if it disappears when you use the Solution function, I think you can just post it again. :sweat_smile:
If it suddenly stopped working, it's possible that the environment itself is no longer working properly…
However, if you're running it from Python, it's unlikely to be a browser setting; it's often due to VPN, SSL, router settings, or internal network settings. Do you have any ideas?

Edit:
Because this error has occurred for various reasons for a long time, it isn't possible to identify the cause or the solution from the error alone.
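If you want to narrow it down, a minimal connectivity check from Python might help: if this also times out, the problem is the network path itself rather than the HF libraries. (The URL is just the public Hub API, and the 10-second timeout is an arbitrary choice.)

import requests

# Quick reachability check for the Hugging Face Hub.
# A hang or timeout here points at the network/router, not at the HF libraries.
try:
    r = requests.get("https://huggingface.co/api/models?limit=1", timeout=10)
    print("Hub reachable, status:", r.status_code)
except requests.exceptions.RequestException as e:
    print("Connection problem:", e)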

Yes, that's the problem: when I tried to look for a solution, it went beyond just Hugging Face. Let me check some of these.

But it has been raining a lot since yesterday; I might need to go out for a better connection.

HF-related items.

I used a Cat 5 cable and the error became this:

CalledProcessError: Command '['git', 'pull']' returned non-zero exit status 128.

During handling of the above exception, another exception occurred:

OSError: fatal: refusing to merge unrelated histories


Well, you'd want a Cat 5e cable, but that's beside the point.
The fact that the symptoms change means the problem is somewhere on your side of the connection, router included. It doesn't seem like someone is blocking it. It can probably be resolved, but some kind of problem is lurking in your environment itself.

Is there a solution to that "refusing to merge unrelated histories" though? It also goes far beyond the Hugging Face scope.

Yeah, I think I'll try using the cable or try going out.


If you're trying to use git on Windows, it can be made to work, but you'll need to install these tools manually to make it practically usable. Let's install them.
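As for the "refusing to merge unrelated histories" message itself: if the two histories really should be combined, git has a flag for exactly that case. Use it deliberately, since it merges repositories that share no common commit:

git pull --allow-unrelated-histories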

I didn't use any git commands as far as I can see; is it inside the Hugging Face functions or something?

I would obviously know if I had used one, and I will check my git setup, but it doesn't seem so. Is it in the dataset fetching?

But if that's the case, then this code means nothing; this code would normally mean every variable is local now. I don't even know why I'm having internet problems.

import json

from datasets import load_dataset
from huggingface_hub import hf_hub_download

# Load the dataset from the Hugging Face Hub
hf_dataset_identifier = "seand0101/manggarai-watergate"
ds = load_dataset(hf_dataset_identifier)

# To configure the model, extract the number of unique labels
filename = "id2label.json"  # Filename containing the label information

# Download 'id2label.json' from the dataset repo and parse it
with open(hf_hub_download(repo_id=hf_dataset_identifier, filename=filename, repo_type="dataset"), "r") as f:
    id2label = json.load(f)

# Convert the keys in 'id2label' from strings to integers
id2label = {int(k): v for k, v in id2label.items()}

# Create a 'label2id' dictionary by reversing the key-value pairs in 'id2label'
label2id = {v: k for k, v in id2label.items()}

# Calculate the number of unique labels in the dataset
num_labels = len(id2label)

What? I think you’ve got a git error though.

For now, I think it would be a good idea to update the libraries for HF communications-related things.

pip install -U datasets huggingface_hub

Edit:
If you set this to a local path, it should work even without an internet connection, but it’s better to think about it after it’s working with an internet connection.

# load dataset from huggingface
hf_dataset_identifier = "seand0101/manggarai-watergate"
ds = load_dataset(hf_dataset_identifier)
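For example, a minimal sketch of the local-path idea, using the standard datasets save/load pair (the directory name here is just a placeholder):

from datasets import load_dataset, load_from_disk

# One time, while online: download and save a local copy
ds = load_dataset("seand0101/manggarai-watergate")
ds.save_to_disk("./manggarai-watergate-local")  # placeholder path

# Later, fully offline: load from the saved directory
ds = load_from_disk("./manggarai-watergate-local")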

Will try that, got it. Do you mind checking my dataset and .json too? There's probably something wrong with my first curated dataset ever.


From DatasetViewer, it looks like a normal image dataset, with pixel_values as the image and labels as the masked image. The authors of DatasetViewer and the datasets library are to some extent the same, so if it looks normal in DatasetViewer, it can generally be handled normally in the datasets library.
However, the way you handle it (how you write the code) will naturally change depending on the structure of the dataset.
Your dataset is a combination of images and images (an input image plus a mask image), so it's a little unusual; datasets containing text and images, just text, or just images are the majority. It should be usable, but you'll have to search to find out how to use it.
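For what it's worth, accessing such a dataset usually looks something like this. I'm assuming a "train" split and the pixel_values/labels columns that the Dataset Viewer shows:

from datasets import load_dataset

ds = load_dataset("seand0101/manggarai-watergate")
example = ds["train"][0]         # assumes a "train" split exists
image = example["pixel_values"]  # the input image (PIL)
mask = example["labels"]         # the mask image (PIL)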

I thought that, compared to that sidewalk dataset, they also include masking as their second class. How do I convert these masks into "labels" like they do? Noted, will check that out. Thanks again, John6666.


Oh, so it doesn't matter if I cached it first in a Jupyter Notebook? It will still look for the online repo for the variable when training?


How do I convert these masks into "labels" like they do?

I have no idea! A mask is a mask. It's normal to use a dataset that combines images and masks.
Can it be used to train an image segmentation model?
Basically, you need to specify what you want the AI to learn in some way; if the AI doesn't know what to learn, training is very inefficient. Of course, it is possible to automate the labeling process to some extent using AI: you can have a smarter, already-trained large model do the labeling, and then use that as a dataset to train smaller models. But that wouldn't be called a conversion; it would be a process.
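To make the "a mask is a mask" point concrete: in segmentation training, the mask image itself usually is the label, with each pixel value being a class id. A hedged sketch, again assuming the "train" split and "labels" column:

import numpy as np
from datasets import load_dataset

ds = load_dataset("seand0101/manggarai-watergate")
mask = np.array(ds["train"][0]["labels"])  # 2-D array of per-pixel class ids
print(mask.shape, np.unique(mask))         # which class ids appear in this mask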

Oh, so it doesn't matter if I cached it first in a Jupyter Notebook? It will still look for the online repo for the variable when training?

No, it's not that specific. We were talking about how inconvenient it would be if you couldn't use HF because of your local environment, so let's fix that first. :sweat_smile:
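For later reference, though: once everything is cached, both libraries can be told to use the local cache instead of the network via their standard offline environment variables (HF_HUB_OFFLINE for huggingface_hub, HF_DATASETS_OFFLINE for datasets):

import os

# Set these before importing datasets / huggingface_hub.
# They force both libraries to read from the local cache instead of the network.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["HF_DATASETS_OFFLINE"] = "1"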