I keep getting KeyError: 'safe' when loading my datasets

Everything was working well for months, and now I am suddenly getting this error every time I try to load my dataset:

from datasets import load_dataset
dataset = load_dataset("sebascorreia/sc09", split="test")

/usr/local/lib/python3.10/dist-packages/huggingface_hub/hf_api.py in __init__(self, **kwargs)
    636 if security is not None:
    637     security = BlobSecurityInfo(
--> 638         safe=security["safe"], av_scan=security["avScan"], pickle_import_scan=security["pickleImportScan"]
    639     )
    640 self.security = security

KeyError: 'safe'

My dataset is just a replica of the SC09 dataset used by the WaveGAN paper. I don't think I changed anything about it. I tried upgrading the datasets and huggingface_hub libraries, and I tried loading other datasets, but I still get the same error. Am I the one considered unsafe or something? I really need this fixed: I am finishing up the evaluation for my dissertation, and this is the last thing I needed right now. If anyone knows what is going on, please help!

I'm having the same error. Were you able to figure out the issue?

Not yet :frowning:

Same here. Couldn’t load datasets.

I too am getting the same error when loading my private dataset.

Me too. I’m getting the same error. Trying to load my public dataset.

same ;((((

Got the same issue

Me too. It worked yesterday.

Me too, please keep us updated!

Also discussed here:

I think there is an issue with the blob data HF is sending over, and that is what leads to this error.

The error comes from site-packages/huggingface_hub/hf_api.py:
L638 safe=security["safe"], av_scan=security["avScan"], pickle_import_scan=security["pickleImportScan"]

The security data is now a nested dict. Printing security gives:
{'hf': {'blobId': 'd8bdd5db36bc882309fab2fddecea23622de8fe9', 'name': 'README.md', 'safe': True, 'indexed': False, 'avScan': {'virusFound': False, 'virusNames': None}, 'pickleImportScan': None}}

Because of the nesting, security['safe'] is not directly accessible; the key now lives at security['hf']['safe']. I did a quick fix by setting security = security['hf'] before that line, and the code works. This is only a workaround; the proper fix should come from HF.
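A minimal sketch of that unwrapping as a standalone function, not as a patch to huggingface_hub. The name unwrap_security is hypothetical, and the assumption that the payload is either flat or nested under a single 'hf' key is based only on the dict printed above.

```python
from typing import Any, Dict, Optional


def unwrap_security(security: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return a flat security dict whether or not it is nested under 'hf'.

    Hypothetical helper: the two shapes handled here are assumptions
    drawn from the payload shown in this thread.
    """
    if security is None:
        return None
    # Flat shape: 'safe' is already a top-level key.
    if "safe" in security:
        return security
    # Nested shape seen in this thread: everything sits under 'hf'.
    nested = security.get("hf")
    if isinstance(nested, dict):
        return nested
    raise KeyError("unrecognized security payload shape")


# The payload reported above.
payload = {
    "hf": {
        "blobId": "d8bdd5db36bc882309fab2fddecea23622de8fe9",
        "name": "README.md",
        "safe": True,
        "indexed": False,
        "avScan": {"virusFound": False, "virusNames": None},
        "pickleImportScan": None,
    }
}

flat = unwrap_security(payload)
print(flat["safe"], flat["avScan"], flat["pickleImportScan"])
```

Unlike a bare security['hf'], this keeps working if the server goes back to sending the flat shape.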

Same here. I tried many different datasets but got the same error.

Hi,
where do you set this?

Same :pensive: This error was not present a few hours ago. Hugging Face, please take a look at it.

Hi, could you please explain how to set it? Thanks!

please show how you did it :pray:

You should find the huggingface_hub package in the site-packages directory of your virtual environment.
You can find its path via:

import huggingface_hub
print(huggingface_hub.__file__)

which prints something like:

venv/name_of_venv/lib/python3.10/site-packages/huggingface_hub/__init__.py

This is the path to the package's __init__.py; the directory that contains it is the package directory. In that directory, edit the hf_api.py file (around line 637).

        self.last_commit = last_commit
        security = kwargs.pop("security", None)
        if security is not None:
            # This is the hot-fix
            security = security['hf']
            security = BlobSecurityInfo(
                safe=security["safe"], av_scan=security["avScan"], pickle_import_scan=security["pickleImportScan"]
            )
        self.security = security
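To locate the file to edit without importing the package first, something like the following may help. It is a sketch: sibling_file is a hypothetical helper name, and it assumes huggingface_hub is installed as a regular package whose hf_api.py sits next to its __init__.py, as the path above suggests.

```python
import importlib.util
from pathlib import Path


def sibling_file(package: str, filename: str) -> Path:
    """Return the path of `filename` inside the directory of `package`.

    Hypothetical helper; it does not check that the file exists.
    """
    spec = importlib.util.find_spec(package)
    # spec.origin is the package's __init__.py for a regular package.
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"cannot locate package {package!r}")
    return Path(spec.origin).parent / filename


# Only print the path if the package is installed in this environment.
if importlib.util.find_spec("huggingface_hub") is not None:
    print(sibling_file("huggingface_hub", "hf_api.py"))
```

Keep in mind that edits made inside site-packages are overwritten the next time the package is upgraded or reinstalled.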

Great it works

Hey guys, try: pip install datasets==2.20.0
This will fix the problem.
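To confirm which versions are actually installed before and after the pin, a small check like this can be used. It is a sketch: installed_version is a hypothetical helper name, and it relies only on the standard library's importlib.metadata (Python 3.8+).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# The two libraries discussed in this thread.
for name in ("datasets", "huggingface_hub"):
    print(name, installed_version(name) or "not installed")
```

In a notebook, restart the runtime after installing so that the newly pinned version is the one that gets imported.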
