Trouble Loading a Gated Dataset For User with Granted Permission

I was instructed to repost this here from GitHub: Trouble Loading a Gated Dataset For User with Granted Permission · Issue #6441 · huggingface/datasets.

I have granted several users permission to access a private, gated Hugging Face dataset. The users accepted the invite, but when they try to load the dataset using their access token they get:
FileNotFoundError: Couldn't find a dataset script at ..... .
Also, when they click the URL for the dataset, they get a 404 error.

Steps to reproduce the bug

  1. Grant access to gated dataset for specific users
  2. Users accept invitation
  3. Users log in to the Hugging Face Hub using the CLI login
  4. Users run load_dataset
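
For reference, step 4 might look roughly like the sketch below. The helper name `load_gated_dataset`, the repository ID, and reading the token from an `HF_TOKEN` environment variable are placeholders for illustration, not the exact code the users ran. It assumes a `datasets` version whose `load_dataset` accepts a `token` argument (as 2.15.0 does).

```python
import os


def load_gated_dataset(repo_id, token=None):
    """Load a dataset from the Hub, passing an access token explicitly.

    `repo_id` is a placeholder such as "some-user/some-gated-dataset".
    If no token is given, fall back to the HF_TOKEN environment variable
    (an assumption made for this sketch); if that is also unset, `None`
    is passed and the library uses whatever credentials it finds itself.
    """
    # Imported lazily so the helper can be defined without `datasets` installed.
    from datasets import load_dataset

    if token is None:
        token = os.environ.get("HF_TOKEN")
    return load_dataset(repo_id, token=token)
```

A user would then call something like `load_gated_dataset("some-user/some-gated-dataset")`, where the repository ID is hypothetical.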

Expected behavior

Dataset is loaded normally for users who were granted access to the gated dataset.

Environment info

datasets==2.15.0

Any help would be greatly appreciated!

OK, thanks for the details. The information we were missing in the GitHub issue (#6441 in huggingface/datasets) is that the dataset is private.

As far as I understand, you cannot invite other users to access your private datasets.

To share a private dataset with specific users, you can create an organization, transfer the dataset to that organization, and add the users as members of it.

cc @sbrandeis for visibility

Ah, OK, then I misunderstood how gated access works for datasets. So to make this work, I would need to make the dataset public and then enable gated access. The public could then see that the dataset exists, but only users who were granted access could actually load it. Is that correct?

Thank you for your other suggestions!

Kind regards,
Evan

Yes, indeed.

For public gated datasets, users only see that the dataset exists, and can look at the README. They need to accept the conditions of the gate (or have been granted access, if access is manual, as you want to do) to be able to see the dataset viewer (table), to see the list of files, and to access/download the dataset.
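
To tell the two situations in this thread apart, something like the following sketch could help. It is only an illustration: the function name and the returned strings are made up here, and it assumes `huggingface_hub` is installed and that the returned info object exposes `private` and `gated` attributes (read defensively with `getattr`). A private dataset that the token cannot see is reported by the Hub as not found, which matches the 404 described above.

```python
def describe_dataset_visibility(repo_id, token=None):
    """Return a short, human-readable guess at how a dataset is exposed.

    `repo_id` is a placeholder such as "some-user/some-dataset".
    """
    # Imported lazily so the function can be defined without the library.
    from huggingface_hub import HfApi
    from huggingface_hub.utils import RepositoryNotFoundError

    try:
        info = HfApi().dataset_info(repo_id, token=token)
    except RepositoryNotFoundError:
        # The Hub answers 404 both for missing repos and for private
        # repos that this token is not allowed to see.
        return "not found, or private and not visible to this token"

    if getattr(info, "private", False):
        return "private, visible to this token"

    gated = getattr(info, "gated", False)
    if gated:
        # `gated` is expected to name the approval mode, e.g. "auto" or "manual".
        return f"public and gated ({gated})"
    return "public"
```

Note that seeing "public and gated" only confirms that the dataset page is visible; whether the files can actually be downloaded still depends on the user having accepted the gate or been granted access.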

More on gated datasets: see the "Gated datasets" page in the Hugging Face Hub documentation.
