Colab cannot find HuggingFace dataset

When I run the following code to load a dataset from the Hugging Face Hub in Google Colab, I get an error:

! pip install transformers datasets
from datasets import load_dataset
cv_13 = load_dataset("mozilla-foundation/common_voice_13_0", "en", split="train")
<ipython-input-9-4d772f75be89> in <cell line: 3>()
      1 from datasets import load_dataset
      2 
----> 3 cv_13 = load_dataset("mozilla-foundation/common_voice_13_0", "en", split="train")

2 frames
/usr/local/lib/python3.10/dist-packages/datasets/load.py in dataset_module_factory(path, revision, download_config, download_mode, dynamic_modules_path, data_dir, data_files, **download_kwargs)
   1505                     raise e1 from None
   1506                 if isinstance(e1, FileNotFoundError):
-> 1507                     raise FileNotFoundError(
   1508                         f"Couldn't find a dataset script at {relative_to_absolute_path(combined_path)} or any data file in the same directory. "
   1509                         f"Couldn't find '{path}' on the Hugging Face Hub either: {type(e1).__name__}: {e1}"

FileNotFoundError: Couldn't find a dataset script at /content/mozilla-foundation/common_voice_13_0/common_voice_13_0.py or any data file in the same directory. Couldn't find 'mozilla-foundation/common_voice_13_0' on the Hugging Face Hub either: FileNotFoundError: Dataset 'mozilla-foundation/common_voice_13_0' doesn't exist on the Hub. If the repo is private or gated, make sure to log in with `huggingface-cli login`.

The dataset exists on the Hugging Face Hub and loads successfully in my local JupyterLab. What should I do?

Which version of datasets are you using?

cc @lhoestq just in case


The Common Voice dataset is a gated dataset, so you need to log in to access it.

Can you try logging in with huggingface-cli login, or passing an HF token via load_dataset(..., token=...)?
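For reference, here is a minimal sketch of the token route in a notebook. It assumes the access token is stored in an environment variable named HF_TOKEN and that the dataset's access conditions have already been accepted on its Hub page. The helper names resolve_hf_token and load_gated_common_voice are made up for illustration and are not part of datasets.

```python
import os
from typing import Optional


def resolve_hf_token(env_var: str = "HF_TOKEN") -> Optional[str]:
    """Return the token stored in the environment, or None if it is unset or blank."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def load_gated_common_voice(token: Optional[str] = None):
    """Load the English train split of Common Voice 13, passing a token explicitly."""
    # Imported lazily so resolve_hf_token works even where datasets is not installed.
    from datasets import load_dataset

    return load_dataset(
        "mozilla-foundation/common_voice_13_0",
        "en",
        split="train",
        # If this ends up as None, datasets falls back to the token saved by a
        # previous login, if there is one.
        token=token or resolve_hf_token(),
    )
```

Calling cv_13 = load_gated_common_voice() should then behave like the original snippet, just authenticated. The token= argument needs a reasonably recent datasets release; older versions used use_auth_token= instead. If you would rather log in interactively inside Colab, huggingface_hub.notebook_login() is the notebook counterpart of huggingface-cli login.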


I logged in using huggingface-cli login and the dataset is currently being downloaded. Thank you!
My datasets version is 2.15.0 (datasets-2.15.0-py3-none-any.whl).