Hi!
I have the exact same problem with datasets==2.4.0.
The output of `ls ~/.cache/huggingface/datasets` is:
downloads
_home_zramzi_.cache_huggingface_datasets_huggan___parquet_huggan--pokemon-fd0f3e14764c2001_0.0.0_2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec.incomplete.lock
_home_zramzi_.cache_huggingface_datasets_huggan___parquet_huggan--pokemon-fd0f3e14764c2001_0.0.0_2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec.lock
huggan___parquet
But then, when I call `load_dataset` in offline mode (set here with the Jupyter notebook `%env` magic, for example):
%env HF_DATASETS_OFFLINE=1
from datasets import load_dataset
dataset = load_dataset(
    "huggan/pokemon",
    None,
    cache_dir=None,
    use_auth_token=None,
    split="train",
)
I get the error:
ConnectionError Traceback (most recent call last)
Cell In [4], line 1
----> 1 dataset = load_dataset(
2 "huggan/pokemon",
3 None,
4 cache_dir=None,
5 use_auth_token=None,
6 split="train",
7 )
File ~/workspace/diffusion-function-measures/venv/lib/python3.9/site-packages/datasets/load.py:1723, in load_dataset(path, name, data_dir, data_files, split, cache_dir, features, download_config, download_mode, ignore_verifications, keep_in_memory, save_infos, revision, use_auth_token, task, streaming, **config_kwargs)
1720 ignore_verifications = ignore_verifications or save_infos
1722 # Create a dataset builder
-> 1723 builder_instance = load_dataset_builder(
1724 path=path,
1725 name=name,
1726 data_dir=data_dir,
1727 data_files=data_files,
1728 cache_dir=cache_dir,
1729 features=features,
1730 download_config=download_config,
1731 download_mode=download_mode,
1732 revision=revision,
1733 use_auth_token=use_auth_token,
...
1244 f"Couldn't find a dataset script at {relative_to_absolute_path(combined_path)} or any data file in the same directory. "
1245 f"Couldn't find '{path}' on the Hugging Face Hub either: {type(e1).__name__}: {e1}"
1246 ) from None
ConnectionError: Couln't reach the Hugging Face Hub for dataset 'huggan/pokemon': Offline mode is enabled.
EDIT: Upon further research, I think this is linked to the GitHub issue "Datasets created with `push_to_hub` can't be accessed in offline mode" (Issue #3547, huggingface/datasets).
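In case it helps anyone compare their cache with the `ls` output above, here is a minimal sketch that groups the entries of a cache directory by kind (leftover `.incomplete.lock` files, regular `.lock` files, and directories). The function name `summarize_cache` is my own and is not part of `datasets`. It only looks at file names, so it cannot tell you whether a dataset is actually loadable offline.

```python
from pathlib import Path


def summarize_cache(cache_dir):
    """Group the entries of a cache directory by kind, based on their names.

    Hypothetical helper, not part of the `datasets` library.
    """
    summary = {"incomplete_locks": [], "locks": [], "directories": [], "other": []}
    for entry in sorted(Path(cache_dir).expanduser().iterdir()):
        # Check the longer suffix first, since it also ends with ".lock".
        if entry.name.endswith(".incomplete.lock"):
            summary["incomplete_locks"].append(entry.name)
        elif entry.name.endswith(".lock"):
            summary["locks"].append(entry.name)
        elif entry.is_dir():
            summary["directories"].append(entry.name)
        else:
            summary["other"].append(entry.name)
    return summary


if __name__ == "__main__":
    # Default cache location from the `ls` command above; adjust if yours differs.
    default_cache = Path("~/.cache/huggingface/datasets").expanduser()
    if default_cache.is_dir():
        for kind, names in summarize_cache(default_cache).items():
            print(f"{kind}: {len(names)}")
            for name in names:
                print(f"  {name}")
```

On my cache this would list `downloads` and `huggan___parquet` as directories, plus one lock file of each kind.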