Dataset loading script test fails

Python: 3.9.7
Datasets: 2.1.0

I’m trying to run the tests for my dataset loading script by following the instructions in the “Create a dataset loading script” guide. When I run the following test:

RUN_SLOW=1 pytest tests/test_dataset_common.py::LocalDatasetTest::test_load_real_dataset_proto_data

I get the following error:

============================================================================================ test session starts ============================================================================================
platform linux -- Python 3.9.7, pytest-7.1.1, pluggy-1.0.0
rootdir: /home/aclifton/datasets
plugins: anyio-3.5.0
collected 0 items / 1 error                                                                                                                                                                                 

================================================================================================== ERRORS ===================================================================================================
_______________________________________________________________________________ ERROR collecting tests/test_dataset_common.py _______________________________________________________________________________
ImportError while importing test module '/home/aclifton/datasets/tests/test_dataset_common.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../anaconda3/envs/rffp/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_dataset_common.py:27: in <module>
    from datasets.download.download_config import DownloadConfig
E   ModuleNotFoundError: No module named 'datasets.download'
========================================================================================== short test summary info ==========================================================================================
ERROR tests/test_dataset_common.py
============================================================================================= 1 error in 0.09s ==============================================================================================
ERROR: not found: /home/aclifton/datasets/tests/test_dataset_common.py::LocalDatasetTest::test_load_real_dataset_proto_data
(no name '/home/aclifton/datasets/tests/test_dataset_common.py::LocalDatasetTest::test_load_real_dataset_proto_data' in any of [<Module test_dataset_common.py>])

This happens even though datasets is installed. Does anyone have suggestions about what I might be forgetting or doing wrong? Thanks in advance for your help!
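
One way to narrow this down is to check whether the installed package actually ships the submodule that the test file imports. The snippet below is only a minimal sketch using the standard library; the helper name `module_available` is made up for illustration. If `datasets` is found but `datasets.download` is not, one plausible explanation (an assumption, not something confirmed in this thread) is that the tests in the checked-out repository are newer than the installed 2.1.0 release.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located in the current environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. `datasets`) is itself missing.
        return False


# `datasets.download` is the module named in the traceback above.
for name in ("datasets", "datasets.download"):
    status = "found" if module_available(name) else "missing"
    print(f"{name}: {status}")
```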

Hi! You don’t need to use pytest anymore; the datasets-cli test command is enough :wink:

We just updated the documentation to remove the part that mentions pytest.
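
For reference, here is a rough sketch of how that command could be driven from Python. The helper `build_test_command` and the path `/path/to/my_dataset` are placeholders invented for this example; `--save_infos` and `--all_configs` are options of `datasets-cli test` in the 2.x releases, but it is worth confirming them with `datasets-cli test --help` for your installed version.

```python
import os
import shutil
import subprocess


def build_test_command(dataset_path, save_infos=True, all_configs=True):
    """Assemble the argument list for `datasets-cli test` (hypothetical helper)."""
    cmd = ["datasets-cli", "test", dataset_path]
    if save_infos:
        cmd.append("--save_infos")   # write the dataset metadata file
    if all_configs:
        cmd.append("--all_configs")  # test every configuration of the script
    return cmd


dataset_path = "/path/to/my_dataset"  # placeholder: folder containing the loading script
cmd = build_test_command(dataset_path)
print(" ".join(cmd))

# Only run the command when the CLI is installed and the path really exists.
if shutil.which("datasets-cli") is not None and os.path.exists(dataset_path):
    subprocess.run(cmd, check=True)
```

Running the printed command directly in a shell is equivalent; the Python wrapper is only convenient if the check should be part of a larger script.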

@lhoestq awesome, thank you for your response! Just to be clear: after creating the dataset metadata and the dummy data with the datasets-cli commands, one can load the dataset from a local path using

from datasets import load_dataset
load_dataset("/path/to/my_dataset")

I am in a position where I am not allowed to upload data to a hosted hub or server.

I checked whether the above works, and it does. Thanks again @lhoestq for the guidance!!