Specifying download directory for custom dataset loading script

Hi,

By default, the download directory is set to ~/.cache/huggingface/datasets/downloads. To change the location, either set the HF_DATASETS_DOWNLOADED_DATASETS_PATH env variable to a different value or modify the DOWNLOADED_DATASETS_PATH variable:

import datasets
from pathlib import Path
# target_path is a placeholder for the directory where downloads should go
datasets.config.DOWNLOADED_DATASETS_PATH = Path(target_path)
datasets.load_dataset(...)
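
For the env variable route, here is a minimal sketch. The directory name `hf_downloads` and the helper `load` are made up for illustration. As far as I know, the library reads this variable when `datasets` is first imported, so it should be set before the import:

```python
import os
from pathlib import Path

# Hypothetical target directory; replace with your own location.
target_path = Path.home() / "hf_downloads"

# Set the variable before `datasets` is imported, since the library's
# config is assumed to read it at import time.
os.environ["HF_DATASETS_DOWNLOADED_DATASETS_PATH"] = str(target_path)


def load(*args, **kwargs):
    """Hypothetical helper: import `datasets` only after the env variable is set."""
    import datasets

    return datasets.load_dataset(*args, **kwargs)
```

Setting the variable in your shell before starting Python should have the same effect.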

However, in most cases you want to change not just the download directory but the entire cache directory, so that the old cache is ignored. You can do this with the cache_dir argument of load_dataset, or by setting the HF_DATASETS_CACHE env variable or datasets.config.HF_DATASETS_CACHE.
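
A short sketch of the cache_dir approach, assuming a loading script path and a cache location of your choosing. The helper name `load_with_custom_cache` and the paths in the usage line are hypothetical:

```python
from pathlib import Path


def load_with_custom_cache(path, cache_dir, **kwargs):
    """Hypothetical helper: load a dataset using `cache_dir` as the cache root."""
    # Imported inside the function so the helper can be defined
    # even before the library is installed.
    from datasets import load_dataset

    # Create the directory up front so a typo in the path is easy to spot.
    Path(cache_dir).mkdir(parents=True, exist_ok=True)
    return load_dataset(path, cache_dir=str(cache_dir), **kwargs)
```

Usage would look something like `load_with_custom_cache("my_dataset_script.py", "/data/hf_cache")`, where both arguments are placeholders.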
