How to download files stored in the dataset script’s repo?

Let’s say I want to upload a private dataset to the Hugging Face Hub, under patrickvonplaten/dataset_new/. The dataset contains both files and a dataset_new.py loading script.
So I want to load the dataset with:

load_dataset("patrickvonplaten/dataset_new", use_auth_token=True)

The dataset repo will contain some metadata files that are needed to split the actual data, which is downloaded from an external link. Let’s call one such file patrickvonplaten/dataset_new/splits.txt

This means in the dataset script dataset_new.py, first I load & extract the data from the external link:

archive_path = dl_manager.download_and_extract("<external_link>")

Now I also need the splits.txt file - how do I load it?
I can’t just do:

splits_path = dl_manager.download("https://huggingface.co/datasets/patrickvonplaten/dataset_new/raw/main/splits.txt")

since it’s a private dataset and also it’s probably not the cleanest way of loading data.

Any ideas on what should be used here? @lhoestq @mariosasko @albertvillanova

Replying on behalf of @lhoestq:

It’s actually as simple as just writing:

splits_path = dl_manager.download("splits.txt")

Because the path is relative, the download manager resolves it against the dataset repo’s folder and downloads the splits file from there.
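To make the two calls concrete, here is a minimal sketch of how they might sit together in the loading script. The function name `fetch_dataset_files` and the `StubDownloadManager` class are hypothetical and not part of `datasets`; the stub only records requests so the snippet runs without network access. `"<external_link>"` is the placeholder from the question, not a real URL.

```python
# Hypothetical helper: in a real script, this logic would live inside the
# builder's `_split_generators(self, dl_manager)` method.
EXTERNAL_URL = "<external_link>"  # placeholder taken from the question


def fetch_dataset_files(dl_manager):
    """Return local paths for the external archive and the repo's splits.txt."""
    # Absolute URL: downloaded and extracted as usual.
    archive_path = dl_manager.download_and_extract(EXTERNAL_URL)
    # Relative path: per the reply above, resolved against the dataset repo.
    splits_path = dl_manager.download("splits.txt")
    return archive_path, splits_path


class StubDownloadManager:
    """Local stand-in (an assumption, not the real download manager).

    It records what was requested and returns fake cache paths.
    """

    def __init__(self):
        self.requests = []

    def download(self, url_or_path):
        self.requests.append(url_or_path)
        return "cached::" + url_or_path

    def download_and_extract(self, url_or_path):
        self.requests.append(url_or_path)
        return "extracted::" + url_or_path


stub = StubDownloadManager()
archive_path, splits_path = fetch_dataset_files(stub)
print(archive_path, splits_path)
```

In the actual script, the `dl_manager` passed in by `datasets` would take the place of the stub, and the returned paths would be handed on to the split generators.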
