I am trying to add a Hugging Face dataset that combines multiple data sources.
I have dataset files in ./data/ that look like the following:
    ./data/dataset_A/
    ./data/datasetA/datasetA_train.json
    ./data/datasetA/datasetA_test.json
    ./data/datasetA/datasetA.tar.gz
    ./data/dataset_B/
    ./data/dataset_B/datasetB_train.json
    ./data/dataset_B/datasetB_test.json
    ./data/dataset_B/datasetB.tar.gz
All the files are stored using git-lfs. If I first run git lfs pull --include ./data/*/*.json and git lfs pull --include ./data/*/*.tar.gz, then DownloadManager.download('data/datasetA/datasetA_train.json') works.
What if I have not used git-lfs to pull them locally though? Can I use the DownloadManager to load each of these files?
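To make the "not pulled" case concrete: as far as I understand, a file that has not been fetched with git lfs pull exists on disk only as a small Git LFS pointer stub whose first line is "version https://git-lfs.github.com/spec/v1". Here is a minimal sketch for checking which files under ./data/ are still stubs. The helper names is_lfs_pointer and find_unpulled are my own (hypothetical) and are not part of the datasets library.

```python
from pathlib import Path

# First line of every Git LFS pointer file (per the Git LFS pointer spec).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if `path` looks like an un-pulled Git LFS pointer stub."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_unpulled(data_dir="data"):
    """List files under `data_dir` that are still LFS pointer stubs.

    Hypothetical helper: `data_dir` defaults to the ./data/ layout above.
    """
    return sorted(
        str(p)
        for p in Path(data_dir).rglob("*")
        if p.is_file() and is_lfs_pointer(p)
    )
```

Calling find_unpulled("data") before and after git lfs pull should show which of the JSON and tar.gz files are real content and which are only pointers.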