Huggingface_hub list_datasets call

Hi everyone,

I’m working with the huggingface_hub client library, which works really smoothly!

I’m posting because I noticed that when I call list_datasets with full=True, the siblings field (which should hold the names of the repository files) is always None.
However, when I call list_repo_files with repo_type set to “dataset”, I can retrieve all the files in the repository.
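
For anyone who wants to check this on their own installation, here is a minimal sketch. The helper names ids_without_siblings and check_hub are made up for illustration, and I am assuming that the installed version of list_datasets accepts a limit argument and that the returned objects expose an id attribute.

```python
from typing import Any, Iterable, List


def ids_without_siblings(infos: Iterable[Any]) -> List[str]:
    """Return the ids of repo-info objects whose `siblings` attribute is None or missing."""
    return [info.id for info in infos if getattr(info, "siblings", None) is None]


def check_hub(limit: int = 5) -> List[str]:
    """Query the Hub (network access required) and report datasets without siblings."""
    # Imported lazily so the helper above can be used without the library installed.
    from huggingface_hub import list_datasets

    # Assumption: `limit` is supported by the installed huggingface_hub version.
    return ids_without_siblings(list_datasets(full=True, limit=limit))
```

Calling check_hub() lists the ids of the datasets, among the first few returned, for which siblings came back empty. The result may differ between library versions.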

The entries of the siblings attribute are of class ModelFile. Is a DatasetFile implementation in progress, or is there another way to retrieve the filenames of a dataset repository with the list_datasets call?

Thank you in advance

Hello there,

The list_datasets function searches across all datasets on the Hub and returns those matching a given filter, whereas list_repo_files iterates over the files of a single repository, which is why it gives you the siblings. Since they serve different purposes at different scopes, I’d suggest using list_repo_files to list the siblings of a given repository.
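
A minimal sketch of combining the two calls, in case it helps. The helper names files_by_repo and dataset_files_from_hub are hypothetical, and the limit argument to list_datasets is assumed to be available in your installed version.

```python
from typing import Callable, Dict, Iterable, List


def files_by_repo(
    repo_ids: Iterable[str],
    list_files: Callable[[str], Iterable[str]],
) -> Dict[str, List[str]]:
    """Map each repo id to the file names returned by `list_files` for it."""
    return {repo_id: list(list_files(repo_id)) for repo_id in repo_ids}


def dataset_files_from_hub(limit: int = 3) -> Dict[str, List[str]]:
    """List the files of the first `limit` datasets (network access required)."""
    # Imported lazily so files_by_repo can be used without the library installed.
    from huggingface_hub import list_datasets, list_repo_files

    # Assumption: `limit` is supported by the installed huggingface_hub version.
    repo_ids = [info.id for info in list_datasets(limit=limit)]
    return files_by_repo(
        repo_ids,
        lambda repo_id: list_repo_files(repo_id, repo_type="dataset"),
    )
```

Note that this makes one extra request per dataset, so it is best kept to a small, filtered set of repositories.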

Hi @merve,

I understand the idea here, but what I was wondering is why list_datasets returns a siblings field at all if it never contains the list of files.