I am using the list_models function from the Hub client library. I have noticed that it now returns a generator where it formerly returned a list, and that pagination will be added in version 0.14.
Therefore, I cast the output to a list as recommended (list(iter(list_models))). However, sometimes it returns just 10k repos and sometimes it returns the whole list of model repos. If I use the sort and direction parameters, it always returns 10k repos; if I don’t use them, it returns the full list of repos.
Hi Adem, thank you for reporting this issue. We have indeed implemented pagination, as more and more models are being uploaded to the Hub. I tried to investigate your issue but couldn’t reproduce the error. Could you please provide more information about:
the version of huggingface_hub you are using (run huggingface-cli env in your terminal and copy-paste the output)
the exact commands that are causing you to retrieve only 10k models
Thank you very much in advance.
I made a quick demo Space here to check if I get the same issues as you but it doesn’t seem to be the case:
Hi @Wauplin, the exact command is: hub_models = list(iter(api.list_models(full=True, cardData=True, fetch_config=True, sort="lastModified", direction=-1, use_auth_token=access_token))), and it then retrieves just 10k models. But if I remove the sort and direction parameters, it retrieves all the models (133544).
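For anyone who wants to check whether they hit the same truncation, here is a minimal sketch that counts the results with and without the sort and direction parameters. The helper names count_models and compare_sorted_vs_unsorted are hypothetical (they are not part of huggingface_hub); the only library call assumed is list_models with the sort and direction arguments used in the command above. Note that iterating over every model on the Hub can take a long time.

```python
def count_models(api, **list_kwargs):
    """Consume the iterable returned by api.list_models and count its items.

    `api` is expected to be a huggingface_hub.HfApi instance (or any object
    exposing a compatible list_models method). Hypothetical helper.
    """
    return sum(1 for _ in api.list_models(**list_kwargs))


def compare_sorted_vs_unsorted(api, sort="lastModified", direction=-1):
    """Count models without and with sort/direction and flag a shortfall.

    Hypothetical helper: `truncated` is True when the sorted query returns
    fewer items than the unsorted one.
    """
    unsorted_total = count_models(api)
    sorted_total = count_models(api, sort=sort, direction=direction)
    return {
        "unsorted": unsorted_total,
        "sorted": sorted_total,
        "truncated": sorted_total < unsorted_total,
    }


if __name__ == "__main__":
    # Imported here so the helpers above can be reused without the library.
    from huggingface_hub import HfApi

    print(compare_sorted_vs_unsorted(HfApi()))
```

Counting with a generator expression avoids holding every model entry in memory, which is enough if only the totals are being compared.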
@ademait Thank you for your feedback. I am able to reproduce your issue and it’s indeed a bug. I opened an internal ticket (internal link) to get it fixed server-side. I’ll keep you posted if I have any updates.
Hi @ademait. I’m getting back to you about this issue. It is now fixed server-side. If you try again with the same code, the results should no longer be truncated to 10k models. Please let me know if you are having any further issues.