Where is the piece of code that computes dataset_info for each dataset uploaded to the hub?

As you may have noticed, each dataset uploaded to the Hugging Face Hub is parsed automatically, and the relevant information is available at a URL like this:

https://datasets-server.huggingface.co/info?dataset=lhoestq/demo1
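For anyone who wants to look at that output from a script, here is a minimal sketch using only the Python standard library. The helper names `build_info_url` and `fetch_info` are made up for illustration, and the sketch makes no assumption about the layout of the returned JSON; it only decodes whatever the endpoint sends back.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the URL above.
BASE_URL = "https://datasets-server.huggingface.co/info"


def build_info_url(dataset: str) -> str:
    """Build the /info URL for a dataset id such as 'lhoestq/demo1'.

    Hypothetical helper: keeps the '/' in the dataset id unescaped so the
    result matches the URL form shown above.
    """
    query = urllib.parse.urlencode({"dataset": dataset}, safe="/")
    return f"{BASE_URL}?{query}"


def fetch_info(dataset: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the JSON served for the dataset (needs network access)."""
    with urllib.request.urlopen(build_info_url(dataset), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_info("lhoestq/demo1")` and printing the top-level keys of the result is a quick way to see what the server exposes for a given dataset.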

I’m interested in understanding how this parsing is performed, but I don’t know where to look for the source code.

It doesn’t seem to be in the datasets library. Maybe it’s in the huggingface_hub library? Or maybe it’s not open source?

Any pointer would be appreciated :slight_smile:


I don’t know much about datasets, but there are a few projects on GitHub that seem to be related.
Not all of the services running on the HF servers are open source, though. I guess that’s just how it has to be for security reasons.

Thanks for the reply.
It looks like it is somewhere in dataset-viewer indeed:


It’s in the public part…