Searching by type, and recognizing the architecture or pretrained base model a given model was derived from

Two related questions:

  1. Can one know the type of a model in search results?
  2. Can one know the type of a specific model?
    I assume this applies both to the site and to the API.
    Assuming one wants to find variants of a model, am I correct that the only way to do it through search is by string match? (So if you want bert-base, you need to filter everything that has "bert" in it: both roberta, which is irrelevant, and "berts", which are relevant.)

As string matching is not a great way to go, I have a follow-up question.
Suppose one wishes to extract the model architecture and be sure it is the right one. One can load the original model, then the other model, and check that they have exactly the same parameter shapes (e.g. in PyTorch via model.named_parameters()).
Is there a better way to do any of the two?
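
The shape-comparison idea above could be sketched like this (a minimal check using PyTorch's named_parameters(); the helper name and the tiny stand-in models are mine, for illustration only):

```python
import torch.nn as nn

def same_architecture(model_a, model_b):
    """Check that two models expose identical parameter names and shapes."""
    shapes_a = {name: tuple(p.shape) for name, p in model_a.named_parameters()}
    shapes_b = {name: tuple(p.shape) for name, p in model_b.named_parameters()}
    return shapes_a == shapes_b

# Tiny stand-ins: two instances of the same architecture vs. a different one
a = nn.Linear(4, 2)
b = nn.Linear(4, 2)
c = nn.Linear(4, 3)
print(same_architecture(a, b))  # True
print(same_architecture(a, c))  # False
```

Note this compares shapes only, so it would not distinguish two architectures that happen to have identically shaped parameters under the same names.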

Hi borgr! Are you looking to do this programmatically or using the Hub website?

Well, I will do it programmatically, but it makes a lot of sense on the website too for others, doesn't it? (I might not be interested in PPO models for text classification, or in OPT-175B to run on my mobile.)

I can’t personally speak for the website search box, but you can use huggingface_hub to filter models by architecture:

from huggingface_hub import HfApi

api = HfApi()
# fetch_config=True is needed to populate m.config; full=True alone does not
models = api.list_models(full=True, fetch_config=True, limit=10)
print([m.config['model_type'] for m in models])
# ['bert', 'bert', 'distilbert', 'gpt2', 'distilbert', 'xlm-roberta', 'roberta', 'gpt2', 'bert', 'bert']
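
Building on that, client-side filtering by model_type could look like this (a sketch over plain config dicts, since not every model on the Hub has a config; the helper name is mine):

```python
def filter_by_model_type(configs, wanted):
    """Keep only entries whose config declares the wanted model_type."""
    return [c for c in configs if c is not None and c.get("model_type") == wanted]

# Illustrative configs, including a model with no config at all
configs = [
    {"model_type": "bert"},
    {"model_type": "roberta"},
    None,
    {"model_type": "bert"},
]
print(len(filter_by_model_type(configs, "bert")))  # 2
```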

Does this fit your use case?

It's unintuitive that full=True is not enough to bring the config.

Anyway, it might help (not that T5-11B and T5-small should be in the same category for any user…), although it seems this is not a consistent trait: only about half of the models (38K out of 58K) even have a config. (Mostly, if they have a config, model_type is in there; there are only about 200 exceptions.)

So, to pick a certain architecture (comparable models requiring the same infrastructure), the right way is still to ignore about a third of the models, and then load each one (heavy) and look at the parameter sizes?

This is because it would be wasteful to fetch the config if you don't need it, especially when not limiting the number of results and fetching ~60K models. I'm not sure about the second part of the question, but you could load the config.json and get the number of layers, the dimensionality of the embeddings, etc.
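
For instance, once a model's config.json has been downloaded (e.g. with huggingface_hub's hf_hub_download), reading those fields is straightforward. A sketch with illustrative, made-up values for a BERT-style config:

```python
import json

# Illustrative config.json contents for a BERT-style model (values made up)
config_text = '{"model_type": "bert", "num_hidden_layers": 12, "hidden_size": 768}'
config = json.loads(config_text)

print(config["model_type"])         # bert
print(config["num_hidden_layers"])  # 12
print(config["hidden_size"])        # 768
```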

It's normal that not all models have a config, since not all models are from the transformers library, and the logic is handled differently outside of the library. See for example this config of a spaCy model: config.cfg · spacy/en_core_web_sm at main


Thanks. So far I'm managing with the model types and with using only the fetched config.
The number of layers is inconsistent; each architecture defines its own name for everything (e.g. for layers: n_layers, num_hidden_layers, etc.).
I'll update when I create a function.
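
In the meantime, a rough sketch of such a normalization helper (the list of candidate key names is an assumption, not an exhaustive survey of Hub configs):

```python
# Key names different architectures use for the layer count (assumed, not exhaustive)
LAYER_KEYS = ("num_hidden_layers", "n_layers", "n_layer", "num_layers")

def num_layers(config):
    """Return the layer count from a config dict, trying known key variants."""
    for key in LAYER_KEYS:
        if key in config:
            return config[key]
    return None

print(num_layers({"n_layer": 12}))            # 12  (gpt2-style)
print(num_layers({"num_hidden_layers": 24}))  # 24  (bert-style)
print(num_layers({"foo": 1}))                 # None (unknown key)
```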