Also need HF to show latency range, memory requirement, etc., to pick a model for an app from the HF model catalog

I would like HF and the community of devs to consider this:

As an example to set the context of this discussion: interactive vs. batch usage will strongly affect which models are candidates for the application developer. Why? Because model latency is critical when selecting a model for the interactive use case.

Another constraint on interactive model selection is the deployment environment: if you are running on one of the major cloud systems, then the memory required and CPU compatibility become critical attributes of the model as well. The major cloud vendors are very expensive to run on, forcing everyone (among a variety of other tactics) to slim the required hardware down as much as possible, e.g., a single CPU for inference. Will the model run at all on skinny hardware? (Y/N)
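To make the memory-requirement point concrete, here is a rough back-of-the-envelope sketch (not anything HF currently provides) of the kind of estimate a developer has to do by hand today. The function name and overhead factor are my own assumptions; the core arithmetic is just parameter count times bytes per parameter.

```python
def estimate_inference_ram_gb(num_params: float, dtype_bytes: int = 2,
                              overhead_factor: float = 1.2) -> float:
    """Rough lower bound on RAM needed to load a model for inference.

    num_params:      parameter count (e.g. 7e9 for a 7B model)
    dtype_bytes:     bytes per parameter (4 = fp32, 2 = fp16/bf16, 1 = int8)
    overhead_factor: assumed headroom for activations and runtime overhead
    """
    return num_params * dtype_bytes * overhead_factor / 1024**3

# A 7B model in fp16 with ~20% headroom needs roughly:
print(round(estimate_inference_ram_gb(7e9), 1))  # ~15.6 GB
```

If the catalog surfaced a number like this per model, the "will it run on skinny hardware?" question could be answered at query time instead of by trial and error.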

With this understanding in mind:

HF should be made aware that nearly every time we build an app, we need to select a model based on the app's requirements and the hardware available for that particular application.

It’s really not enough to show only the model attributes HF is already showing. The consequence is excessive time spent on manual trial and error to find the best model on HF.

The efficiency of finding a suitable model in the catalog could be greatly increased with simple additions: voluntary telemetry, manual user contributions of the same data, and UX upgrades so that catalog queries and their results incorporate these attributes.
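As a sketch of what such a query could look like, here is a minimal filter over a hypothetical catalog carrying the proposed extra attributes. Everything here (the entry fields, the attribute names, the sample data) is invented for illustration; none of it exists in the HF catalog today, which is exactly the point.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ModelEntry:
    # Hypothetical catalog entry with the extra attributes proposed above
    name: str
    p90_latency_ms: float    # measured or user-contributed latency on a reference CPU
    ram_gb: float            # memory required to load the model for inference
    runs_on_single_cpu: bool # the "will it run on skinny hardware?" (Y/N) flag

def find_candidates(catalog: List[ModelEntry], max_latency_ms: float,
                    max_ram_gb: float, cpu_only: bool) -> List[ModelEntry]:
    """Filter the catalog by the app's interactive and hardware constraints."""
    return [m for m in catalog
            if m.p90_latency_ms <= max_latency_ms
            and m.ram_gb <= max_ram_gb
            and (m.runs_on_single_cpu or not cpu_only)]

catalog = [
    ModelEntry("big-accurate-model", 900.0, 28.0, False),
    ModelEntry("small-fast-model", 45.0, 1.2, True),
    ModelEntry("mid-model", 180.0, 6.5, True),
]

# Interactive app on a single-CPU box with 8 GB of RAM:
for m in find_candidates(catalog, max_latency_ms=200, max_ram_gb=8, cpu_only=True):
    print(m.name)  # prints "small-fast-model" then "mid-model"
```

With telemetry or user-contributed numbers populating fields like these, this filter becomes a one-line catalog query instead of days of manual benchmarking.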

Let us discuss ways to expand and improve the utility of the model search tool.

The efficiency of finding suitable models grows ever more important now that there are thousands of models in the HF model catalog.