How to find a model benchmark-first or task-first

Hello :wave:,

I explored the Open LLM Leaderboard and see that it is organized model-first. If you have a model in mind, you can pick certain benchmarks and view its scores on them to evaluate that model.

Is the same information organized benchmark-first or task-first elsewhere?

For example, I need a model that provides natural chat and can also calculate numbers accurately. I don’t have a particular model in mind. I know the task I need accomplished, but I don’t know if there is a model that does it well.

The leaderboard’s raw data could be reformatted and combined with other data to accomplish this, but I thought there may be another Space or tool that does this already.
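
To make the idea concrete, here is a rough sketch of the kind of reformatting I mean. Everything in it is an assumption for illustration: the model names, the `chat` and `math` benchmark labels, the scores, and the `rank_by_benchmarks` helper are all made up and are not the leaderboard's actual schema or results.

```python
from statistics import mean

# Made-up rows in a model-first layout. Names and scores are placeholders,
# not real leaderboard data.
ROWS = [
    {"model": "model-a", "scores": {"chat": 70.0, "math": 40.0}},
    {"model": "model-b", "scores": {"chat": 65.0, "math": 62.0}},
    {"model": "model-c", "scores": {"chat": 80.0}},  # no math score reported
]


def rank_by_benchmarks(rows, benchmarks):
    """Return (model, mean score) pairs, best first, over the chosen benchmarks."""
    ranked = []
    for row in rows:
        scores = row["scores"]
        # Skip models that are missing any benchmark we care about.
        if all(name in scores for name in benchmarks):
            ranked.append((row["model"], mean(scores[name] for name in benchmarks)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Start from the task (chat plus math) instead of from a model.
print(rank_by_benchmarks(ROWS, ["chat", "math"]))
```

Averaging is only one possible way to combine benchmarks; ranking by the weakest score would favor models with no weak spot on either task.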

Thank you!