Hello,
I explored the Open LLM Leaderboard and see that it is organized model-first. If you have a model in mind, you can look up its scores on various benchmarks to evaluate that model.
Is this same information organized benchmark-first or task-first elsewhere?
For example, I need a model that provides natural chat and can also calculate numbers accurately. I don’t have a particular model in mind. I know the task I need accomplished, but I don’t know if there is a model that does it well.
The leaderboard’s raw data could be reformatted and combined with other data to accomplish this, but I thought there might already be another Space or tool that does it.
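To illustrate the kind of reformatting I mean, here is a minimal sketch with pandas, assuming the leaderboard rows were loaded into a DataFrame. The benchmark column names (`IFEval`, `MATH`) and model names are hypothetical placeholders, not the leaderboard's actual schema:

```python
import pandas as pd

# Hypothetical sample of leaderboard rows; real column names may differ.
rows = pd.DataFrame({
    "model": ["model-a", "model-b", "model-c"],
    "IFEval": [72.1, 65.4, 80.3],   # stand-in for a chat/instruction benchmark
    "MATH":   [30.5, 45.2, 28.7],   # stand-in for a math benchmark
})

# Benchmark-first view: rank models by a combined score on the two
# benchmarks that matter for my task (equal weights here).
rows["task_score"] = rows[["IFEval", "MATH"]].mean(axis=1)
ranked = rows.sort_values("task_score", ascending=False)
print(ranked[["model", "task_score"]].to_string(index=False))
```

Something like this is easy enough to hack together, but I'd rather reuse an existing Space if one exists.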
Thank you!