Accessibility of Hugging Face's Open LLM Leaderboard Benchmark Test Sets

Are the new Hugging Face Open LLM Leaderboard’s benchmark test sets accessible to the public? If so, what measures are in place to prevent the datasets from being misused for fine-tuning and leaderboard manipulation?

Hi @bahgat

Yes, the Hugging Face Open LLM Leaderboard’s benchmark test sets are public. To prevent misuse, Hugging Face has data usage agreements, activity monitoring, and regular audits in place.

Have you seen similar measures on other benchmark platforms?

Thanks @LLUMOAI for your reply. I’m not familiar with measures on other benchmark platforms, so I can’t make comparisons. However, I’m interested in learning more about the specific measures you mentioned for the Huggingface OpenLLMLeaderboard.

Could you please provide a source where Huggingface has officially stated these measures? I’d like to see more details about:

  1. The data usage agreements: What exactly do they entail?
  2. Activity monitoring: How is this implemented?
  3. Regular audits: What do these audits involve, and how often are they conducted?

Hi @bahgat

For details on the Hugging Face Open LLM Leaderboard measures:

  1. Data Usage Agreements: See Terms of Service and Data Policies.
  2. Activity Monitoring: Check the OpenLLMLeaderboard docs.
  3. Regular Audits: Refer to their Transparency Report.

Hope this helps!