Malicious code at top of huggingface leaderboard?

rosmine · July 27, 2024, 2:30pm

A pretty simple attack would be to train on the test set to get a model to the top of the leaderboard, then put malicious code in a pickle file. Since safetensors is so common now, people won’t even check the file format anymore.

The model at the top of the MTEB leaderboard seems sketchy (MTEB Leaderboard - a Hugging Face Space by mteb).
It’s a pickle file, and it’s a 1.5B param model that scores way better than 7B param models.
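
For anyone who wants to look inside a suspicious pickle without actually loading it, here is a minimal sketch using only Python's standard-library `pickletools`. It lists the globals (functions or classes) that a pickle stream would import on load. The helper name `pickle_imports` and the `Payload` demo class are made up for illustration, and the `STACK_GLOBAL` handling is a heuristic that can miss names the pickler fetched from its memo, so an empty or harmless-looking result is not proof that a file is safe. PyTorch `.bin` checkpoints are typically zip archives with the pickle stored inside, so you would have to extract that member first.

```python
import pickle
import pickletools


def pickle_imports(data: bytes) -> set:
    """Return 'module.name' globals a pickle would import, without loading it."""
    found = set()
    strings = []  # string constants seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Older protocols: arg is "module name", separated by one space.
            module, name = arg.split(" ", 1)
            found.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name are normally the last two strings pushed.
            # Heuristic only: memoized strings (BINGET etc.) are not tracked here.
            if len(strings) >= 2:
                found.add(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return found


class Payload:
    """Hypothetical stand-in for a malicious object; it only references print."""

    def __reduce__(self):
        # On load, pickle would call print("hello"). A real attack would
        # reference something like os.system instead.
        return (print, ("hello",))


# pickle.dumps only serializes the reference; nothing is executed here.
demo = pickle.dumps(Payload(), protocol=4)
print(pickle_imports(demo))
```

A plain weights-only pickle will still reference some globals (for example, PyTorch's tensor rebuild helpers), so the useful signal is an unexpected import such as something from `os`, `subprocess`, or `builtins`.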

Also, the model card specifically says that you have to use “trust_remote_code=True” (dunzhang/stella_en_1.5B_v5 · Hugging Face).

Am I being paranoid or is this a problem?

P.S. If the developer is reading this, I’m sorry for being suspicious, but please use safetensors next time.
