A pretty simple attack would be to train on the test set to get a model to the top of the leaderboard, then hide malicious code in a pickle file. Since safetensors is so common now, people won’t even check the file format anymore.
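For anyone who wants to check the file format before loading, here is a rough sketch that guesses it from the first few bytes. The function name `sniff_weight_format` and its return labels are made up for illustration. It relies on three facts: zip archives (which is what newer `torch.save` files are) start with `PK\x03\x04`, pickles written with protocol 2 or later start with `\x80`, and a safetensors file starts with an 8-byte little-endian header length followed by a JSON header that begins with `{`. It is only a heuristic and can be fooled.

```python
import json
import pickle
import struct


def sniff_weight_format(head: bytes) -> str:
    """Guess a weight file's format from its leading bytes (heuristic only)."""
    if head[:4] == b"PK\x03\x04":
        # Zip archive; newer torch.save files are zips that wrap a pickle.
        return "zip"
    if len(head) >= 9:
        # safetensors: u64 little-endian header size, then a JSON object.
        (header_len,) = struct.unpack("<Q", head[:8])
        if head[8:9] == b"{" and header_len < 100_000_000:
            return "safetensors"
    if head[:1] == b"\x80":
        # PROTO opcode that opens every pickle with protocol >= 2.
        return "pickle"
    return "unknown"


# Tiny in-memory samples, so nothing has to be downloaded.
fake_header = json.dumps({"__metadata__": {}}).encode("utf-8")
fake_safetensors = struct.pack("<Q", len(fake_header)) + fake_header
print(sniff_weight_format(fake_safetensors))
print(sniff_weight_format(pickle.dumps([1, 2, 3])))
```

On a real file you would pass in something like `open(path, "rb").read(16)`. A "zip" or "pickle" result does not mean the file is malicious, only that loading it can run code.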
The model at the top of the MTEB leaderboard seems sketchy: MTEB Leaderboard - a Hugging Face Space by mteb
It’s a pickle file, and it’s a 1.5B param model that scores way better than 7B param models.
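If you want to look inside a pickle without running it, the standard library's `pickletools.genops` walks the opcodes without unpickling anything. Below is a minimal sketch (the helper name `pickle_imports` and the demo class are hypothetical) that lists the globals a pickle stream would import. The `STACK_GLOBAL` handling is a simple heuristic that just takes the last two string constants seen, so treat the result as a hint rather than a guarantee. For a zip-style `torch.save` file you would first need to pull the inner `.pkl` member out of the archive.

```python
import pickle
import pickletools


def pickle_imports(data: bytes) -> set:
    """List the 'module.name' globals a pickle would import, without loading it."""
    found = set()
    strings = []  # string constants seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            found.add(arg.replace(" ", "."))
        elif opcode.name == "STACK_GLOBAL":
            # Heuristic: module and name are usually the last two strings pushed.
            if len(strings) >= 2:
                found.add(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return found


class Demo:
    """Harmless stand-in for a malicious object: unpickling it would call eval."""

    def __reduce__(self):
        return (eval, ("1 + 1",))


demo_bytes = pickle.dumps(Demo(), protocol=4)
print(pickle_imports(demo_bytes))  # {'builtins.eval'}
```

Anything like `builtins.eval`, `os.system`, or `subprocess.Popen` in that list is a red flag. Plain tensor checkpoints should mostly reference things like `torch` and `collections`.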
Also, the model card specifically says that you have to use “trust_remote_code=True”: dunzhang/stella_en_1.5B_v5 · Hugging Face
Am I being paranoid or is this a problem?
P.S. If the developer is reading this, I’m sorry for being suspicious, but please use safetensors next time.