Looking for (classifier, dataset) pairs across languages (or just classification datasets)

Hi everyone,

I’m looking for pairs of models (from the transformers model hub) and their associated datasets (from nlp), across languages. The goal is to make it easy to try text classification in many different languages.
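To make the use case concrete, here is a rough sketch of how I'd like to consume such pairs. Everything in it is an assumption for illustration: `Pair`, `accuracy`, and `evaluate_all` are hypothetical helpers, and the loaders are passed in as plain callables so the sketch doesn't depend on any particular model or dataset name.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

# One (text, gold label) item from a classification dataset.
Example = Tuple[str, str]


@dataclass(frozen=True)
class Pair:
    """A hypothetical record tying a language to a model and a dataset."""
    language: str
    model_id: str    # placeholder for a model hub identifier
    dataset_id: str  # placeholder for a dataset identifier


def accuracy(predict: Callable[[str], str], examples: Iterable[Example]) -> float:
    """Fraction of examples whose predicted label equals the gold label."""
    items: List[Example] = list(examples)
    if not items:
        raise ValueError("no examples to evaluate")
    correct = sum(1 for text, label in items if predict(text) == label)
    return correct / len(items)


def evaluate_all(
    pairs: Iterable[Pair],
    load_model: Callable[[str], Callable[[str], str]],
    load_examples: Callable[[str], Iterable[Example]],
) -> Dict[str, float]:
    """Return accuracy per language, one entry per pair.

    Assumption: in practice `load_model` might wrap something like
    transformers.pipeline("text-classification", model=model_id) and
    `load_examples` might wrap load_dataset(dataset_id); both are
    left abstract here.
    """
    results: Dict[str, float] = {}
    for pair in pairs:
        predict = load_model(pair.model_id)
        results[pair.language] = accuracy(predict, load_examples(pair.dataset_id))
    return results
```

With a list of pairs like this, trying a new language would just mean adding one more `Pair` entry, which is why I'm after matched models and datasets rather than either one alone.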

(model, dataset) pairs
Here are two examples I’ve found so far:

Multi-lingual classification datasets

Equally good would be links to nlp classification datasets in languages besides French and English. Easiest of all would be a single classification dataset with inputs in many languages (does that exist?). In that case, I could work on training the models myself (though it’s always nice when they’re trained for you!).

Please let me know! Thanks.