I want to build a multilingual text classification model (English, Hindi, Malayalam…) and wanted to ask if anyone has suggestions for which models to use. I would like to compare the performance of different models.
So which models are generally good for the use case of text classification across different languages?
Multilingual models such as mBERT and XLM-RoBERTa are encoder-only Transformer models (great for classification, question answering, NER,…).
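If it helps, here's a minimal sketch of setting up such a classifier with the Hugging Face `transformers` library. The checkpoint name `xlm-roberta-base` and the three example labels are just placeholders for whatever fits your data, and the download-triggering call is kept inside a function so nothing heavy runs on import:

```python
def make_label_maps(labels):
    """Build the id2label/label2id maps the model config expects."""
    id2label = {i: lab for i, lab in enumerate(labels)}
    label2id = {lab: i for i, lab in enumerate(labels)}
    return id2label, label2id


def build_classifier(model_name="xlm-roberta-base",
                     labels=("negative", "neutral", "positive")):
    # Heavy import kept inside the function; calling this downloads
    # the pretrained weights from the Hugging Face Hub.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label, label2id = make_label_maps(list(labels))
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model


# Example usage (downloads weights, so commented out):
# tokenizer, model = build_classifier()
# inputs = tokenizer("यह फिल्म बहुत अच्छी थी", return_tensors="pt")
# logits = model(**inputs).logits
```

Since the `AutoModel*` classes resolve the architecture from the checkpoint, you can compare models (e.g. `bert-base-multilingual-cased` for mBERT) just by changing `model_name`.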
CANINE is a relatively new model that is tokenizer-free, meaning it’s a character-level model and does not require an explicit tokenization step.
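To make "tokenizer-free" concrete: CANINE operates directly on Unicode code points, so an input id is essentially just `ord()` of each character, plus a few special tokens. A rough sketch of the idea without the library (the special-token ids below are illustrative placeholders, not necessarily the exact values the real CANINE tokenizer uses):

```python
def char_ids(text, cls_id=0xE000, sep_id=0xE001):
    # Character-level encoding: each character maps to its Unicode
    # code point, so there is no vocabulary or subword merge table.
    # cls_id/sep_id are placeholder special-token ids for this sketch.
    return [cls_id] + [ord(c) for c in text] + [sep_id]


ids = char_ids("नमस्ते")  # works for any script, no tokenizer training needed
```

This is why CANINE is attractive for multilingual work: there is no subword vocabulary that can be biased toward high-resource languages.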
For summarization/translation/etc. (seq2seq tasks), mT5 is a great model.
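A similar sketch for mT5, again with the download-triggering call kept inside a function. `google/mt5-small` is just one checkpoint choice, and note that the released mT5 weights are pretrained only, so you would normally fine-tune before expecting useful summaries. The small helper caps very long inputs at a character budget before tokenization:

```python
def truncate(text, max_chars=2000):
    """Crude character-level cap so very long documents don't blow up memory."""
    return text if len(text) <= max_chars else text[:max_chars]


def summarize(text, model_name="google/mt5-small", max_new_tokens=60):
    # Calling this downloads the pretrained weights on first use.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(truncate(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```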