General question about text classification Models

Hello,
I have a general question about choosing a model to run text classification experiment. I will be using 20NewsGroup for my dataset and will it to test out how good the Large Language Model is. Also, I will be using Google Colab to run this experiment.
There are so many different models are on hugging face. I was wondering which model will meet my need.
Sorry, I am a beginner and don’t know that much about different models on hugging face.
Thank you so much for reading my message.

1 Like

I’m not familiar with text classification, but it seems that a normal LM would be sufficient for classification with that dataset, rather than an LLM.

https://huggingface.co/models?pipeline_tag=text-classification&library=transformers&sort=trending

Also, the leaderboards is useful for comparing LLM performance.

Thank you so much for taking the time to write me a message.
Yes, it is true that ML will work. However, I want to do an experiment on using LLM to do text classification on 20NewsGroup dataset. Then, I will compare the result from both Multinormal Naïve Bayes, the ML classifier with the result from LLM.
That is why I will need a LLM to run the dataset.

1 Like

I see. So you want to try out LLM experimentally.
In that case, I think you could either refer to the type of leaderboard that compares performance for each task, or simply try out the most popular LLM. The second link below is a list of LLM popularity. It also includes VLM and speech models, but that’s about it.
There are a lot of large models, so I think it would be better to find a series of models that suit you first, and then look for smaller ones.
For simple text classification tasks, you might be able to get away with the smaller 3B or 1B models.
The newer and more famous ones are Qwen 2.5 and Llama 3.2.