General question about text classification Models

johnhf2 · November 20, 2024, 5:50am

Hello,
I have a general question about choosing a model for a text classification experiment. I will be using 20 Newsgroups as my dataset to test how well a large language model performs, and I will run the experiment on Google Colab.
There are so many different models on Hugging Face that I am not sure which one will meet my needs.
Sorry, I am a beginner and don’t know much about the different models on Hugging Face.
Thank you so much for reading my message.

John6666 · November 20, 2024, 6:20am

I’m not familiar with text classification, but it seems that a normal LM would be sufficient for classification with that dataset, rather than an LLM.

https://huggingface.co/models?pipeline_tag=text-classification&library=transformers&sort=trending

Also, the leaderboards are useful for comparing LLM performance.

johnhf2 · November 20, 2024, 4:18pm

Thank you so much for taking the time to write me a message.
Yes, it is true that ML will work. However, I want to run an experiment using an LLM for text classification on the 20 Newsgroups dataset. Then I will compare the results of Multinomial Naive Bayes, the ML classifier, with the results of the LLM.
That is why I need an LLM to run on the dataset.
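
For reference, the classical side of that comparison could look roughly like the sketch below. It assumes scikit-learn is available (it is preinstalled on Colab at the time of writing); the TF-IDF features, the `alpha=0.1` smoothing value, and the helper names `train_nb_baseline`, `evaluate`, and `run_20newsgroups_baseline` are illustrative choices, not something specified in this thread.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def train_nb_baseline(train_texts, train_labels):
    """Fit a TF-IDF + Multinomial Naive Bayes pipeline on raw text."""
    model = make_pipeline(
        TfidfVectorizer(stop_words="english"),
        MultinomialNB(alpha=0.1),  # smoothing value is an assumption; tune it
    )
    model.fit(train_texts, train_labels)
    return model


def evaluate(model, test_texts, test_labels):
    """Return plain accuracy on a held-out set."""
    return accuracy_score(test_labels, model.predict(test_texts))


def run_20newsgroups_baseline():
    """Download 20 Newsgroups and report baseline accuracy (needs network)."""
    from sklearn.datasets import fetch_20newsgroups

    # Stripping headers, footers, and quotes makes the task harder but
    # keeps the model from keying on metadata instead of content.
    strip = ("headers", "footers", "quotes")
    train = fetch_20newsgroups(subset="train", remove=strip)
    test = fetch_20newsgroups(subset="test", remove=strip)
    model = train_nb_baseline(train.data, train.target)
    return evaluate(model, test.data, test.target)
```

Calling `run_20newsgroups_baseline()` in a Colab cell downloads the dataset on first use. For a fair comparison, the LLM should be scored on the same test posts with the same preprocessing.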

John6666 · November 21, 2024, 5:00am

I see. So you want to try out an LLM experimentally.
In that case, I think you could either refer to the type of leaderboard that compares performance for each task, or simply try out the most popular LLMs. The second link below lists LLMs by popularity. It also includes VLMs and speech models, but that’s about it.
There are a lot of large models, so I think it would be better to find a series of models that suit you first, and then look for smaller ones.
For simple text classification tasks, you might be able to get away with the smaller 3B or 1B models.
The newer and more famous ones are Qwen 2.5 and Llama 3.2.
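
One way to use a small instruction-tuned model for this is zero-shot prompting: show the model the list of categories, ask for one of them, and map its reply back to a label. The sketch below covers only the prompt and parsing logic. The prompt wording, the `max_chars` truncation, and the names `build_prompt`, `parse_label`, and `classify` are assumptions for illustration, and `generate` is a hypothetical callable (prompt in, reply text out) that you would wrap around whichever model you pick, for example a Transformers text-generation pipeline.

```python
from typing import Callable, Optional, Sequence


def build_prompt(text: str, labels: Sequence[str], max_chars: int = 2000) -> str:
    """Build a zero-shot classification prompt listing every allowed label."""
    label_list = "\n".join(f"- {label}" for label in labels)
    return (
        "Classify the following newsgroup post into exactly one category.\n"
        f"Categories:\n{label_list}\n\n"
        # Truncate long posts so the prompt fits a small model's context.
        f"Post:\n{text[:max_chars]}\n\n"
        "Answer with the category name only.\n"
        "Category:"
    )


def parse_label(reply: str, labels: Sequence[str]) -> Optional[str]:
    """Return the label that appears first in the reply, or None if none does."""
    reply_lower = reply.lower()
    found = [
        (reply_lower.find(label.lower()), label)
        for label in labels
        if label.lower() in reply_lower
    ]
    return min(found)[1] if found else None


def classify(
    text: str, labels: Sequence[str], generate: Callable[[str], str]
) -> Optional[str]:
    """Classify one post; `generate` is any prompt-to-reply function."""
    return parse_label(generate(build_prompt(text, labels)), labels)
```

Replies that match no category come back as `None`; counting those as errors keeps the accuracy figure comparable with a classifier that always outputs a label.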

johnhf2 · November 21, 2024, 4:06pm

Thank you so much for taking the time to write me again.
I really appreciate your help.
