Multilabel classification using LLMs

Utkarsh-Tiwari · April 1, 2024, 2:13pm

I needed to know what’s the best way to finetune LLM models for multiclass classification tasks where there are more than 100 classes. I assume that ‘Text Generation’ is the main functionality of these LLMs and most of the coding examples and documentations show the ‘Text Generation’ as the example only.

I know that I can generate those labels by finetuning these ‘Text Generation’ models on my dataset, but this will only train the LLMs on the labels that are present in the train dataset and still there are labels which are missing from my train dataset. The ideal scenario would be to have datasets for all the classes in my train dataset but it’s not the case as of now. Also, there might be instances where these text generation models would generate new classes of their own which doesn’t exist at all.

I have seen on HF that there are model classes of these LLMs like ‘LlamaforSequenceClassification’ with sequence classification heads but haven’t found any example of their implementation.

I know that BERT models are used for the Sequence Classification tasks but can I do it using any of the LLMs? If possible, please share any example for the same.

MattiLinnanvuori · April 8, 2024, 2:50pm

Below I attach an example of multilabel classification with an LLM model. https://github.com/NielsRogge/Transformers-Tutorials/blob/master/BERT/Fine_tuning_BERT_(and_friends)_for_multi_label_text_classification.ipynb

Utkarsh-Tiwari · April 9, 2024, 7:52am

Thank you for your reply. I wanted to know if I can use the LLMs like Llama2, Mistal, Phi2 (Decoder only models) for text classification just like you did here with BERT (which is an encoder model).

In the Llama2 documentation on HF, they have mentioned a class -
LlamaForSequenceClassification

But I am not sure how to utilize this class for sequence classification purpose.

swtb · April 9, 2024, 9:05am

I have used Mistral for NER which is a word level classification task. I used RAG to retrieve similar queries (and their labels) and show the decoder model what I expect the output to be. You can then pass the query and it should classify it according to the examples you have shown it.

I will say while this works…it is certainly more native to do this with and encoder like BERT.

juhoinkinen · April 9, 2024, 9:24am

If non-LLM approaches suit you, you could consider Annif, see the demo at annif.org.

Annif is developed for extreme multilabel classification of texts, so it is most suitable if there are thousands or tens of thousands labels to choose from.

but this will only train the LLMs on the labels that are present in the train dataset and still there are labels which are missing from my train dataset.

Annif utilizes two kinds of algorithms in its backends: associative and lexical. An associative algorithm suggests only the labels it has seen on its training set, whereas a lexical algorithm can suggest any label from the vocabulary, because it learns features on based on word positions etc. See Backend: MLLM · NatLibFi/Annif Wiki · GitHub for more details.

(That said, Annif could in future use some LLM or BERT for well-performing zero-shot analysis, which is why I’m interested in these topics.)

MattiLinnanvuori · April 9, 2024, 12:14pm

You can utilise a decoder-only model like LlamaForSequenceClassification like in the link above. If HuggingFace has created a sequence classification class, you can use it like above.

Utkarsh-Tiwari · April 11, 2024, 9:48am

Thank you, I will try Annif and see if this helps for my use case.

Utkarsh-Tiwari · April 11, 2024, 9:52am

I guess that using RAG for text classification would not generate most accurate predictions as compared to fine tuning the LLM on my dataset.

Utkarsh-Tiwari · April 11, 2024, 9:53am

I will figure this out and will post here if it works or not.

Utkarsh-Tiwari · June 6, 2024, 11:30am

I tried using the Llama for Sequence Classification head, but it turns out that the predictions were not upto the mark and worse than the BERT classification model predictions.

It’s better to use the AutoModelForCausalLM class for finetuning the decoder only LLMs as they generate better predictions than BERT and LLM model with sequence classification heads.

VidaRoha · June 6, 2024, 10:58pm

Hi, I am now having very similar task. I am not familiar with AutoModelForCausalLM but I hope you succeed in what you are doing. If it’s okay, please let me know if it works out.
I am trying to use RAG+ GPT approach. Whether it works out or not, I’ll post the comment here.

Utkarsh-Tiwari · June 7, 2024, 7:01am

Refer this: Google Colab

This notebook is available in the Hugging Face documentation of Llama2 model ( Llama2 (huggingface.co)), and I think this is the most appropriate way for text classification if you want to use Decoder only models for text classification tasks.

RAG and AutoModelForSequenceClassification head on top of these decoder only LLMs, won’t give the best results for text classification.

system · June 7, 2024, 7:02pm

This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Multi-class Classification Basics Beginners	4	4579	August 24, 2021
Fine-tuned Llama2 model for text classification generating new instances Beginners	0	1415	February 26, 2024
Text classification and generation from the same model Beginners	1	825	July 27, 2023
Best solution to train multiclass model Beginners	0	309	March 30, 2022
Good models for few-shot multi-label text classification Beginners	0	1934	March 23, 2022

Multilabel classification using LLMs

Related topics