Pipeline for sentiment classification

Hey everyone! I'm using the transformers pipeline for sentiment classification to classify unlabeled text. Unfortunately, I'm getting some really bad results! For example, the sentence below is classified as negative with 99 percent certainty!
from transformers import pipeline

sent = "The audience here in the hall has promised to remain silent."
sentiment_analysis = pipeline(task="sentiment-analysis")
print(sentiment_analysis(sent))
# output: [{'label': 'NEGATIVE', 'score': 0.9911394119262695}]

Do you know what I can do to get better results for unlabeled text?
I actually tried training a large RoBERTa model on labeled text from Kaggle and I'm getting much better results, but I want to know why the pipeline is performing so badly, and which model it is actually using.

Hi Mitra, I am curious to know the metric performance (e.g. F1) of your trained model compared to the default pipeline. (How much better is the trained RoBERTa?)
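
For example, with a labeled test set you could compute an F1 for both models along these lines (a rough sketch; the example texts and the fine-tuned checkpoint path are placeholders, not real data):

from sklearn.metrics import f1_score
from transformers import pipeline

# Tiny stand-in test set; in practice this would be the labeled Kaggle test split
texts = ["I loved this movie.", "This was a waste of time."]
labels = ["POSITIVE", "NEGATIVE"]

pipe = pipeline("sentiment-analysis")  # default pipeline
# For the fine-tuned model, pass its checkpoint instead, e.g.
# pipe = pipeline("sentiment-analysis", model="path/to/your-finetuned-roberta")
preds = [pred["label"] for pred in pipe(texts)]
print(f1_score(labels, preds, pos_label="POSITIVE"))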


The default model for sentiment analysis is a fine-tuned DistilBERT (distilbert-base-uncased-finetuned-sst-2-english).
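
You can verify which checkpoint and architecture a pipeline loaded by looking at the pipeline object itself; a quick sketch:

from transformers import pipeline

pipe = pipeline("sentiment-analysis")
# The pipeline exposes the underlying model, so you can see what it loaded
print(type(pipe.model).__name__)      # e.g. DistilBertForSequenceClassification
print(pipe.model.config.model_type)   # e.g. "distilbert"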

It's therefore no surprise that RoBERTa performs a lot better.

That being said, I can understand why the model thinks this is negative. "To remain silent" is often uttered in a very negative context ("You have the right to remain silent" when the police arrest someone). Especially for smaller models, this can weigh heavily.
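
One way to check how much that phrase drives the prediction is to compare the original sentence with a paraphrase that avoids it (a quick sketch; the paraphrase is just my own example):

from transformers import pipeline

pipe = pipeline("sentiment-analysis")
# Compare the original phrasing with a paraphrase that avoids "remain silent";
# the difference in scores gives a rough idea of how much weight that phrase carries.
print(pipe("The audience here in the hall has promised to remain silent."))
print(pipe("The audience here in the hall has promised to stay quiet."))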


I'm using these models on unlabeled text, so there is no specific metric for evaluating them on my test set. But among RoBERTa, BERT and XLNet, I got the best results with RoBERTa after only 3 epochs (around 71 percent test accuracy on the Kaggle dataset)! I then used this model to classify the unlabeled text, and after going over the results I didn't see anything like what the pipeline was giving me!


Thanks for responding, Bram!
Exactly, but I was curious why the transformers team isn't using a model like RoBERTa for the pipeline when it can give much better results. I've trained the RoBERTa-large model on the Kaggle sentiment analysis dataset and got to around 71 percent test accuracy with only 3 epochs, and it is giving me much more sensible and accurate results on the unlabeled dataset too.
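
For reference, fine-tuning roberta-large on a labeled CSV for 3 epochs can look roughly like this with the Trainer API (a simplified sketch, not necessarily the exact setup used here; file names, column names and hyperparameters are placeholders):

import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Placeholder file names; assumes CSVs with a "text" column and an integer "label" column
dataset = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})

tokenizer = AutoTokenizer.from_pretrained("roberta-large")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained("roberta-large", num_labels=2)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": (preds == labels).mean()}

args = TrainingArguments(
    output_dir="roberta-large-sentiment",
    num_train_epochs=3,
    per_device_train_batch_size=8,   # placeholder; roberta-large may need a smaller batch or gradient accumulation
    learning_rate=1e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,             # enables dynamic padding via the default data collator
    compute_metrics=compute_metrics,
)

trainer.train()
print(trainer.evaluate())            # reports eval loss and accuracy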


Picking a "default" is always difficult. In this particular case, you have to choose on the axis going from "fast" to "accurate". Larger models, like full RoBERTa models, are more accurate but slower. So for demos or simply as a default value, DistilBERT is a good choice.

As a user you can still change which model to use (any model ID from the Hub works), so you can do

pipe = pipeline("sentiment-analysis", "roberta-large-mnli")
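
Note that roberta-large-mnli is RoBERTa fine-tuned on natural language inference, so its labels are entailment/neutral/contradiction rather than sentiment; for sentiment you would pick a checkpoint fine-tuned on sentiment data. A quick sketch, using siebert/sentiment-roberta-large-english purely as an example of such a checkpoint:

from transformers import pipeline

# Example: a RoBERTa-large checkpoint fine-tuned for English sentiment
pipe = pipeline("sentiment-analysis", model="siebert/sentiment-roberta-large-english")
print(pipe("The audience here in the hall has promised to remain silent."))
# -> a list with one dict of the form [{'label': ..., 'score': ...}]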

Hi again everyone! I just wanted to thank you all for helping me understand the different models and approaches! I also want to share the Kaggle notebook that I wrote on this subject and the video that I made. I'd really like to hear what you think about the whole thing! Thanks again for all your help!
