Getting incorrect emotion inferences for sentences from a story using existing models

I wanted to find the emotions in sentences taken from a few stories. I searched for some models and tried a few of them, but unfortunately all of them give quite a lot of incorrect predictions. The prediction is correct only when the emotion is very obvious. Perhaps this is because the datasets used for training or fine-tuning them consisted of tweets?

Models tested: bhadresh-savani/distilbert-base-uncased-emotion, mrm8488/t5-small-finetuned-emotion, mrm8488/t5-base-finetuned-emotion

As an example, let’s take the following excerpt:

sentence_list = ['Our schedules were opposite: office and restaurant.',
 'But then the world shut down, and the dining rooms were closed.',
 'When Aiesha would normally be waiting tables, we were riding our bikes through Little Rock’s empty downtown.',
 'Logging off from my remote work felt like hearing a school bell ring.',
 'The empty city was our playground.',
 'I remember riding our bikes along the Arkansas River in the dark with a huge smile on my face.',
 'It was the best kind of smile, unseen and electric.',
 'Somehow, I knew happiness within a tragedy.',
 'I found my best friend.']

All the models predicted the emotions as -
[anger, fear, sadness, fear, sadness, fear, joy, joy, joy]

Some models predicted sadness where others predicted fear, but the results were otherwise the same.

I would have expected to see something like [sadness, sadness, joy, joy, joy, joy, joy, joy, love], or some variation of it with joy and love, and perhaps fear in places where it may be appropriate as well. I do not think there is any anger here at all.

I have tried the models on several other excerpts, and the predictions were not usable in any of those cases either.

Any suggestions on what models I have overlooked that might give me more accurate results? Or any ideas on how to make existing models work better?

EDIT: I would preferably like to use existing models since I don’t have any labeled data of my own. Essentially I would like to use a model to tell me these things more accurately for a downstream task.

Hi @sraj

The best way to get more accurate results from a pretrained model is to fine-tune it, i.e. to train it further on labeled data, if you have any.

If you don’t have any labeled data, then maybe investigating the model outputs further is an option? You could, for example, check whether the model’s confidence is correlated with its accuracy and, if it is, introduce a confidence threshold.
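
One rough way to run that check is to hand-label a small sample and compare accuracy across confidence ranges. The sketch below assumes each prediction is a dict with "label" and "score" keys (the shape a Hugging Face text-classification pipeline returns for its top label); the function name accuracy_by_confidence and the n_bins parameter are made up for illustration.

```python
from collections import defaultdict


def accuracy_by_confidence(predictions, gold_labels, n_bins=10):
    """Report accuracy per confidence bin.

    predictions: list of {"label": str, "score": float}, one per sentence
                 (assumed shape, see the note above).
    gold_labels: labels assigned by hand for the same sentences.
    Returns a dict mapping (bin_low, bin_high) -> accuracy in that bin.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for pred, gold in zip(predictions, gold_labels):
        # Clamp so that a score of exactly 1.0 lands in the top bin.
        b = min(int(pred["score"] * n_bins), n_bins - 1)
        totals[b] += 1
        hits[b] += int(pred["label"] == gold)
    return {
        (b / n_bins, (b + 1) / n_bins): hits[b] / totals[b]
        for b in sorted(totals)
    }


if __name__ == "__main__":
    # Placeholder values for illustration only, not real model output.
    sample_predictions = [
        {"label": "joy", "score": 0.98},
        {"label": "joy", "score": 0.96},
        {"label": "fear", "score": 0.55},
        {"label": "sadness", "score": 0.52},
    ]
    sample_gold = ["joy", "joy", "joy", "sadness"]
    for (low, high), acc in accuracy_by_confidence(sample_predictions, sample_gold).items():
        print(f"confidence {low:.1f}-{high:.1f}: accuracy {acc:.2f}")
```

If accuracy rises clearly in the higher bins, a threshold is probably worth trying; if it stays flat, the confidence score is not telling you much.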

For example, if you find that the model is almost always accurate when its confidence is above 95%, then you could keep those predictions, filter out the ones with lower confidence, and pass the filtered-out ones to a human in the loop.
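
As a minimal sketch of that filtering step, again assuming predictions shaped like {"label": ..., "score": ...}: the helper name split_by_confidence is hypothetical, and the 0.95 default just mirrors the 95% example above rather than a value you should expect to work for your data.

```python
def split_by_confidence(sentences, predictions, threshold=0.95):
    """Separate confident predictions from ones that need manual review.

    sentences:   list of input strings.
    predictions: list of {"label": str, "score": float}, one per sentence
                 (assumed shape, not taken from the original post).
    Returns (accepted, needs_review):
      accepted     -> list of (sentence, label)
      needs_review -> list of (sentence, label, score)
    """
    accepted, needs_review = [], []
    for sentence, pred in zip(sentences, predictions):
        if pred["score"] >= threshold:
            accepted.append((sentence, pred["label"]))
        else:
            # Keep the score so a reviewer can see how unsure the model was.
            needs_review.append((sentence, pred["label"], pred["score"]))
    return accepted, needs_review


if __name__ == "__main__":
    sentences = [
        "I found my best friend.",
        "The empty city was our playground.",
    ]
    # Placeholder scores for illustration only, not real model output.
    predictions = [
        {"label": "joy", "score": 0.99},
        {"label": "sadness", "score": 0.61},
    ]
    accepted, needs_review = split_by_confidence(sentences, predictions)
    print("accepted:", accepted)
    print("needs review:", needs_review)
```

The accepted list can go straight to the downstream task, while needs_review is what you would hand to a person.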

Just an idea, hope that helps.