Different prediction tensors on single item vs a list of items

I am doing sentiment analysis on tweets. I used the roberta-base model and trained it on a dataset containing around 90,000 entries. When I predict using the saved model, the results differ between predicting a single sentence and predicting the same sentence when it is one of the items in a list that I loop over.
E.g., `"hi there"` gives one tensor value on its own and a different tensor value when passed in a list like:
`["hi there", "let's go out", "how are you?"]`
The difference is large enough that a sentence which is positive, and correctly predicted as positive when passed as a single string, is predicted as negative when passed in a list.
Is it something that is expected? Or is there anything I need to make sure to avoid this?

Hi @rashub can you post the code you’re using to generate the predictions (see here for general advice on getting help)? Even better would be a Google Colab notebook to be able to inspect the inputs / outputs :smiley:

Hi @lewtun I followed this colab notebook:

When predicting, if I do the following, I get the correct prediction:

```python
learn.predict("This was a really good movie, i loved it")
```
But if I pass a list for the prediction like the following, the tensor values change:
```python
test = [
    "This was a really good movie, i loved it",
    "Wowwwwww, about an hour ago I finally finished watching this terrible movie!!!",
    "Im still a big IMDb fan, but seriously rethink this rating process because this movie should be rated no higher than maaaybbbeee like a 3",
    "I am disappointed in the director, Sydney Pollack who gave us the classic Tootsie and other films. This one is a waste of time and energy.",
    "Wow, this movie really sucked down below the normal scale of dull, boring, and unimaginative films I've seen recently. ",
    "Sorry. Someone has to say it. This really is/was a dull movie. Worthy perhaps, but dull nonetheless.",
    "This is a truly hilarious film and one that I have seen many times. This is a film you could watch again and again, with a fabulous sound track! One for all those at school in the 90's to watch!",
]
df = pd.DataFrame(test, columns=['information'])
final_res = []
for txt in df['information']:
    result = learn.predict(txt)
    final_res.append(result)
```
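For anyone debugging something similar, one way to localize the discrepancy is to score each sentence on its own and again inside the loop, then compare the two. The sketch below uses a deterministic toy `predict` as a stand-in for the real model call (which returns a prediction tuple, not a float) — swap in the real call when reproducing the issue:

```python
# Sketch of a diagnostic harness. `predict` is a deterministic toy
# stand-in for the real model call, which is the part you would swap in.
def predict(text):
    # dummy scorer: illustrative only, not a real sentiment model
    return len(text) % 7 / 7.0

texts = [
    "This was a really good movie, i loved it",
    "Sorry. Someone has to say it. This really is/was a dull movie.",
]

# Score each sentence in isolation...
single = {t: predict(t) for t in texts}

# ...and again inside a loop over the list, as in the snippet above.
looped = {}
for t in texts:
    looped[t] = predict(t)

# With a stateless, deterministic model the two must agree exactly;
# any mismatch points at hidden state or preprocessing differences.
for t in texts:
    assert abs(single[t] - looped[t]) < 1e-6, f"mismatch on {t!r}"
```

If the per-sentence calls already disagree with each other across runs, the problem is upstream of batching (e.g. dropout left enabled, or preprocessing that depends on earlier inputs).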

Hi @rashub, thanks for sharing the Colab notebook. This seems to be a problem with the fasthugs library, not transformers so perhaps @morgan (the creator) can shed some light on this?

For what it’s worth, I checked that it’s not a fastai problem by feeding your inputs to a ULMFiT classifier from this tutorial: Transfer learning in text | fastai (see screenshot below)

In general the predicted values should not change if you decide to pass a batch instead of a single item.
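As a sanity check of that property, here is a minimal sketch with a deterministic toy classifier (not roberta): scoring items one at a time and scoring them as a batch must produce identical outputs. A real transformer batch additionally pads shorter sequences, but with a correct attention mask the padding should not change the scores:

```python
# Minimal illustration: for a stateless, deterministic classifier,
# per-item and batched predictions agree exactly.
POSITIVE = {"good", "loved", "hilarious", "fabulous"}

def predict_one(text):
    # toy score: fraction of words that appear in a positive-word list
    words = [w.strip(",.!?") for w in text.lower().split()]
    return sum(w in POSITIVE for w in words) / max(len(words), 1)

def predict_batch(texts):
    # batching here is just a loop; a neural model would stack tensors,
    # but attention masking should make the result equivalent
    return [predict_one(t) for t in texts]

texts = ["This was a really good movie, i loved it",
         "This really is/was a dull movie."]

assert [predict_one(t) for t in texts] == predict_batch(texts)
```

If a real model fails this check, the usual suspects are padding that leaks into the forward pass (missing or wrong attention mask), the model not being in eval mode, or different tokenization settings on the two code paths.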

P.S. you can get nicely formatted code in the forum by surrounding the code snippet with 3 backticks (```)


@lewtun I see, thank you so much for checking and sharing that. I'll also keep the 3-backticks tip in mind, thanks! I will wait for @morgan to check it out.


Hello,
Is there any update on this?

Hey @rashub, sorry, just seeing this!

FastHugs is quite out of date at this point; I would recommend using the Blurr library for HuggingFace + fastai: GitHub - ohmeow/blurr: A library that integrates huggingface transformers with version 2 of the fastai framework

The fastai docs also have a GPT-2 + fastai example, in case you hadn't seen it:


Hi @morgan, no problem. Thanks for the suggestions, I will try them out!