I am doing sentiment analysis on tweets using the roberta-base model, fine-tuned on a dataset of around 90,000 entries. When I predict with the saved model, the results differ between predicting a single sentence on its own and predicting that same sentence as one item of a list, looping over the list.
E.g., "hi there" gives one tensor value on its own, but a different tensor value when passed as part of a list like:
["hi there", "let's go out", "how are you?"]
The difference is so large that a sentence that is positive, and correctly predicted as positive when passed as a single string, is predicted as negative when passed inside a list.
Is this expected behavior, or is there something I need to do to avoid it?
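For context, here is a toy NumPy sketch (hypothetical embeddings, not my actual model code) of one mechanism I suspect could cause this: when sentences are batched together, shorter ones get padded, and if the pad positions are not masked out of the attention computation, the output for the same sentence changes.

```python
import numpy as np

# Toy self-attention pooling over token embeddings, to show why padding
# can change outputs unless pad positions are masked. All numbers are
# made up; this is not RoBERTa's actual computation.

def attention_pool(emb, mask):
    # emb: (seq_len, dim); mask: (seq_len,) with 1 = real token, 0 = pad
    scores = emb @ emb.T                                 # similarity scores
    scores = np.where(mask[None, :] == 1, scores, -1e9)  # suppress pads
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # row-wise softmax
    return weights @ emb

rng = np.random.default_rng(0)
sent = rng.normal(size=(3, 4))   # a 3-token sentence, e.g. "hi there"

# Single-sentence pass: no padding needed.
single = attention_pool(sent, np.ones(3))

# Batched pass: same sentence padded to length 6 to match a longer one.
padded = np.vstack([sent, np.zeros((3, 4))])
mask = np.array([1, 1, 1, 0, 0, 0])

with_mask = attention_pool(padded, mask)[:3]         # pads masked out
without_mask = attention_pool(padded, np.ones(6))[:3]  # pads attended to

print(np.allclose(single, with_mask))     # pads masked: outputs match
print(np.allclose(single, without_mask))  # pads unmasked: outputs drift
```

If something like this is what is happening in my pipeline (e.g., the attention mask from the tokenizer not being passed to the model, or the model not being in eval mode), I would like to know what to check.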