Which dataset can be used to evaluate a model for sentiment analysis?

Which dataset can we use as a reference to evaluate a model for sentiment analysis?
The model has been fine-tuned on a human-labeled dataset, so is there a reference metric that can be used to evaluate it?

Hello :wave:
A standard benchmark dataset for sentiment analysis is SST-2. You can use classification metrics such as accuracy and F1 score.
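In case it helps, here is a minimal sketch of how those two metrics could be computed from gold labels and model predictions. It is plain Python with no dependencies; the function names and the toy label lists are made up for illustration (they are not real SST-2 data or model output), and it assumes binary labels where 1 = positive and 0 = negative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical labels, assumed convention: 1 = positive, 0 = negative).
gold = [1, 0, 1, 1, 0, 1, 0, 0]
pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("accuracy:", accuracy(gold, pred))
print("f1:", f1_score(gold, pred))
```

In practice you would probably reach for an existing metrics library rather than hand-rolled functions; the point here is only to show what the two numbers measure. F1 is worth reporting alongside accuracy when the classes are imbalanced.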

We will soon be releasing something for people who want to learn about a specific task and how it works. Stay tuned :nerd_face:


So, can we use SST-2 as a reference when we fine-tune our model for a specific domain?

SST-2 is fairly general. If you have a domain-specific use case, I doubt there is a benchmark for it, so your best option is still SST-2 imho.
