Fine-tuning BERT/RoBERTa for multi-label sentiment analysis

Hi everyone, I've been really enjoying the Hugging Face content so far, and I'm excited to learn and join this fine community.

I noticed that most, if not all, sentiment models deployed on the Hub do either binary classification or 3-label classification (the third label being "neutral", in addition to "positive" and "negative"). So I had a few follow-up questions:

  1. Why do most models focus on binary classification? Is it that hard to extract more sentiments using the current models?
  2. I want to fine-tune either BERT or RoBERTa (I'd love to hear any pros/cons of each) on my own dataset (say, 1,000 sentences with 5 different sentiment labels). Would I be able to fine-tune a pretrained model, or am I doomed to train from scratch?

I read the quick tour on the Transformers page, and it clearly explains how to fine-tune the BERT model with a different number of labels. What I didn't understand is: could I just add new labels and have the model try to guess between them without adding my own data? And where would I plug in my own data so that I fine-tune only the last layers of the network and add, on my own, the final dense layer with the new desired labels?
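For context, here's roughly what I had in mind, based on my reading of the quick tour. This is just a sketch; the checkpoint name and `num_labels=5` are placeholders for my setup, and I'd fine-tune on my own sentences afterwards:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Reuse the pretrained encoder; from_pretrained attaches a freshly
# initialized classification head sized for num_labels. As I understand
# it, the new head is random, so the model cannot "guess" my 5 labels
# without fine-tuning data. (num_labels=5 is just my example.)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=5
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# A quick sanity check that the head now outputs 5 scores per sentence:
enc = tokenizer(
    ["I loved it", "It was awful"], padding=True, return_tensors="pt"
)
out = model(**enc)
print(out.logits.shape)  # torch.Size([2, 5]): one score per label
```

Is this the right general shape, and is this the point where I'd pass my labeled sentences to `Trainer` (or a plain PyTorch loop) to fine-tune?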

Looking forward to hearing your comments.

Thank you!