Fine-tuning Decoder-only or Encoder-Decoder models for classification

Hello everyone, I’m very new to fine-tuning LLMs with the Transformers library and have a few questions.

As far as I know, encoder-only models are usually used for classification tasks. However, I want to explore for myself how decoder-only and encoder-decoder models perform on classification.

In my fine-tuning code, I load the model with AutoModelForSequenceClassification instead of AutoModelForCausalLM. The library warns that some weights are newly initialized. So my first question is: does loading the model this way change its whole structure, or does it just add a few classification layers on top?
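For reference, here is roughly what my loading code looks like (a minimal sketch; `gpt2` and `num_labels=2` are just placeholders for my actual checkpoint and label count):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "gpt2"  # placeholder for the decoder-only checkpoint I'm using

tokenizer = AutoTokenizer.from_pretrained(model_name)
# GPT-2 has no pad token, so reuse the EOS token for padding
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=2,  # placeholder: binary classification
)
model.config.pad_token_id = tokenizer.pad_token_id

# Loading prints a warning along the lines of:
# "Some weights of GPT2ForSequenceClassification were not initialized from the
#  model checkpoint ... and are newly initialized: ['score.weight']"
```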

My second concern: what obstacles usually come up during this kind of fine-tuning?

My last concern: after fine-tuning on classification, can the model still generate text like it did before?

I’m only two weeks into this field, so sorry for the long post.