If I wanted to summarize/classify an entire paragraph into a single output token, could an encoder-decoder model be fine-tuned to produce such behavior? For example: I want to classify an email as ‘Work_Related’.
Hi @moyi-druzi, this should be possible. During training, you feed the emails into the encoder and provide the label token(s) as the targets for the decoder. E.g., you train a T5 decoder to generate something like “Work-related” or “Not work-related” and then immediately emit EOS and halt.
You can also use an encoder-decoder model for sequence classification by doing something like:
- Feed the email to both the encoder and the decoder
- In the decoder, feed the hidden state at the EOS token into a classification layer
- The classification layer predicts the class (work-related, not work-related)
BartForSequenceClassification is an example of this.
As for the question in the title taken literally: it is not the usual choice. Encoder-only models are far more common for sequence classification problems, though, as shown above, it is entirely possible with encoder-decoder models.