Transformers: am I only using an Encoder for binary classification?

Hi @unknownTransformer,

Yes. BERT uses only the Encoder part of the Transformer architecture; it has no Decoder.

See this page of the docs: https://huggingface.co/transformers/model_summary.html

See also the Devlin paper: https://arxiv.org/abs/1810.04805

Also try Jay Alammar’s blog posts, for example this one: alammar.github.io/illustrated-bert/

When you do sentiment analysis, you are using the base BERT model to encode your text as numerical vectors, and then you tune a final layer for your task. The Hugging Face models such as BertForSequenceClassification already include a “final” classification layer for you, which is randomly initialized. See this page: https://huggingface.co/transformers/model_doc/bert.html
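To make that structure concrete, here is a minimal sketch. It assumes a recent version of transformers and PyTorch, and it builds a tiny, randomly initialized model from a made-up BertConfig so that nothing has to be downloaded. In practice you would more likely load a pretrained checkpoint with from_pretrained. It shows that the model consists of the encoder (model.bert) plus a linear classification layer (model.classifier):

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Tiny, made-up configuration (illustration only) so no weights are downloaded.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=2,  # binary classification
)
model = BertForSequenceClassification(config)
model.eval()

# The encoder-only BERT body and the task-specific head are separate submodules.
encoder = model.bert
head = model.classifier  # linear layer: hidden_size -> num_labels

# Fake batch of 3 sequences with 8 random token ids each.
input_ids = torch.randint(0, config.vocab_size, (3, 8))
with torch.no_grad():
    logits = model(input_ids=input_ids).logits

logits_shape = tuple(logits.shape)  # one row per sequence, one column per label
print(type(encoder).__name__, type(head).__name__, logits_shape)
```

The logits have one column per class, so for binary classification you get two scores per input text.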

You will probably want to freeze most of the BERT layers while you fine-tune the last layer, at least initially. See the post “How to freeze some layers of BertModel” and the reply by sgugger.
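One common way to do the freezing, sketched here under the same assumptions (recent transformers and PyTorch, and a tiny made-up config instead of a real pretrained checkpoint), is to set requires_grad to False on the encoder parameters. This is not necessarily the exact approach from the linked post:

```python
from transformers import BertConfig, BertForSequenceClassification

# Tiny, made-up configuration (illustration only) so no weights are downloaded.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=2,
)
model = BertForSequenceClassification(config)

# Freeze every parameter of the BERT encoder; the classifier stays trainable.
for param in model.bert.parameters():
    param.requires_grad = False

# Collect the names of the parameters that will still receive gradient updates.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

Only the classifier weights and bias remain trainable. To unfreeze some or all of the encoder later, set requires_grad back to True on the parameters you want to train.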

Good luck with it all.
