BERT vs GPT architectural, conceptual and implementational differences

RajSingh333 · November 26, 2021, 9:27pm

In the BERT paper, I learnt that BERT is an encoder-only model, that is, it consists only of transformer encoder blocks.

In the GPT paper, I learnt that GPT is a decoder-only model, that is, it consists only of transformer decoder blocks.

I was wondering what the difference is. I know the following difference between encoder and decoder blocks: the GPT decoder attends only to previously generated tokens (those to its left) and learns from them, never from tokens on the right. The BERT encoder attends to tokens on both sides.
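To make that difference concrete, here is a minimal sketch in plain Python. It is not the actual BERT or GPT code: `attention_weights` is a hypothetical helper, and the uniform `scores` matrix is an assumption chosen so the numbers are easy to read. It only illustrates how a causal mask restricts each position to itself and the positions to its left, while the bidirectional case lets every position see every other one.

```python
import math

def attention_weights(scores, causal):
    """Row-wise softmax over raw attention scores.

    If causal is True, position i may only attend to positions j <= i
    (decoder-style). Otherwise every position is visible (encoder-style).
    """
    n = len(scores)
    weights = []
    for i in range(n):
        # Indices of the tokens that token i is allowed to look at.
        visible = [j for j in range(n) if (not causal) or j <= i]
        exps = {j: math.exp(scores[i][j]) for j in visible}
        total = sum(exps.values())
        # Masked positions get weight 0.0.
        weights.append([exps[j] / total if j in exps else 0.0 for j in range(n)])
    return weights

# Hypothetical input: 4 tokens with identical raw scores.
scores = [[0.0] * 4 for _ in range(4)]

bert_like = attention_weights(scores, causal=False)  # sees both sides
gpt_like = attention_weights(scores, causal=True)    # sees only the left context

for name, w in (("bidirectional", bert_like), ("causal", gpt_like)):
    print(name)
    for row in w:
        print("  ", row)
```

In the causal case the weight matrix is lower triangular: the first token can only attend to itself, the second token splits its attention between the first two positions, and so on.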

But I have the following doubts:

Q1. GPT-2 and GPT-3 focus on few/one/zero-shot learning. Can't we build a few/one/zero-shot learning model with an encoder-only architecture like BERT?

Q2. Hugging Face's GPT2Model has a forward() method. I guess feeding a single data instance to this method is like doing one-shot learning?

Q3. I have implemented a neural network model which uses the output of BertModel from Hugging Face. Can I simply swap the BertModel class for GPT2Model, and will it work? The return value of GPT2Model.forward does indeed contain last_hidden_state, similar to BertModel.forward. So I guess swapping BertModel for GPT2Model will work, right?
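A minimal sketch of the swap in question, assuming the `transformers` and `torch` packages are installed. The tiny configuration values are hypothetical and the weights are randomly initialized (no pretrained checkpoint is loaded), so this only compares the shape of the two interfaces, not model quality.

```python
import torch
from transformers import BertConfig, BertModel, GPT2Config, GPT2Model

# Tiny, randomly initialized models (hypothetical sizes, nothing is downloaded).
bert = BertModel(BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                            num_attention_heads=2, intermediate_size=64)).eval()
gpt2 = GPT2Model(GPT2Config(vocab_size=100, n_embd=32, n_layer=1,
                            n_head=2, n_positions=64)).eval()

# A dummy batch: 2 sequences of 8 token ids each.
input_ids = torch.randint(0, 100, (2, 8))

with torch.no_grad():
    bert_out = bert(input_ids=input_ids)
    gpt2_out = gpt2(input_ids=input_ids)

# Both outputs expose last_hidden_state with shape (batch, seq_len, hidden).
print(bert_out.last_hidden_state.shape)
print(gpt2_out.last_hidden_state.shape)

# BertModel also returns pooler_output; GPT2Model's output has no such field.
print(hasattr(bert_out, "pooler_output"), hasattr(gpt2_out, "pooler_output"))
```

So code that reads only `last_hidden_state` sees the same tensor layout from both classes, while code that relies on `pooler_output` would need changing. In practice the tokenizer would also have to be swapped to match the model, since the two use different vocabularies.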

Q4. Apart from being decoder-only vs encoder-only, auto-regressive vs non-auto-regressive, and whether or not they accept the tokens generated so far as input, what high-level architectural / conceptual differences do GPT and BERT have?
