Electra relative position embedding ("relative_key_query")

venetis · September 30, 2023, 10:20pm

Hello,

If we use the approach from the paper “Improve Transformer Models with Better Relative Position Embeddings”, we could theoretically extend the model to sequence lengths of 2048 tokens, given that there are no absolute position embeddings and that the weights at the -k and k edges of the window can be reused for any distance beyond the window. Is my assumption correct?
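To make the question concrete, here is a minimal sketch of what I mean by reusing the edge weights. It is plain Python and not the actual library code: the function names, `K`, and the two sequence lengths are made up for illustration, and whether the "relative_key_query" implementation clips distances this way is exactly the assumption I would like to verify.

```python
def clipped_relative_index(query_pos, key_pos, k):
    """Map a (query, key) position pair to a row of a (2k + 1)-row embedding table.

    Distances outside [-k, k] are clipped, so they reuse the edge rows.
    """
    distance = query_pos - key_pos
    clipped = max(-k, min(k, distance))
    return clipped + k  # shift from [-k, k] to [0, 2k]


def relative_index_matrix(seq_len, k):
    """Build the seq_len x seq_len matrix of embedding-table row indices."""
    return [
        [clipped_relative_index(i, j, k) for j in range(seq_len)]
        for i in range(seq_len)
    ]


K = 4                   # hypothetical window half-width
TABLE_ROWS = 2 * K + 1  # rows a clipped relative embedding table would need

short_idx = relative_index_matrix(5, K)   # every distance fits inside the window
long_idx = relative_index_matrix(16, K)   # longer sequence, same table size
```

With clipping, `long_idx` never refers to a row outside the `TABLE_ROWS` rows that `short_idx` already uses, so no new weights are needed for the longer input. The cost is that every pair of tokens more than `K` positions apart shares the same edge embedding, so the model cannot tell those distances apart.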

Thank you
