Relative Position Representation/Encoding for Transformer

  1. In the GPT-NeoX-20B: An Open-Source Autoregressive Language Model paper, why did the authors state that rotary embeddings are a form of static relative positional embeddings? (see the first sketch below)

  2. In https://medium.com/@init/how-self-attention-with-relative-position-representations-works-28173b8c245a, could anyone explain the rationale for the lookup indices after the 3rd element all being 6? (see the second sketch below)

  3. What is the actual purpose of the skewing mechanism? (see the third sketch below)
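
For context on question 1, here is a minimal 2-D sketch of the rotary property as I understand it (single frequency, toy vectors chosen by me, not the GPT-NeoX implementation): after rotating queries and keys by an angle proportional to their positions, the attention score depends only on the position offset, which seems to be what "relative" refers to.

```python
import numpy as np

def rotate(x, pos, theta=10000.0):
    """Rotate a 2-D vector by an angle proportional to its position
    (a single-frequency toy version of a rotary embedding)."""
    angle = pos / theta
    rot = np.array([[np.cos(angle), -np.sin(angle)],
                    [np.sin(angle),  np.cos(angle)]])
    return rot @ x

q = np.array([1.0, 2.0])   # toy query
k = np.array([0.5, -1.0])  # toy key

# <R_m q, R_n k> = q^T R_{n-m} k: the score depends only on the offset n - m,
# so the positional information entering attention is purely relative.
print(rotate(q, 3) @ rotate(k, 7))    # positions (3, 7), offset 4
print(rotate(q, 10) @ rotate(k, 14))  # positions (10, 14), same offset 4 -> same score
```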
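
For question 2, my guess is that the indices come from clipping the relative offset j - i to a maximum distance k and then shifting it to be non-negative, as in Shaw et al. (2018). Assuming the post uses k = 3 (an assumption on my part), every key more than 3 positions to the right of the query maps to the same index 2k = 6:

```python
import numpy as np

def relative_lookup_indices(seq_len, k=3):
    """Lookup indices in the style of Shaw et al. (2018): the offset j - i
    is clipped to [-k, k] and shifted by +k into the range [0, 2k]."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return np.clip(j - i, -k, k) + k

# Row 0 holds the indices the first query uses for each key position.
# Every key further than k = 3 positions away is clipped to the same
# index 2k = 6, which would explain why the values after the 3rd element
# are all 6.
print(relative_lookup_indices(10)[0])  # [3 4 5 6 6 6 6 6 6 6]
```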
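
For question 3, my current understanding (from the Music Transformer paper) is that skewing is an indexing trick: it rearranges the (query x relative-distance) logits Q Eᵀ into (query x key-position) order without materializing an L x L x d tensor of per-pair embeddings. A NumPy sketch of that reading (variable names and the naive comparison are mine):

```python
import numpy as np

L, d = 4, 8
rng = np.random.default_rng(0)
Q = rng.normal(size=(L, d))   # queries
E = rng.normal(size=(L, d))   # relative embeddings; row r <-> distance r - (L - 1)

A = Q @ E.T                   # A[i, r] = q_i . e_r, indexed by relative distance

# Naive gather: for each (query i, key j) with j <= i, pick the logit for
# the relative distance j - i.
naive = np.zeros((L, L))
for i in range(L):
    for j in range(i + 1):
        naive[i, j] = A[i, (L - 1) + (j - i)]

# Skewing: pad a zero column on the left, reshape to (L + 1, L), drop the
# first row. The lower triangle now matches the naive gather; the upper
# triangle is junk that the causal mask removes anyway.
skewed = np.pad(A, ((0, 0), (1, 0))).reshape(L + 1, L)[1:]

mask = np.tril(np.ones((L, L), dtype=bool))
assert np.allclose(naive[mask], skewed[mask])
print("skewing reproduces the naive relative logits on the causal part")
```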