- In the paper "GPT-NeoX-20B: An Open-Source Autoregressive Language Model", why do the authors state that rotary embeddings are a form of static relative positional embeddings?
- In the Medium article "How Self-Attention with Relative Position Representations works" (by ___), could anyone explain why the lookup indices after the 3rd element are all 6?
- What is the actual purpose of the skewing mechanism?

