Thanks for checking this, @GalacticKip7. I later found that simply rotating half is also a correct form of rotary embedding (see the following vector-vector multiply-add form and the equivalent matrix-vector multiplication form). As long as the matrix R satisfies the third equation for any q and k vectors, it’s a valid form of rotary embedding. The only caveat is to use the same form for both training and inference.
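For anyone who wants to check this numerically, here is a rough sketch in plain Python. It assumes the condition in question is the usual relative-position property: the dot product of q rotated at position m with k rotated at position n depends only on n - m. The function names (`rope_interleaved`, `rope_rotate_half`, `rope_angles`), the base of 10000, and the test vectors are my own choices for illustration, not taken from any particular library.

```python
import math

def rope_angles(pos, dim, base=10000.0):
    # One angle per 2-D rotation plane (assumed schedule: pos * base^(-2i/dim)).
    return [pos * base ** (-2 * i / dim) for i in range(dim // 2)]

def rope_interleaved(x, pos):
    # Rotates adjacent pairs: (x[0], x[1]), (x[2], x[3]), ...
    out = list(x)
    for i, t in enumerate(rope_angles(pos, len(x))):
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * math.cos(t) - b * math.sin(t)
        out[2 * i + 1] = a * math.sin(t) + b * math.cos(t)
    return out

def rope_rotate_half(x, pos):
    # Rotates pairs split across the two halves: (x[i], x[i + dim/2]).
    half = len(x) // 2
    out = list(x)
    for i, t in enumerate(rope_angles(pos, len(x))):
        a, b = x[i], x[i + half]
        out[i] = a * math.cos(t) - b * math.sin(t)
        out[i + half] = a * math.sin(t) + b * math.cos(t)
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Arbitrary example vectors (hypothetical values, dim = 8).
q = [0.3, -1.2, 0.7, 2.0, -0.5, 1.1, 0.9, -0.4]
k = [1.5, 0.2, -0.8, 0.6, 1.0, -1.3, 0.4, 0.7]

for rope in (rope_interleaved, rope_rotate_half):
    # Same offset (n - m = 4) at two different absolute positions.
    s1 = dot(rope(q, 3), rope(k, 7))
    s2 = dot(rope(q, 10), rope(k, 14))
    print(rope.__name__, math.isclose(s1, s2, abs_tol=1e-9))

# The two forms differ only by a fixed channel permutation:
# even-indexed channels first, then odd-indexed ones.
perm = list(range(0, len(q), 2)) + list(range(1, len(q), 2))
lhs = rope_rotate_half([q[j] for j in perm], 5)
rhs = [rope_interleaved(q, 5)[j] for j in perm]
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(lhs, rhs)))
```

The last check is why the caveat matters: the two forms pair up different channels, so projection weights trained with one layout expect that same layout at inference time.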