RoPE is in LLaMA, Mistral, DeepSeek, Qwen, and Gemma: nearly every frontier open model. It is explained geometrically everywhere. It is almost never traced arithmetically. Until now.
Why This Article Had to Exist
Every explanation of RoPE (Rotary Position Embedding) shows you a circle.
The vector rotates. Very good. Beautiful. A sine wave curves through space.
Not one of them shows you what happens to the number 0.588.
Not one of them computes what cos(1.000) × 0.588 - sin(1.000) × 0.117 equals.
Not one of them writes out the 4×4 rotation matrix and multiplies it by a query vector by hand.
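The expression above is small enough to check directly. Here is a minimal sketch, assuming 0.588 and 0.117 are the two components of one query pair and 1.000 is the rotation angle in radians (position 1 times a frequency of 1.0). `rotate_pair` is a hypothetical helper name for illustration, not a library function.

```python
import math

def rotate_pair(x0, x1, angle):
    """Rotate the 2D pair (x0, x1) counterclockwise by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    # Standard 2x2 rotation: [[c, -s], [s, c]] applied to (x0, x1).
    return (c * x0 - s * x1, s * x0 + c * x1)

# Assumed inputs: the pair (0.588, 0.117) rotated by 1.000 radian.
first, second = rotate_pair(0.588, 0.117, 1.000)

print(f"cos(1.000) * 0.588 - sin(1.000) * 0.117 = {first:.4f}")   # 0.2192
print(f"sin(1.000) * 0.588 + cos(1.000) * 0.117 = {second:.4f}")  # 0.5580

# A rotation never changes the length of the pair.
print(math.isclose(math.hypot(first, second), math.hypot(0.588, 0.117)))  # True
```

The 4×4 rotation matrix is two of these 2×2 blocks on the diagonal, one per pair, each with its own angle.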
This article does all of that. With the exact Q and K vectors from the companion transformer hand-trace article. Every multiplication written out. The cancellation proof in explicit arithmetic.
By the end, you will understand not just that RoPE encodes relative position, but how the algebra makes absolute positions disappear.
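Before the full hand trace, the claim can be checked numerically. The sketch below rests on several assumptions: 4-dimensional vectors, adjacent components paired as (x0, x1) and (x2, x3), and the common frequency schedule base^(-2i/d) with base 10000. The `q` and `k` values are illustrative placeholders, not the vectors from the companion article, and `rope` and `score` are hypothetical helper names.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each adjacent pair of `vec` by pos * theta_i (assumed convention)."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = base ** (-i / d)          # i = 0, 2, ... gives base^(-2k/d)
        angle = pos * theta
        c, s = math.cos(angle), math.sin(angle)
        out += [c * vec[i] - s * vec[i + 1], s * vec[i] + c * vec[i + 1]]
    return out

def score(q, k, m, n):
    """Dot product of q rotated to position m and k rotated to position n."""
    return sum(a * b for a, b in zip(rope(q, m), rope(k, n)))

# Illustrative values only (assumption): not the companion article's Q and K.
q = [0.588, 0.117, -0.3, 0.2]
k = [0.25, -0.4, 0.1, 0.7]

same_offset_a = score(q, k, 5, 3)     # offset m - n = 2
same_offset_b = score(q, k, 12, 10)   # offset m - n = 2, different absolute positions
other_offset = score(q, k, 5, 4)      # offset m - n = 1

print(same_offset_a, same_offset_b, other_offset)
print(math.isclose(same_offset_a, same_offset_b, abs_tol=1e-12))
```

The first two scores agree up to floating-point error even though the absolute positions differ, while the third, with a different offset, does not. Some implementations pair the first half of the vector with the second half instead of adjacent components; the cancellation holds either way.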
