This is the worst layperson's explanation of an AI component I have seen in a long time. It doesn't even seem AI-generated.
I think it is, though:
“TurboQuant, QJL, and PolarQuant are more than just practical engineering solutions; they’re fundamental algorithmic contributions backed by strong theoretical proofs. These methods don't just work well in real-world applications; they are provably efficient and operate near theoretical lower bounds.”
Maybe they quantized the model parameters a bit too much...
Aren’t polar coordinates still n-1 angles plus 1 radius for an n-dim vector? If so, I understand that the angles can be quantized better, but when the radius r is big, the error is large for coarsely quantized angles, right? What am I missing?
r is a single value per vector. You don't have to quantize it, you can keep it and quantize the billion+ other coordinates of the vector.
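To make that concrete, here is a minimal 2-D sketch (my own illustration, not code from the paper). It assumes the radius is kept in full precision and the angle is snapped to uniform bins; `quantize_angle`, `reconstruct`, and `relative_error` are hypothetical helper names. The absolute error does grow with r, but the error relative to the vector's length depends only on the angle bin width:

```python
import math

def quantize_angle(theta, bits):
    """Snap an angle in [0, 2*pi) to the center of one of 2**bits uniform bins."""
    levels = 2 ** bits
    step = 2 * math.pi / levels
    idx = int(theta // step) % levels
    return (idx + 0.5) * step

def reconstruct(x, y, bits):
    """Keep r exactly (assumption), quantize only the angle, map back to (x, y)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)
    q = quantize_angle(theta, bits)
    return r * math.cos(q), r * math.sin(q)

def relative_error(x, y, bits):
    """Reconstruction error divided by the length of the original vector."""
    xr, yr = reconstruct(x, y, bits)
    return math.hypot(x - xr, y - yr) / math.hypot(x, y)

small = relative_error(0.3, 0.4, bits=4)
large = relative_error(300.0, 400.0, bits=4)
print(small, large)  # same direction, 1000x the radius, same relative error
```

So a big r scales the absolute error, but it scales the vector by the same factor; the angular bin width is what sets the relative error.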
I did not understand what PolarQuant is.
Is it something like pattern-based compression, where the algorithm finds repeating patterns and creates an index of those common symbols or numbers?
https://mesuvash.github.io/blog/2026/turboquant-interactive/ has a little visualisation
I like the visualization, but I don’t understand the grid quantization. If every point is on the unit circle, aren’t all the grid coordinates near the center unused?
I think the grid can lie on the surface of the unit sphere.
1. Efficiently and recursively transform the KV embeddings into polar coordinates.
2. Quantize the resulting angles without the need for explicit normalization. This saves memory thanks to a key insight: the angles follow a known distribution that has an analytical form.
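A rough sketch of what those two steps could look like, under my own assumptions rather than the paper's exact scheme: the dimension is a power of two, coordinates are paired level by level, and the angles are quantized with plain uniform bins (the real method would use the known angle distribution to choose better levels, and any random preconditioning is omitted here). The names `to_polar`, `from_polar`, and `quantize` are hypothetical.

```python
import math

def to_polar(vec):
    """Pair up entries level by level; return (radius, angles per level)."""
    n = len(vec)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    radii, levels = list(vec), []
    while len(radii) > 1:
        pairs = list(zip(radii[0::2], radii[1::2]))
        levels.append([math.atan2(b, a) for a, b in pairs])
        radii = [math.hypot(a, b) for a, b in pairs]
    return radii[0], levels

def from_polar(radius, levels):
    """Invert to_polar: expand each radius into a pair, deepest level first."""
    radii = [radius]
    for angles in reversed(levels):
        expanded = []
        for r, t in zip(radii, angles):
            expanded.extend((r * math.cos(t), r * math.sin(t)))
        radii = expanded
    return radii

def quantize(levels, bits):
    """Uniform bins (assumption): level 0 spans [-pi, pi], deeper levels [0, pi/2]."""
    n, out = 2 ** bits, []
    for k, angles in enumerate(levels):
        lo, hi = (-math.pi, math.pi) if k == 0 else (0.0, math.pi / 2)
        step = (hi - lo) / n
        out.append([lo + (min(int((t - lo) / step), n - 1) + 0.5) * step
                    for t in angles])
    return out

vec = [3.0, -4.0, 1.0, 2.0, 0.5, -0.5, 2.0, 0.0]
radius, levels = to_polar(vec)          # 1 radius + 7 angles for 8 dims
approx = from_polar(radius, quantize(levels, bits=8))
print(radius, approx)
```

Note that an n-dim vector still ends up as n-1 angles plus one radius, and that the reconstructed vector has exactly the stored radius as its norm, so no separate normalization constant is needed per block in this sketch.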
Reminds me vaguely of the Burrows-Wheeler transform in bzip2.