
tonbistudio/turboquant-pytorch

From-scratch PyTorch implementation of Google's TurboQuant (ICLR 2026) for LLM KV cache compression. Achieves 5x compression at 3-bit precision with 99.5% attention fidelity.
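To make the headline numbers concrete: storing KV cache entries at 3 bits instead of 16-bit floats gives 16/3 ≈ 5.3x compression, consistent with the claimed ~5x. The sketch below is a generic per-row 3-bit uniform quantizer in NumPy, illustrating the storage trade-off only; it is an assumption for illustration, not TurboQuant's actual algorithm (which this repo implements in PyTorch).

```python
import numpy as np

def quantize_3bit(x):
    """Per-row asymmetric uniform quantization to 3-bit codes (8 levels).

    Illustrative sketch only -- NOT TurboQuant's method.
    Returns integer codes plus the scale/offset needed to reconstruct.
    """
    lo = x.min(axis=-1, keepdims=True)
    hi = x.max(axis=-1, keepdims=True)
    scale = (hi - lo) / 7.0                 # 2**3 - 1 = 7 steps between levels
    scale = np.where(scale == 0, 1.0, scale)  # guard constant rows
    codes = np.rint((x - lo) / scale).astype(np.uint8)  # values in [0, 7]
    return codes, scale, lo

def dequantize_3bit(codes, scale, lo):
    """Reconstruct approximate floats from 3-bit codes."""
    return codes.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
kv = rng.standard_normal((4, 64)).astype(np.float32)  # toy stand-in for KV rows
codes, scale, lo = quantize_3bit(kv)
recon = dequantize_3bit(codes, scale, lo)
err = float(np.abs(kv - recon).max())  # bounded by scale / 2 per row
```

Even this naive quantizer keeps the worst-case per-element error under half a quantization step; the interesting part of schemes like TurboQuant is doing better than uniform rounding while preserving attention scores.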

Source: GitHub AI Trending | By: tonbistudio
Language: Python | Stars: 284 | Forks: 36