Gemma 4 made local LLMs feel practical, private, and finally useful on everyday hardware.
turboquant-py implements the TurboQuant and QJL vector quantization algorithms from Google Research (ICLR 2026 / AISTATS 2026). It compresses high-dimensional floating-point vectors to 1-4 bits per ...
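To make the idea of compressing a vector to about one bit per coordinate concrete, here is a minimal pure-Python sketch of sign-bit quantization after a random Gaussian projection, in the spirit of QJL. This is not turboquant-py's API: the function names (`make_projection`, `quantize_1bit`, `estimate_inner_product`) and all parameters are hypothetical, and the estimator shown is the generic sign-random-projection one, assuming the query stays in full precision while only the stored vector is quantized.

```python
import math
import random


def make_projection(m, d, seed=0):
    """Build an m x d matrix with i.i.d. standard normal entries (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]


def project(S, x):
    """Compute the matrix-vector product S @ x using plain lists."""
    return [sum(s_i * x_i for s_i, x_i in zip(row, x)) for row in S]


def quantize_1bit(S, key):
    """Keep only the sign of each projected coordinate, plus the vector's norm."""
    signs = [1 if p >= 0 else -1 for p in project(S, key)]
    norm = math.sqrt(sum(v * v for v in key))
    return signs, norm


def estimate_inner_product(S, query, signs, norm):
    """Estimate <query, key> from the key's sign bits and norm.

    For a Gaussian row s, E[(s . q) * sign(s . k)] = sqrt(2/pi) * <q, k> / ||k||,
    so rescaling the average by sqrt(pi/2) * ||k|| gives an unbiased estimate.
    """
    proj_q = project(S, query)
    m = len(S)
    total = sum(p * s for p, s in zip(proj_q, signs))
    return math.sqrt(math.pi / 2) * norm / m * total


if __name__ == "__main__":
    d, m = 16, 2000
    S = make_projection(m, d, seed=1)
    rng = random.Random(2)
    key = [rng.gauss(0.0, 1.0) for _ in range(d)]
    query = [rng.gauss(0.0, 1.0) for _ in range(d)]
    signs, norm = quantize_1bit(S, key)
    exact = sum(a * b for a, b in zip(query, key))
    print("exact:", exact)
    print("estimate:", estimate_inner_product(S, query, signs, norm))
```

The stored representation is `m` sign bits and one float per vector. The estimate's error shrinks roughly as one over the square root of `m`, so the number of projections trades storage against accuracy.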
capellambse lets you read and write Capella models from Python without Java or the Capella tool, on any (reasonable) platform. We wanted to "talk" to Capella models from Python, but without any ...