Learning Vector Quantization Algorithm

Tether is shipping TurboQuant KV-cache quantization with Vulkan support into its QVAC SDK

Tether successfully integrated Google’s TurboQuant into the inference engine of its local AI framework, QVAC. It is the ...

Morning Overview on MSN

Google unveiled TurboQuant, a method that cuts the memory bottleneck slowing large AI models

Companies running large language models face a persistent bottleneck: the memory consumed by key-value caches during ...

manilatimes

Nota AI Has Two MoE Quantization Papers Accepted at ICML 2026 Workshop, Demonstrating Global Competitiveness in Large-Scale AI Optimization

Two papers on MoE-specific quantization algorithms accepted at a workshop held in conjunction with ICML 2026. Recognition follows Nota AI's overall win at the NVIDIA Nemotron Hackathon. Strengthening ...

IEEE

Power-Efficient Analog Hardware Architecture of the Learning Vector Quantization Algorithm for Brain Tumor Classification

Abstract: This study introduces a design methodology pertaining to analog hardware architecture for the implementation of the learning vector quantization (LVQ) algorithm. It consists of three main ...

VentureBeat

Cohere cracks lossless quantization and native citations with first full Apache 2.0 licensed open model Command A+

Google’s new AI compression could cut demand for NAND, pressuring Micron

Micron Slides 5% as Google’s AI Memory Algorithm Sparks Fresh Fears Across the Semiconductor Sector

Python implementation of the TurboQuant and QJL vector quantization algorithms.

Google's new TurboQuant algorithm speeds up AI memory 8x, cutting costs by 50% or more