LCLMs compress LLM context before decoding: 8.8x faster at 16x compression, beating every KV cache method tested. Open-sourced by NYU and Columbia.
Subquadratic, a company developing a novel generative artificial intelligence model, launched today with $29 million in seed funding. The new large language model, dubbed SubQ, uses what the company ...