AI Startups
PrismML Emerges From Stealth With $16.25 Million, Claims 1-Bit LLM Matches Full-Precision Performance
PrismML, a startup working on extreme model quantization, has emerged from stealth with $16.25 million in funding, claiming its 1-bit large language models match the performance of full-precision models.
The company's core assertion is that its quantization approach can reduce model weights to single-bit representations while preserving the performance characteristics of much larger, full-precision models. If validated at scale, the technology could substantially lower the compute and memory requirements for running frontier AI models, with implications for on-device inference, edge deployment, and the economics of data center operations.
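PrismML has not published its method, so the following is only an illustrative sketch of what single-bit weight representation generally means: each weight is reduced to its sign, with a single floating-point scale factor retained per tensor, in the spirit of prior binarization research. The function names and values here are hypothetical.

```python
import numpy as np

def quantize_1bit(w):
    """Map each weight to +1 or -1, keeping one fp scale per tensor.

    This is a generic binarization sketch, not PrismML's technique."""
    scale = np.mean(np.abs(w))           # single scalar preserves magnitude
    signs = np.where(w >= 0, 1.0, -1.0)  # one bit of information per weight
    return signs, scale

def dequantize(signs, scale):
    """Approximate reconstruction used at inference time."""
    return signs * scale

# Toy example with four weights
w = np.array([0.31, -0.12, 0.07, -0.44])
signs, scale = quantize_1bit(w)
w_hat = dequantize(signs, scale)
```

The open research question the article alludes to is how much accuracy survives this reconstruction error at scale, since `w_hat` only preserves the sign pattern and average magnitude of the original weights.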
The funding round was reported by the Wall Street Journal.
1-bit quantization has been an active area of AI research, with Microsoft publishing work on the technique in 2024 and other labs exploring the tradeoffs between compression and accuracy. Most prior approaches showed performance degradation at extreme compression ratios, particularly on complex reasoning tasks. PrismML claims its method avoids those tradeoffs, though independent benchmarks have not yet been published.
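The memory implications of extreme compression can be made concrete with back-of-the-envelope arithmetic. The model size below is a hypothetical example, not a figure from PrismML or the article:

```python
# Storage comparison for a hypothetical 7B-parameter model:
# fp16 weights vs. 1-bit weights packed eight to a byte.
params = 7_000_000_000

fp16_bytes = params * 2        # 16 bits per weight
onebit_bytes = params // 8     # 1 bit per weight, packed

fp16_gb = fp16_bytes / 1e9     # 14.0 GB
onebit_gb = onebit_bytes / 1e9 # 0.875 GB
ratio = fp16_bytes / onebit_bytes  # 16x reduction
```

A 16x reduction in weight storage is what makes on-device and edge deployment plausible; the contested question is whether accuracy holds at that ratio, since most prior work saw degradation on complex reasoning tasks.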
The startup enters a crowded field of companies attempting to make AI inference more efficient. Competitors include Groq, which uses custom silicon, as well as software-level approaches from companies such as Neural Magic and from open-source llama.cpp contributors.
PrismML said it plans to use the funding to expand its team and accelerate development of its inference stack.
Sources
Published by Tech & Business, a media brand covering technology and business.
This story was sourced from Wall Street Journal and reviewed by the T&B editorial agent team.