
Nvidia buys AI chip rival Groq's IP for $20 billion in its biggest deal ever

The deal is structured as a non-exclusive license of Groq’s technology alongside a broad hiring initiative, allowing Nvidia to avoid triggering a full regulatory merger review while still acquiring de facto control over the startup’s roadmap. GroqCloud, the company’s public inference API, will continue to operate independently for now.
Groq’s primary selling point is the simplicity of its architecture. Unlike general-purpose GPUs, the company’s chips use a single massive core and hundreds of megabytes of on-die SRAM. They follow a static execution model: the compiler pre-plans the entire program path and guarantees cycle-level determinism. The result is predictable latency, with no cache misses or stalls.
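To make that contrast concrete, here is a toy Python sketch of static scheduling: a "compiler" assigns every operation a fixed start cycle before anything runs, so timing is identical on every execution. The op list and latencies are invented for illustration; this is not Groq's actual toolchain or ISA.

```python
# Toy illustration of a statically scheduled execution model (hypothetical;
# not Groq's compiler). Every op is pinned to a fixed cycle at compile time,
# so there is no runtime scheduler, cache, or stall logic to vary timing.

OPS = [
    ("load_weights", 2),   # (op name, cycles it occupies)
    ("matmul",       4),
    ("activation",   1),
    ("store",        1),
]

def compile_schedule(ops):
    """Pre-plan the entire program path: each op gets a fixed start cycle."""
    schedule, cycle = [], 0
    for name, latency in ops:
        schedule.append((cycle, name))
        cycle += latency           # latencies are known at compile time
    return schedule, cycle         # total runtime is known before execution

def run(schedule, total_cycles):
    """Execution just replays the plan; timing never varies between runs."""
    for start, name in schedule:
        print(f"cycle {start:3d}: {name}")
    print(f"program completes at cycle {total_cycles} -- identical every run")

schedule, total = compile_schedule(OPS)
run(schedule, total)
```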
In a benchmark of the 70B-parameter Llama 2 model, Groq’s LPU sustained 241 tokens per second, and the company has internally reported even higher speeds on newer silicon. That throughput comes not from scaling up batch size but from optimizing single-stream performance, a crucial distinction for workloads that depend on real-time response rather than aggregate throughput.
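A quick back-of-envelope calculation shows why the single-stream framing matters. The sketch below converts throughput into user-perceived latency; the 241 tokens/s figure is the benchmark result cited above, while the GPU numbers are hypothetical placeholders used only for comparison.

```python
# Convert throughput to user-perceived latency. Only the 241 tokens/s LPU
# figure comes from the article's benchmark; GPU numbers are hypothetical.

def per_token_ms(tokens_per_s):
    return 1000.0 / tokens_per_s

def response_time_s(tokens_per_s, response_tokens=200):
    return response_tokens / tokens_per_s

lpu_single_stream = 241.0        # measured single-stream throughput
gpu_single_stream = 40.0         # hypothetical GPU rate at batch size one
gpu_batched_agg   = 40.0 * 32    # hypothetical aggregate rate at batch 32

print(f"LPU: {per_token_ms(lpu_single_stream):.1f} ms/token, "
      f"{response_time_s(lpu_single_stream):.2f} s per 200-token reply")
print(f"GPU (batch 1): {per_token_ms(gpu_single_stream):.1f} ms/token, "
      f"{response_time_s(gpu_single_stream):.2f} s per 200-token reply")

# Batching raises the GPU's *aggregate* tokens/s, but each individual user
# still waits on their own stream's rate -- which is why single-stream
# optimization matters for real-time workloads.
print(f"GPU (batch 32): {gpu_batched_agg:.0f} tokens/s aggregate, "
      f"still ~{per_token_ms(gpu_single_stream):.1f} ms/token per user")
```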
Nvidia’s GPUs, including the upcoming Rubin series, rely on high-bandwidth external memory (GDDR7 on consumer parts, HBM3e moving to HBM4 in the data center) and a highly parallel core layout. They scale efficiently for training and large-batch inference, but their performance drops at batch size one. Software optimization can mitigate some of this, but Groq’s approach sidesteps the problem entirely by removing external memory latency from the loop.
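A rough bandwidth calculation illustrates the bottleneck. At batch size one, every generated token must stream the model's full weight set from external memory, so per-token latency is bounded below by weight size divided by memory bandwidth. The figures below are illustrative approximations, not vendor specifications.

```python
# Why batch-size-one decoding is memory-bound on GPUs: a lower bound on
# per-token latency is weight_bytes / memory_bandwidth. Illustrative numbers.

params       = 70e9                 # Llama 2 70B parameter count
bytes_per_w  = 2                    # FP16 weights (assumption; no quantization)
weight_bytes = params * bytes_per_w # ~140 GB streamed per token
hbm_bw       = 3.35e12              # ~3.35 TB/s, roughly H100-class HBM

min_latency_s = weight_bytes / hbm_bw
print(f"min per-token latency: {min_latency_s * 1000:.1f} ms "
      f"-> at most ~{1 / min_latency_s:.0f} tokens/s at batch size one")

# Batching amortizes that weight traffic across many users; batch size one
# cannot. Holding weights in on-die SRAM (Groq's approach) removes the
# external-memory trip entirely -- at the cost of needing many chips, since
# each die holds only a few hundred megabytes.
```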
The acquisition grants Nvidia access to Groq’s entire hardware stack, encompassing the compiler toolchain and silicon design. More importantly, it brings in Groq’s engineering leadership, including founder Jonathan Ross, whose work on Google’s original TPU helped define the modern AI accelerator landscape. With this deal, Nvidia effectively compresses several years of inference-focused R&D into a single integration step.
Groq had emerged as one of the few companies capable of beating Nvidia on certain inference benchmarks, and its customer-facing cloud product was beginning to gain traction. The LPU’s strong performance in small-batch scenarios made it attractive to developers running generative models, a segment Nvidia has only recently begun to target directly.
By bringing Groq’s IP in-house, Nvidia neutralizes that competition and positions itself to offer a full-stack solution across training and inference. The company can now develop systems that pair its high-throughput GPUs with Groq’s low-latency LPUs, leveraging the strengths of each architecture. The eventual result should be a broader compute portfolio covering a wider range of model sizes and deployment targets.