
At GTC 2026 on March 16, Nvidia announced BlueField-4 STX, a modular reference architecture for accelerated storage designed to address the data access bottleneck limiting agentic AI inference.
Built around a new storage-optimized BlueField-4 DPU and ConnectX-9 SuperNIC, the platform targets GPU underutilization that occurs when AI agents operating across extended sessions and expanding context windows exceed the throughput of conventional storage paths. Nvidia says STX delivers up to five times the token throughput, four times better energy efficiency, and twice the page ingestion speed compared with traditional CPU-based storage architectures.
The specific issue Nvidia is targeting with STX is KV cache management. During transformer inference, the attention mechanism computes key-value (KV) pairs for every token in context, and these must be stored and retrieved for each subsequent generation step. As context windows grow into the hundreds of thousands of tokens, the KV cache outgrows GPU HBM capacity. The usual fallback is to offload to host DRAM or NVMe storage, but both routes pass through the CPU, adding latency that compounds with context length and stalls GPU execution while data is in transit.
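To put rough numbers on that growth, here is a back-of-the-envelope sketch of KV cache size as a function of context length. It uses the standard estimate of one key vector and one value vector per token, per layer, per KV head. The model dimensions below (80 layers, 8 KV heads, head dimension 128, 16-bit values) are illustrative assumptions, not figures from Nvidia or from any specific model, and the function name is hypothetical.

```python
def kv_cache_bytes(num_tokens, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Estimate KV cache size for a single sequence.

    The leading factor of 2 accounts for storing both a key and a value
    vector per token, per layer, per KV head.
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * num_tokens


# Hypothetical model configuration (assumed for illustration only).
LAYERS = 80
KV_HEADS = 8
HEAD_DIM = 128

if __name__ == "__main__":
    for tokens in (8_000, 128_000, 500_000):
        size_gib = kv_cache_bytes(tokens, LAYERS, KV_HEADS, HEAD_DIM) / 2**30
        print(f"{tokens:>7} tokens -> {size_gib:.1f} GiB per sequence")
```

Under these assumptions the cache costs 327,680 bytes per token, so it scales linearly with context length and is multiplied again by the number of concurrent sessions. That is the pressure that pushes the cache out of HBM and onto slower tiers.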
STX bypasses the host CPU by routing data through a dedicated accelerated storage layer via RDMA over Spectrum-X Ethernet. BlueField-4 manages NVMe SSDs directly and handles data integrity and encryption for the KV cache, keeping context accessible at the storage processor rather than transiting the host. The full stack runs on the Vera Rubin platform and integrates the Vera CPU, also announced at GTC on March 16, alongside ConnectX-9, Spectrum-X Ethernet, DOCA software, and AI Enterprise software. The first rack-scale implementation built on STX is the Nvidia CMX context memory storage platform.
Storage and infrastructure vendors co-designing systems based on STX include DDN, Dell Technologies, HPE, IBM, NetApp, and VAST Data, alongside manufacturing partners AIC, Supermicro, and Quanta Cloud Technology. Meanwhile, eight cloud and AI providers, including CoreWeave, Lambda, Mistral AI, and Oracle Cloud Infrastructure, committed to early adoption for context memory storage. STX-based platforms are expected from partners in the second half of 2026.
"Agentic AI is redefining what software can do — and the computing infrastructure behind it must be reinvented to keep pace," Jensen Huang, founder and CEO of Nvidia, said at GTC. "AI systems that reason across massive context and continuously learn require a new class of storage."