Meta’s multi-billion-dollar Graviton deal highlights intensifying CPU shortages in AI infrastructure — the industry signals a shift to Agentic inference workloa

Meta's multi-billion-dollar Graviton deal highlights intensifying CPU shortages in AI infrastructure — the industry signals a shift to Agentic inference workloa

When you purchase through links on our site, we may earn an affiliate commission. Here’s how it works .

Meta already has GPU and accelerator contracts worth hundreds of billions across Nvidia, AMD, Broadcom, Google , CoreWeave, and Nebius, and it went to AWS specifically for general-purpose CPUs. Santosh Janardhan, Meta's head of infrastructure, said in the joint announcement that "diversifying our compute sources is a strategic imperative," and that Graviton allows the company to "run the CPU-intensive workloads behind agentic AI with the performance and efficiency we need at our scale."

Graviton5 , which AWS unveiled at re: Invent in December, packs 192 Arm Neoverse V3 cores on a 3nm process with roughly 180 MB of L3 cache, a fivefold increase over Graviton4. AWS claims a 25% performance lift over its predecessor and 33% lower inter-core latency. AWS vice president Nafea Bshara confirmed that the contract runs for at least three years and that the majority of capacity will be deployed in the U.S.

You may like CPU requirements for AI workloads are multiplying, driving intensifying shortages and price hikes Inside Meta and AMD's $100 billion deal, and why AMD is giving up a slice of the company in return for GPU orders ‘CPUs are cool again,' Intel and AMD reporting spikes in CPU demand due to agentic AI The CPU-to-GPU ratio The meteoric rise of agentic AI is driving notable shifts in CPU-to-GPU ratios. While training LLMs relies on large deployments of GPUs, agentic inference is fundamentally different, involving processes like branching control flow, tool invocation, sandbox execution, validation loops, and orchestration across many concurrent sub-agents. All that work falls on CPUs.

In its recent earnings call, Intel’s CFO David Zinsner said that the ratios of CPUs to GPUs in data centers have already moved from 1:8 to 1:4, adding that as workloads continue migrating towards inference and agentic AI , ratios could converge to 1:1 or even tilt further in favor of CPUs. “As you think about the growth rate now going forward, it’s [CPU demand] going to become a significant part of the AI [total addressable market],” Zinsner said.

Arm has also quantified the rising demand for agentic AI in terms of core counts. At the company’s Arm Everywhere event in March, Arm launched its first in-house silicon product, the 136-core AGI CPU , with Meta as lead partner and customer. Arm CEO Rene Haas told the audience that a typical AI data center today requires around 30 million CPU cores per gigawatt of capacity. With agentic workloads, however, that figure rises to roughly 120 million cores per gigawatt, a fourfold increase driven by agents that run continuously, spawn sub-agents, and generate queries at more than 15 times the rate of human chatbot users.

Meanwhile, AMD CEO Lisa Su said at the Morgan Stanley TMT Conference in March that "we're seeing a significant CPU demand, frankly, as a result of the inference demand picking up." She added that "the CPU portion of the business has actually far exceeded my expectations in terms of demand."

The surge in CPU demand is running into a supply chain that planned for a GPU-dominated world, leading to server CPU lead times stretching to roughly six months , up from about two weeks before the agentic demand spike.

Intel acknowledged on its Q1 earnings call that unmet Xeon demand "starts with a B," referring to billions of dollars in lost revenue, with CEO Lip-Bu Tan saying that “In recent months, we have seen clear signs that the CPU is reinserting itself as the indispensable foundation of the AI era." Revenue would have been higher had Intel been able to produce more chips, the company said: Q1 data center and AI revenue came in at $5.05 billion, up 22% year-over-year.

Server CPU prices have climbed 10% to 20% since March, with analysts expecting a further 8% to 10% increase in the second half of the year. Intel raised prices in both February and March, with a third increase reportedly planned for May, bringing the cumulative hike to roughly 30% above 2025 levels. AMD’s Lisa Su told the Morgan Stanley audience that AMD's own customers described the demand as something that "was perhaps… under-forecasted," adding: "We are in the process of catching up."

Intel, AMD server CPUs reportedly suffering from supply shortages in China

Key considerations

Investor positioning can change fast
Volatility remains possible near catalysts
Macro rates and liquidity can dominate flows

Reference reading

More on this site

Informational only. No financial advice. Do your own research.

Key considerations

Reference reading

More on this site

Related posts:

Leave a Comment Cancel reply