Demand for data center CPUs has surged, and AI agents are responsible – why the CPU to GPU ratio is more important than ever for hyperscalers

And as agentic AI becomes the norm, there’s a greater need for that CPU backbone to keep things running properly. “Always-on, multi-step reasoning systems don't create brief orchestration bursts around GPU workloads,” said Beckett. “They demand high-core-count CPUs running at sustained loads, continuously. The infrastructure requirement was always structural. It's just now unavoidable.”

When data centers were first being specced for AI training and inference in the early days of the generative AI revolution, those building them heavily favored GPUs. Chatbot workloads called for between four and eight GPUs for every CPU, because the parallel computations needed to answer user requests were GPU-inference heavy.

But as the main use case of AI changes from chatbots to agents, the requirements have also altered. In a chatbot, a slight delay for in-depth inference while an AI model ‘thinks’ was seen as an acceptable interface choice. Agentic AI, by contrast, requires rapid responses and the smooth coordination of tool calls and much more, so latency can be a killer. Bolstering CPU counts can help prevent small delays from spinning out into something more significant and breaking the entire agentic stack.

AMD, one of the major manufacturers of CPUs, has seen that shift first-hand. The company had previously forecast that the CPU market would grow at a rate of around 18% annually, but says that the change in requirements has materially changed the market. The rate of growth has now nearly doubled to 35% a year, AMD claims, and the market will be worth $120 billion by the end of the decade.
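
To see what a jump from 18% to 35% annual growth means in practice, the compounding can be sketched in a few lines of Python. This is a minimal illustration, not AMD's model: the five-year horizon and the normalized starting size of 1.0 are assumptions chosen for the example, and the function names are hypothetical.

```python
def project_market(base: float, annual_rate: float, years: int) -> float:
    """Compound a starting market size forward at a fixed annual growth rate."""
    return base * (1.0 + annual_rate) ** years


def growth_gap(old_rate: float, new_rate: float, years: int) -> float:
    """How many times larger the market ends up under new_rate than under old_rate."""
    return project_market(1.0, new_rate, years) / project_market(1.0, old_rate, years)


if __name__ == "__main__":
    YEARS = 5  # assumed horizon, not a figure from the article
    for rate in (0.18, 0.35):  # the two annual growth rates AMD cites
        size = project_market(1.0, rate, YEARS)
        print(f"{rate:.0%} a year for {YEARS} years -> {size:.2f}x the starting size")
    print(f"gap between the two forecasts: {growth_gap(0.18, 0.35, YEARS):.2f}x")
```

Under those assumptions, the higher rate leaves the market roughly twice as large after five years as the earlier forecast would have, which is why a change in the annual rate matters more than the headline percentages suggest.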

“What AMD and Arm's results are telling us is that this is a structural, not cyclical requirement,” said Roger Cummings, CEO of PEAK:AIO, in an interview with Tom’s Hardware Premium. “In actuality, two structural shifts are driving the demand surge: the rise of agentic AI and the need for deterministic, predictable performance at rack scale.”

Much of that CPU demand is being driven by hyperscalers, who recognize the integral role that CPUs play in developing the AI clusters that are likely to power the economy in the years to come. “As GPU clusters scale, CPUs are taking on larger roles in orchestration, memory management, networking, storage coordination, and inference handling,” said Jeff Moore, vice president of strategic partnerships at Aegis Cooling, which specializes in next-gen liquid cooling solutions for AI and high-performance computing infrastructure, in an interview with Tom’s Hardware Premium.

There’s a rise in CPU-to-GPU ratios inside AI deployments, said Moore, “particularly because distributed AI workloads generate significant demand for general-purpose compute, memory bandwidth, and east-west data movement.” A recent TrendForce analysis points out that CPUs’ contribution to latency – accounting for nearly 91% of all the delay in responses – is something that AI deployments are trying desperately to counteract.
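
The weight of that 91% figure is easier to see with a quick Amdahl's law calculation. The sketch below assumes, as a simplification, that the share can be treated as a fixed fraction of end-to-end response time and that the CPU and GPU portions do not overlap. The speedup factors are hypothetical inputs, not measurements, and the function name is illustrative.

```python
def overall_speedup(cpu_share: float, cpu_speedup: float = 1.0, gpu_speedup: float = 1.0) -> float:
    """Amdahl's law for a response whose time splits into a CPU part and a GPU part."""
    if not 0.0 <= cpu_share <= 1.0:
        raise ValueError("cpu_share must be between 0 and 1")
    gpu_share = 1.0 - cpu_share
    # New response time, relative to an original time of 1.0
    new_time = cpu_share / cpu_speedup + gpu_share / gpu_speedup
    return 1.0 / new_time


if __name__ == "__main__":
    CPU_SHARE = 0.91  # the TrendForce figure quoted in the article
    # Hypothetical scenario 1: make only the GPU side ten times faster
    print(f"10x faster GPUs: {overall_speedup(CPU_SHARE, gpu_speedup=10.0):.2f}x overall")
    # Hypothetical scenario 2: make only the CPU side twice as fast
    print(f"2x faster CPUs:  {overall_speedup(CPU_SHARE, cpu_speedup=2.0):.2f}x overall")
```

With a 0.91 CPU share, making the GPU side ten times faster improves end-to-end response time by under 10%, while doubling CPU-side throughput alone yields a speedup of about 1.8x. That asymmetry is the arithmetic behind the push toward higher CPU counts.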

That shift is now visible not just in financial forecasts, but in the physical design of AI infrastructure itself. In early generative AI deployments, racks were often built around dense GPU configurations, with CPUs effectively treated as supporting components – enough to keep the system running, but not a bottleneck concern. Things are shifting now. “In the media, an AI rack is pictured as a giant box of GPUs,” said Hommer Zhao, founder of OurPCB, a PCB manufacturer with more than 15 years’ experience, in comments to Tom’s Hardware Premium. “But from a hardware design perspective, a GPU is just a very fast, very dumb engine. It cannot talk to the internet or pull data from a hard drive.”

Rather than a single host CPU loosely paired with multiple GPUs, hyperscalers are deploying configurations with higher core-count CPUs, more memory channels, and, in some cases, multiple CPUs per node to keep pace with data movement demands.

There are also thermal and power considerations shaping how racks are populated. High-core-count CPUs, especially those optimized for cloud workloads, are being selected not just for raw performance but for efficiency under sustained load. In liquid-cooled environments, CPUs are increasingly part of the same thermal design envelope as GPUs, rather than an afterthought cooled separately with air.

Recent results from AMD and Arm reinforce the idea that this is not a short-term correction but a deeper architectural shift. AMD has reported strong growth in its data center CPU segment, driven in large part by hyperscaler demand for its EPYC processors, which offer high core counts and memory bandwidth well suited to AI orchestration tasks.

Arm, meanwhile, is benefiting from hyperscalers designing their own custom silicon. “Arm accounts for close to half of all compute shipped to top hyperscalers in 2025, with over a billion Neoverse cores deployed,” said Beckett. “Those are rack-level architectural decisions made years ago.” AWS’s Graviton, Google’s Axion, and Microsoft’s Cobalt chips all reflect a move toward CPU architectures tailored for specific workloads: high-throughput, energy-efficient, and tightly integrated with networking and storage. Arm’s licensing model positions it at the center of this trend, and its recent financial results highlight how significant that hyperscaler-driven demand has become.

Both sets of results point to a change in how CPUs are being valued. In traditional enterprise contexts, the hardware was often general-purpose and interchangeable. In hyperscaler environments, it’s becoming a specialized infrastructure component, tuned for specific roles within AI systems, whether orchestration, inference at the edge, or data preprocessing.

Taken together, the changes in rack design and vendor performance suggest that CPUs aren’t a secondary consideration in AI infrastructure planning any more. Instead, they are becoming a critical factor in determining overall system efficiency and cost.

“The spotlight hasn't revealed something new,” said Beckett. “It's just finally illuminating what serious infrastructure teams never stopped building on.”

Chris Stokel-Walker is a Tom's Hardware contributor who focuses on the tech sector and its impact on our daily lives, online and offline. He is the author of How AI Ate the World, published in 2024, as well as TikTok Boom, YouTubers, and The History of the Internet in Byte-Sized Chunks.
