Nvidia CEO Jensen Huang explains why SRAM isn’t here to eat HBM’s lunch — high bandwidth memory offers more flexibility in AI deployments across a range of work

"You might be able to take one particular workload and push it to the extreme," Huang said. "But that 10% of the workload, or even 5% of the workload, if it’s not being used, then all of a sudden that part of the data center could have been used for something else." In other words, Huang is arguing that peak efficiency on a single task matters less than consistent usefulness across many.

The original question also touched on open AI models and whether they might erode Nvidia’s leverage over the AI stack. The suggestion was that open models, combined with SRAM-heavy designs and cheaper memory, could reduce reliance on Nvidia’s most expensive GPUs and improve margins across the stack.

While Huang has praised open models publicly, and Nvidia has released its own open weights and datasets, his CES remarks made clear that openness does not eliminate infrastructure constraints. Training and serving competitive models still require enormous compute and memory resources, regardless of licensing. Open weights do not remove the need for large memory pools, fast interconnects, or flexible execution engines; they simply change who owns the model.

This is important because many open models are evolving rapidly and, as they incorporate larger context windows, more experts, and multimodal inputs, their memory footprints will grow. Huang’s emphasis on flexibility applies here as well; supporting open models at scale does not reduce the importance of HBM or general-purpose GPUs. In many cases, it increases it.
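To give that claim a sense of scale, here is a back-of-the-envelope sketch assuming Llama-70B-like dimensions (80 layers, grouped-query attention with 8 KV heads, 128-dim heads, fp16); these parameters are illustrative assumptions, not figures from the article. It shows how the KV cache alone grows linearly with context length, which is why long-context open models push working sets far past the few hundred megabytes of on-chip SRAM and into multi-gigabyte HBM territory.

```python
# Back-of-the-envelope KV-cache sizing (assumed Llama-70B-like dimensions;
# illustrative only). Each token stores one key and one value vector per
# layer per KV head, so cache size scales linearly with context length.

def kv_cache_bytes(context_len: int,
                   n_layers: int = 80,       # assumed transformer depth
                   n_kv_heads: int = 8,      # grouped-query attention
                   head_dim: int = 128,      # per-head dimension
                   bytes_per_elem: int = 2   # fp16
                   ) -> int:
    """Bytes needed to cache keys and values for a single sequence."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * context_len

for ctx in (8_192, 32_768, 131_072):
    gb = kv_cache_bytes(ctx) / 1e9
    print(f"{ctx:>7} tokens -> ~{gb:5.1f} GB of KV cache per sequence")

# Roughly 2.7 GB at 8K tokens grows to roughly 43 GB at 128K, for a single
# sequence and before counting model weights. SRAM-only designs must shard
# that across many chips; a large HBM pool holds it on far fewer devices.
```

Under these assumptions, the cache grows from about 2.7 GB at an 8K context to about 43 GB at 128K for a single sequence, which illustrates why Huang ties longer contexts and multimodal inputs to continued demand for large HBM pools.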

The implication is that open source AI and alternative memory strategies are not existential threats to Nvidia’s platform. They are additional variables that increase workload diversity. That diversity, in Nvidia’s view, strengthens the case for hardware that can adapt rather than specialize.

Ultimately, Huang’s CES comments amount to a clear statement of priorities. Nvidia is willing to accept higher bill-of-materials costs, reliance on scarce HBM, and complex system designs because they preserve optionality. That optionality protects customers from being locked into a narrow performance envelope and protects Nvidia from sudden shifts in model architecture that could devalue a more rigid accelerator lineup.

This stance also helps explain why Nvidia is less aggressive than some rivals in pushing single-purpose inference chips or extreme SRAM-heavy designs. Those approaches can win benchmarks and attract attention, but they assume a level of workload predictability that the current AI ecosystem no longer offers.

Huang’s argument is not that specialized hardware has no place. Rather, it is that in shared data centers, flexibility remains the dominant economic factor. As long as AI research continues to explore new architectures and hybrid pipelines, that logic is unlikely to change.

For now, Huang seems confident that customers will continue to pay for that flexibility, even as they complain about the cost of HBM and the price of GPUs. His remarks suggest the company sees no contradiction there. That view may be challenged if AI models stabilize or fragment into predictable tiers, but Huang made it clear that Nvidia does not believe that moment has arrived.

Luke James is a freelance writer and journalist. Although his background is in law, he has a personal interest in all things tech, especially hardware and microelectronics, and anything regulatory.
