
Standard HBM4E solutions will likely be able to use HBM4 base dies, though some memory makers may migrate to base dies using the N5 or N3P process technologies for higher performance and efficiency.
Virtually all leading DRAM makers have introduced proprietary DRAM solutions with some degree of customization over the past decade, but none has gained traction. Starting with HBM4E, HBM will get a separate branch of customized solutions, which is set to feature unique capabilities and proprietary interfaces.
At a high level, C-HBM4E is an HBM4E memory stack with a custom base die. The stack retains standard HBM4E memory devices, which comply with the clock and electrical requirements set by JEDEC. The base die, however, can now be customized, which shifts the emphasis from raw bandwidth to integrating custom logic directly into the memory stack. There are several ways to do this.
The easiest way, described by Rambus, is to retain the standard HBM4E interface and add custom logic and/or caches to the base die to gain features or performance. As long as the stack remains compliant with the HBM4E protocol and its supporting firmware and software stacks, this approach may raise memory subsystem performance by means other than higher transfer rates or wider I/O.
A more complex method, envisioned by TSMC and Rambus, is to place the HBM4E memory controller and a custom die-to-die (D2D) interface directly into the logic base die. A large part of the industry's focus is on reducing the number of traces required between the processor and the HBM base die, and a custom D2D interface will do just that. By shrinking the interface width, each memory stack consumes fewer I/O pins, which enables a single SoC to attach a greater number of HBM stacks without increasing package size or complexity.
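To make the pin-budget argument concrete, here is a minimal arithmetic sketch. Every number in it is a hypothetical placeholder chosen to give round results; the I/O budget, the signal overheads, and the narrow D2D width are assumptions for illustration, not figures from TSMC, Rambus, or JEDEC.

```python
# Illustrative arithmetic only. All values are hypothetical placeholders,
# not vendor or JEDEC figures.

def signals_per_stack(data_width: int, overhead_signals: int) -> int:
    """Total processor-side signals one memory stack consumes."""
    return data_width + overhead_signals


def max_stacks(io_budget: int, per_stack: int) -> int:
    """How many stacks fit into a fixed I/O budget on the SoC."""
    return io_budget // per_stack


IO_BUDGET = 20_000  # hypothetical number of signals the SoC can spend on memory

# Hypothetical wide, standard-style interface versus a hypothetical narrow D2D link.
wide = signals_per_stack(data_width=2048, overhead_signals=452)
narrow = signals_per_stack(data_width=512, overhead_signals=113)

wide_stacks = max_stacks(IO_BUDGET, wide)
narrow_stacks = max_stacks(IO_BUDGET, narrow)

print(f"wide interface:   {wide} signals/stack, {wide_stacks} stacks")
print(f"narrow interface: {narrow} signals/stack, {narrow_stacks} stacks")
```

The sketch deliberately ignores per-pin signaling rate. A narrower link has to run each pin faster to deliver the same bandwidth per stack, which is the job of the custom D2D PHY.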
A custom die made using TSMC's N3P technology would allow packing in an HBM4E memory controller, a custom D2D PHY, and potentially some additional logic. For example, KAIST envisions the integration of near-memory compute (NMC) processors, which would make at least some C-HBM4E solutions systems-on-chip (SoCs) with basic processing capabilities.
If near-memory compute logic is indeed integrated into C-HBM4E, the software stack must evolve to become topology-aware and memory-aware, rather than treating a C-HBM4E stack as 'just' memory. Without changes to toolchains, drivers, and runtimes, near-memory compute becomes invisible silicon—present in hardware, but unused by software.
Runtime systems and compilers will need explicit knowledge of bank structure, channel placement, and in-memory execution units so that workloads can be scheduled where data physically resides, instead of being moved across the fabric. Programming models will also need extensions to work with in-memory compute, or with multi-tier memory systems in general. Finally, operating systems must support heterogeneous memory domains with non-uniform latency and asymmetric coherence, while profilers must observe and optimize execution occurring inside memory devices.
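As a rough illustration of what scheduling work where the data resides could look like, here is a toy placement routine. It is a sketch under stated assumptions, not any vendor's runtime: the `Task` fields, the `place` function, and the size threshold are all hypothetical names and values invented for this example.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class Task:
    name: str
    stack: int          # hypothetical: index of the stack holding the task's data
    bytes_in: int       # amount of data the task reads
    nmc_capable: bool   # hypothetical: kernel can run on a near-memory unit


def place(task: Task, nmc_stacks: Set[int], move_threshold: int) -> Tuple[str, Optional[int]]:
    """Pick an execution site for a task.

    Run on the near-memory unit only if the kernel supports it, the stack
    holding the data actually has such a unit, and the data is large enough
    that moving it to the host would dominate. Otherwise fall back to the host.
    """
    if task.nmc_capable and task.stack in nmc_stacks and task.bytes_in >= move_threshold:
        return ("nmc", task.stack)
    return ("host", None)


GIB = 2**30
nmc_stacks = {0, 1}  # hypothetical: only stacks 0 and 1 carry compute logic

tasks = [
    Task("reduce", stack=0, bytes_in=8 * GIB, nmc_capable=True),
    Task("sort", stack=2, bytes_in=8 * GIB, nmc_capable=True),
    Task("lookup", stack=1, bytes_in=4096, nmc_capable=True),
]

placements = {t.name: place(t, nmc_stacks, move_threshold=GIB) for t in tasks}
for name, site in placements.items():
    print(name, site)
```

Even this toy version needs three pieces of information that software does not have when it treats a stack as plain memory: which stack holds the data, which stacks contain execution units, and how expensive it is to move the data.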
If the figures published by TSMC and GUC are to be believed, then HBM's raw performance is set to increase by around 2.5 times within the next few years, thanks to HBM4E. This development opens the door to memory subsystems with a 1 TB capacity and a whopping 48 TB/s of bandwidth. If custom compute logic inside HBM base dies is adopted by the industry, this might be the biggest shift in how computers work in decades.
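For a sense of scale, the aggregate figure can be related to per-stack bandwidth with simple arithmetic. The sketch below assumes a 12.8 GT/s per-pin data rate and a 2048-bit standard interface per stack; both inputs are assumptions of this example, and the result is not a statement about the configuration TSMC and GUC actually have in mind.

```python
# Back-of-the-envelope arithmetic under stated assumptions, not vendor data.
ASSUMED_RATE_TPS = 12_800_000_000   # assumed per-pin transfer rate (12.8 GT/s)
ASSUMED_WIDTH_BITS = 2048           # assumed data interface width per stack
TARGET_BYTES_PER_S = 48 * 10**12    # the 48 TB/s aggregate figure (decimal TB)


def stack_bandwidth(rate_tps: int, width_bits: int) -> int:
    """Peak bytes per second of one stack: transfers/s * bits per transfer / 8."""
    return rate_tps * width_bits // 8


def stacks_needed(target: int, per_stack: int) -> int:
    """Smallest whole number of stacks whose combined peak meets the target."""
    return -(-target // per_stack)  # ceiling division on integers


per_stack = stack_bandwidth(ASSUMED_RATE_TPS, ASSUMED_WIDTH_BITS)
count = stacks_needed(TARGET_BYTES_PER_S, per_stack)

print(f"per stack: {per_stack / 1e12:.4f} TB/s")
print(f"stacks needed for 48 TB/s under these assumptions: {count}")
```

A custom D2D interface could deliver a different per-stack figure, so the stack count here only shows how the arithmetic works under the assumed standard interface.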
Anton Shilov, Contributing Writer. Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and the latest fab tools to high-tech industry trends.