The rise of local agentic computing faces a brutal reality: rising DRAM prices, with RTX Spark and Gorgon Halo chips subject to a 63% DRAM contract price hike this quarter

It’s all well and good that Nvidia and AMD are releasing machines like the RTX Spark and the Gorgon Halo line-up. However, Samsung, SK hynix, and Micron have all shifted the bulk of their wafer capacity toward high-bandwidth memory (HBM) for AI accelerators because HBM carries far higher margins than commodity DRAM, and the supply of conventional memory has tightened as a direct result. HP told investors in February that memory now accounts for roughly 35% of the cost of building a PC, up from 15% to 18% a quarter earlier.

SK Group chairman Chey Tae-won, speaking at Computex 2026 on the show’s official opening day, repeated his position that the shortage will run through 2030, despite the company's intention to double wafer capacity within the next five years. New fabs from all three makers are under construction, but none will reach volume production before late 2027 at the earliest, and most forecasts now predict a structurally higher price floor that persists even after the acute shortage eases.

The 192GB in a Gorgon Halo box, the 128GB in an RTX Spark or DGX Spark, and the LPDDR5X soldered into every AI laptop announced at Computex all come off wafers the memory makers would otherwise sell as HBM. That’s why Nvidia raised the price of the DGX Spark by $700 in February without changing a single spec, and why component makers have begun passing memory costs through directly. One vendor has even taken the on-the-nose approach of adding a flat memory surcharge to every purchase, and in some cases, smaller buyers are now quoted prices that change by the hour.

A single 192GB pool lets an APU hold a model that would otherwise require a multi-GPU server, but capacity alone doesn’t make that model run quickly. Dense language model inference reads close to the full set of active weights from memory for every token generated, so generation speed is governed by memory bandwidth divided by the per-token weight footprint, not by how much memory sits idle.
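As a rough illustration of that relationship, the sketch below estimates an upper bound on generation speed by dividing memory bandwidth by the bytes of weights read per token. The function name and the example figures (a 200 GB/s memory system, a 70-billion-parameter dense model at 8-bit and 4-bit precision) are assumptions chosen for illustration, and the estimate ignores compute limits, KV cache reads, and other overheads.

```python
def decode_tokens_per_second(bandwidth_gb_s: float,
                             active_params_billions: float,
                             bytes_per_param: float) -> float:
    """Bandwidth-bound ceiling on decode speed.

    Assumes every generated token reads all active weights exactly once
    and that nothing else competes for memory bandwidth.
    """
    # 1e9 parameters * bytes per parameter = gigabytes read per token
    weight_footprint_gb = active_params_billions * bytes_per_param
    return bandwidth_gb_s / weight_footprint_gb


# Hypothetical example: 70B dense model on a 200 GB/s memory system.
for label, bytes_per_param in (("8-bit", 1.0), ("4-bit", 0.5)):
    rate = decode_tokens_per_second(200.0, 70.0, bytes_per_param)
    print(f"{label}: at most {rate:.1f} tokens/s")
```

Doubling the memory pool changes nothing in this estimate; only more bandwidth or a smaller per-token footprint moves the result.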

Gorgon Halo keeps the same 256-bit LPDDR5X-8000 interface as Strix Halo, which tops out around 256 GB/s in theory and which independent testers have measured closer to 212 GB/s on the GPU. By comparison, the Apple M3 Ultra that AMD and Nvidia are chasing on capacity is rated at 819 GB/s, and an RTX 5090 moves data at 1,792 GB/s.
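The theoretical figure follows directly from the interface width and transfer rate. The short sketch below reproduces it and compares it with the other bandwidth numbers quoted above; the helper name is hypothetical, and the calculation uses decimal gigabytes and ignores protocol overhead.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mt_s: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal) for a memory interface."""
    bytes_per_transfer = bus_width_bits / 8
    # MT/s * bytes per transfer = MB/s; divide by 1000 for GB/s
    return bytes_per_transfer * transfer_rate_mt_s / 1000


halo_peak = peak_bandwidth_gb_s(256, 8000)  # 256-bit LPDDR5X-8000

# Figures quoted in the article, in GB/s.
quoted = {
    "Strix Halo (measured on GPU)": 212,
    "Apple M3 Ultra": 819,
    "RTX 5090": 1792,
}

print(f"256-bit LPDDR5X-8000 peak: {halo_peak:.0f} GB/s")
for name, gb_s in quoted.items():
    print(f"{name}: {gb_s} GB/s ({gb_s / halo_peak:.2f}x the Halo peak)")
```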

This gap explains why a dense 70-billion-parameter model fully resident on a Strix Halo iGPU lands in the low single digits of tokens per second, regardless of how much headroom the memory pool has. Our own Corsair AI Workstation 300 review found that Nvidia's slightly higher-bandwidth GB10 pulled ahead of Strix Halo as context length grew, for exactly this reason.

Capacity matters most for mixture-of-experts models, which activate only a fraction of their parameters per token and run far faster than their total size suggests, and for long-context agentic workloads, where it’s the KV cache rather than the model weights that consumes memory. It’s these use cases that AMD’s agentic pitch points at, with leaked details on the next-gen Medusa Halo parts showing a move to LPDDR6 and as much as 80% more bandwidth.
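To make both cases concrete, the sketch below sizes a KV cache with the standard keys-plus-values formula and compares the resident footprint of a mixture-of-experts model with the weights it reads per token. Every model dimension here (80 layers, 8 KV heads, a head dimension of 128, 16-bit cache values, a 300-billion-parameter model with 30 billion active parameters at 4-bit) is a hypothetical assumption, not a description of any specific product or model.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2) -> int:
    """KV cache size for one sequence: keys and values for every layer and token."""
    # Factor of 2 covers the separate key and value tensors.
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value


def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    """Weight footprint in decimal gigabytes."""
    return params_billions * bytes_per_param


# Hypothetical long-context case: 131,072 tokens of context.
cache = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128, context_tokens=131_072)
print(f"KV cache: {cache / 2**30:.0f} GiB for one 131,072-token sequence")

# Hypothetical MoE case: 300B total parameters, 30B active per token, 4-bit weights.
resident = weights_gb(300, 0.5)   # must fit in memory
per_token = weights_gb(30, 0.5)   # read from memory for each token
print(f"MoE weights: {resident:.0f} GB resident, {per_token:.0f} GB read per token")
```

Under these assumptions the model needs a large pool to stay resident but reads only a tenth of it per token, which is why capacity pays off more for this class of workload than for dense models.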

Agentic AI is also something of a pricing tool for vendors, beyond describing a workload. A 192GB workstation sold on the promise of running 300-billion-parameter models locally can hold a four-figure price more comfortably than a mini PC sold on cores and clocks, and it justifies loading the most expensive component in the build to its maximum. AMD's Ryzen AI Halo developer box, a 128GB Strix Halo system, opens pre-orders in June at $3,999 through Micro Center, matching the launch price of Acer's GB10-based Veriton GN100 and the original DGX Spark before its increase.

Apple, the one vendor with the scale to hold priority memory allocation, has moved the other way. It pulled the 512GB Mac Studio configuration from sale, raised the price of its 256GB upgrade, and in May removed several more high-memory Mac mini and Mac Studio options as supply tightened.

This shows that expanding capacity while holding the line on premium pricing is a choice the AMD and Nvidia camps are making, not one that the market is forcing. Whether buyers accept it rests on whether local agentic inference delivers enough value over cloud services to justify the outlay on machines whose memory capacity outpaces the bandwidth that ultimately determines what that memory can do.

Luke James is a freelance writer and journalist. Although his background is in legal, he has a personal interest in all things tech, especially hardware and microelectronics, and anything regulatory.
