AMD challenges Nvidia’s DGX Spark with $3,999 Ryzen AI Halo with Windows 11 support — Strix Halo desktop undercuts Nvidia by $700, packs 128GB of unified memory


Kunal Khullar, News Contributor

Kunal Khullar is a contributing writer at Tom’s Hardware. He is a long time technology journalist and reviewer specializing in PC components and peripherals, and welcomes any and every question around building a PC.

usertests
Corsair links are out of stock, but I think there are other 128 GB Strix Halo models out there that are coming in well under $4k, so it's a tough sell.

RDNA3.5 isn't as AI-ready as RDNA4/5. I think RDNA4 is already 2-8x faster per CU depending on the data type. So waiting (until 2028?) for an RDNA5-based Medusa Halo could be the right move if you're a dabbler and not a pro(sumer).

daviburg1979
The software support is an essential differentiator here. The latest LLM models just work, and reliably so, on the DGX, while AMD's stack is unreliable. All this hardware is pointless if you can't run your app on it.

dva852
usertests said: RDNA3.5 isn't as AI-ready as RDNA4/5. I think RDNA4 is already 2-8x faster per CU depending on the data type.

For LLM use, the 8060S's compute capability doesn't matter for decoding (token generation). The bottleneck is memory bandwidth, which determines tokens/sec, and Strix is roughly equivalent to Spark there. Spark is 2-3x faster than the 8060S for context processing. But for a normal context size of, say, 8-32K, time-to-first-token (TTFT) is a few seconds, which is trivial compared to decoding time. It's only with larger (64K-256K) context sizes that TTFT becomes more noticeable.

Reviews of DGX Spark at release all knocked the memory bandwidth bottleneck. But 256-bit LPDDR5X has been decided as "good enough" for current-gen local AI, likely for lack of anything faster that's cost-effective. That'll change once LPDDR6 hits next year, with ~50% higher bandwidth. Once the local AI train gets moving, there'll be iterative increases down the line. We're still at the starting line.

Getting back to Spark and Strix Halo, the low memory bandwidth means running large models will be slow. A variety of factors come into play, such as sparsity (dense/MoE), backend (CUDA/Vulkan/ROCm), quant type (Q4_K_M/MXFP4/NVFP4), context length, etc. The upshot is that speed ranges from very slow to usable, but never blazingly fast.

That's why Gorgon Halo's 192GB is of marginal benefit: the bandwidth bottleneck is still the same. You can load a larger model, but it'll just run slower. Larger context is where the 8060S's lack of compute will matter. IMO, Gorgon Halo is AMD treading water until Medusa Halo arrives.

Comparatively speaking, AMD is still ahead of Intel, which will likely join the local AI APU fray with Nova Lake AX in '27, and Qualcomm will probably get involved too. Fun times ahead.

daviburg1979 said: The latest llm models just work and reliably so on the DGX, while amd's stack is unreliable.

ROCm 7.1/7.2 is now stable, and faster than Vulkan, but Vulkan is more robust. CUDA still wins on polish and ecosystem support, but AMD has come a long way. For the hobbyist/enthusiast, AMD is a viable option.

That said, AMD's $4K box isn't attractive against Nvidia's $4.7K DGX Spark. The target audience (devs/biz) isn't price sensitive at these price points, and will opt for Spark over Strix.
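The bandwidth argument in the comment above can be illustrated with a back-of-the-envelope bound. This is only a minimal sketch: it assumes every active weight is streamed from memory once per generated token and ignores KV-cache reads, compute time, and software overhead. The function name, the 4.5 bits-per-weight figure for a 4-bit quant, and the two model sizes are hypothetical choices for illustration, not measurements of either machine.

```python
def decode_tokens_per_sec_bound(bandwidth_gb_s, active_params_billions, bits_per_weight):
    """Rough upper bound on decode speed when memory bandwidth is the limit.

    Assumes (simplification) that each generated token reads every active
    weight from memory exactly once and nothing else.
    """
    if bandwidth_gb_s <= 0 or active_params_billions <= 0 or bits_per_weight <= 0:
        raise ValueError("all inputs must be positive")
    # Gigabytes of weights read per token: billions of params * bits / 8 bits per byte.
    gb_per_token = active_params_billions * bits_per_weight / 8
    return bandwidth_gb_s / gb_per_token


# Peak bandwidth of a 256-bit bus at 8533 MT/s (figures taken from the thread).
BANDWIDTH_GB_S = 256 / 8 * 8533 / 1000

# Hypothetical models at an assumed ~4.5 bits per weight.
dense_70b = decode_tokens_per_sec_bound(BANDWIDTH_GB_S, 70, 4.5)
moe_5b_active = decode_tokens_per_sec_bound(BANDWIDTH_GB_S, 5, 4.5)

print(f"peak bandwidth: {BANDWIDTH_GB_S:.1f} GB/s")
print(f"dense 70B bound: {dense_70b:.1f} tokens/s")
print(f"MoE, 5B active, bound: {moe_5b_active:.1f} tokens/s")
```

Under these assumptions a large dense model is capped at single-digit tokens per second, while a sparse MoE model with few active parameters has a much higher ceiling, which is consistent with the "very slow to usable" range described above. Real throughput would be lower than either bound.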

usertests
dva852 said: That'll change once LPDDR6 hits next year, with ~50% higher bandwidth. Once the local AI train gets moving, there'll be iterative increases down the line. We're still at the starting line.

I think it could be around ~100%, if you get a 384-bit bus instead of 256-bit (LPDDR6 uses 12-bit subchannels). A very conservative estimate would be (10667 MT/s / 8533 MT/s) * 1.5 = ~1.875x. The entry-level LPDDR6 speed is 10667 MT/s according to JEDEC.

dva852 said: Comparatively speaking, AMD is still ahead of Intel, which will likely join the local AI APU fray with Nova Lake AX in '27, and Qualcomm would also probably get involved. Fun times ahead.

Sweet.
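The arithmetic in the reply above can be checked with a short calculation. This is a sketch under a simplifying assumption: peak bandwidth is treated as bus width times transfer rate, with the whole bus counted as data. The helper name is hypothetical, and the bus widths and speeds are the ones quoted in the thread.

```python
def peak_bandwidth_gb_s(bus_width_bits, transfer_rate_mt_s):
    """Theoretical peak bandwidth in GB/s: bytes per transfer * transfers per second."""
    return bus_width_bits / 8 * transfer_rate_mt_s / 1000


# 256-bit LPDDR5X at 8533 MT/s versus a hypothetical 384-bit LPDDR6 bus
# at the entry-level 10667 MT/s speed mentioned in the comment.
lpddr5x = peak_bandwidth_gb_s(256, 8533)
lpddr6 = peak_bandwidth_gb_s(384, 10667)
ratio = lpddr6 / lpddr5x

print(f"LPDDR5X 256-bit: {lpddr5x:.1f} GB/s")
print(f"LPDDR6 384-bit:  {lpddr6:.1f} GB/s")
print(f"ratio: {ratio:.3f}x")
```

The ratio factors into the two terms from the comment: 1.5 from the wider bus and about 1.25 from the higher transfer rate.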
