Intel and AMD’s new ACE CPU extensions bring an efficient AI-oriented instruction set to x86 — a new design makes matrix multiplication more power- and density-

Bruno Ferreira is a contributing writer for Tom's Hardware. He has decades of experience with PC hardware and assorted sundries, alongside a career as a developer. He's obsessed with detail and has a tendency to ramble on the topics he loves. When not doing that, he's usually playing games, or at live music shows and festivals.

usertests
What is the point? Does it make sense to do this on the CPU from a TOPS/mm^2 or TOPS/Watt perspective instead of on a GPU or NPU? Is it meant to squeeze out a little more performance by enlisting any unused CPU cores? Or is it complementary in another way that isn't obvious? I guess if they pull it off, they can kill off the dedicated NPUs and add more regular CPU cores instead. If that isn't power efficient enough, maybe they can create ACE-optimized cores that are at least vendor-agnostic.

Tech0000
How do the ACE extensions relate to the already established AMX extension standard? I see a lot of overlap. Why create a new standard instead of building on and extending AMX and adding more capabilities to it? Is ACE just a way to include AMX capabilities in the AVX 10.x standard road map? Also, the article fails to mention that the very important FP4 and FP6 formats are also included in the ACE extension, which makes this even more useful and more complete.

usertests
Tech0000 said: "Also, the article fails to mention that the very important FP4 and FP6 formats are also included in the ACE extension, which makes this even more useful and more complete."
I was wondering about FP4 when I read the list.

JRStern
Tech0000 said: "How do the ACE extensions relate to the already established AMX extension standard? I see a lot of overlap. Why create a new standard instead of building on and extending AMX and adding more capabilities to it? Is ACE just a way to include AMX capabilities in the AVX 10.x standard road map? Also, the article fails to mention that the very important FP4 and FP6 formats are also included in the ACE extension, which makes this even more useful and more complete."
FP6? OK then! Still waiting on FP3 and FP2, not so sure about FP1, lol.

sygreenblum
usertests said: "What is the point? Does it make sense to do this on the CPU from a TOPS/mm^2 or TOPS/Watt perspective instead of on a GPU or NPU? Is it meant to squeeze out a little more performance by enlisting any unused CPU cores? Or is it complementary in another way that isn't obvious? I guess if they pull it off, they can kill off the dedicated NPUs and add more regular CPU cores instead. If that isn't power efficient enough, maybe they can create ACE-optimized cores that are at least vendor-agnostic."
I guess it depends on how it's utilized. High-end tasks will likely not be the main usage for this tech, but maintaining and running low-intensity applications would benefit from it. I run a local AI for real-time voicing on a language learning app I created (to avoid crazy high ElevenLabs or Narakeet fees). It's not super intensive, so running off the CPU and using local memory would likely reduce power consumption by a not insignificant amount compared to using the GPU.

usertests
sygreenblum said: "I run a local AI for real-time voicing on a language learning app I created (to avoid crazy high ElevenLabs or Narakeet fees). It's not super intensive, so running off the CPU and using local memory would likely reduce power consumption by a not insignificant amount compared to using the GPU."
If we take the GPU off the table, then we're left with CPU vs. NPU, with >50 TOPS (INT8) NPUs expected to be in most of the x86 mobile chips, as well as Zen 6 and Nova Lake desktop CPUs. I think ACE extensions will land in Zen 7 cores, but I'm not sure. If the NPUs aren't more area/power efficient than using slightly enlarged CPU cores supporting ACE, and using the spare area for more CPU cores, then they may disappear after a couple of generations.

thestryker
Tech0000 said: "How do the ACE extensions relate to the already established AMX extension standard? I see a lot of overlap. Why create a new standard instead of building on and extending AMX and adding more capabilities to it? Is ACE just a way to include AMX capabilities in the AVX 10.x standard road map? Also, the article fails to mention that the very important FP4 and FP6 formats are also included in the ACE extension, which makes this even more useful and more complete."
I believe the intent is for ACE to supplant AMX, as ACE was jointly developed by AMD and Intel. It sounds like AMD may have done most of the groundwork and then Intel stepped in for implementation.
