
As cool as the Intel 8086 ISA accelerator card project is, it won’t work its magic on the classic old apps you might already have in your library – unless you wrote them or have access to the source code.
Brad explains to other X users that “for anything that I build, I can use my own subroutine that uses this hardware multiplier instead of the internal x86 MUL instruction.” But when it comes to pre-compiled apps, they aren’t aware of, and will not make use of, the ISA accelerator card.
Mark Tyson is a news editor at Tom's Hardware. He enjoys covering the full breadth of PC tech; from business and semiconductor design to products approaching the edge of reason.
bit_user
The article said: when it comes to pre-compiled apps, they aren’t aware of, and will not make use of, the ISA accelerator card.
Now, make one that fits in an 8087 socket and hack the 8086 microcode to forward imul instructions to it. Then, I'll be impressed.
Ameisenn
bit_user said: Now, make one that fits in an 8087 socket and hack the 8086 microcode to forward imul instructions to it. Then, I'll be impressed.
I don't believe that's possible. The μcode doesn't have an instruction for forwarding to the coprocessor. Rather, the instruction decoder, IIRC, determines whether an instruction should be handled by it by whether its first byte has high bits matching 11011…
Technically, IIRC, both the CPU and FPU decode all instructions – the FPU reads the data bus. The CPU ignores (mostly) ones starting with that pattern, the FPU ignores ones not matching it. So, handling it this way wouldn't be possible – there's no mechanism to "forward" instructions. That's not how the 8086/8087 interact.
Even if you could do this, as said the FPU cannot operate on general purpose registers directly – it doesn't directly interact with the CPU like that. You'd need to call FISTP or such afterwards – you'd be better off doing this by adding new FPU instructions. That should work seamlessly, though it would still require those instructions to be used.
However… x87 already has FIMUL.
You could replace IMUL's μcode with an improved version, provided you can fit it into 16 instructions (jumps included)… except that I don't think that the 8086's μcode is mutable.
The better question is: what's the advantage of this board over using an 8087 and FIMUL? I suppose that this can run in parallel so you can use 8087 instructions at the same time, but that only seems marginally useful.
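The first-byte test described in the comment above can be sketched in a few lines. This is only an illustration, assuming the standard x86 coprocessor escape (ESC) range of opcodes 0xD8 through 0xDF, whose top five bits are 11011. The name `is_esc_opcode` is hypothetical, and the sketch ignores prefixes, WAIT handling, and everything else the real decoders do.

```python
ESC_MASK = 0b11111000     # keep only the top five bits of the opcode byte
ESC_PATTERN = 0b11011000  # 11011xxx, i.e. opcode bytes 0xD8 through 0xDF


def is_esc_opcode(first_byte: int) -> bool:
    """Return True if the opcode byte falls in the coprocessor escape range."""
    return (first_byte & ESC_MASK) == ESC_PATTERN


if __name__ == "__main__":
    # 0xF7 is the opcode byte shared by the 16-bit MUL/IMUL group,
    # so it does not match the escape pattern.
    for opcode in (0xD8, 0xDF, 0xF7):
        print(f"{opcode:#04x}: {is_esc_opcode(opcode)}")
```

Because the MUL/IMUL opcode bytes fall outside 11011xxx, a coprocessor watching the bus this way would never claim them, which is consistent with the point that there is nothing to "forward".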
bit_user
Ameisenn said: I don't believe that's possible. The μcode doesn't have an instruction for forwarding to the coprocessor. Rather, the instruction decoder, IIRC, determines whether an instruction should be handled by it by whether its first byte has high bits matching 11011…
Yeah, my comment was only half-serious. I'm not surprised to hear it couldn't work. Disappointed, but not surprised. :)
Ameisenn said: Technically, IIRC, both the CPU and FPU decode all instructions – the FPU reads the data bus. The CPU ignores (mostly) ones starting with that pattern, the FPU ignores ones not matching it. So, handling it this way wouldn't be possible – there's no mechanism to "forward" instructions. That's not how the 8086/8087 interact.
Thanks for the insight! I never actually knew how they interacted, but that makes total sense!
Ameisenn said: Even if you could do this, as said the FPU cannot operate on general purpose registers directly – it doesn't directly interact with the CPU like that.
Yes, that makes sense. I did a bit of MMX programming and I remember it was annoying to move data between the 32-bit x86 GPRs and the x87 registers. Now, this gives me a greater sense of why they weren't more closely integrated.
Ameisenn said: The better question is: what's the advantage of this board over using an 8087 and FIMUL? I suppose that this can run in parallel so you can use 8087 instructions at the same time, but that only seems marginally useful.
The 8087 was no speed demon either, I think. According to this, a single-precision multiply took 95 cycles: https://datasheets.chipdb.org/Intel/x86/808x/datashts/8087/205835-007.pdf
Worse yet, addition & subtraction weren't much faster. According to this, imul took 128-154 cycles, but then register addition only took 3 cycles: https://www.oocities.org/mc_introtocomputers/Instruction_Timing.PDF
So, I think abusing the 8087 for running all your 16-bit integer arithmetic wouldn't be a net win.
Ameisenn
bit_user said: So, I think abusing the 8087 for running all your 16-bit integer arithmetic wouldn't be a net win.
Well, you could always make a better-yet-compatible 8087 with a faster FIMUL 🙂. You could call it the 8087+ or something. Really, this wouldn't be hard to do – you could probably even wire up a modern Cortex-M with an FPU to do it and you'd still destroy it performance-wise, though you gotta handle the oddities of x87 like 80-bit ops.
You could even add your own custom instruction like FIMULEX which takes two operands and so can bypass the stack operations altogether. Effectively, you'd end up making a new prefix byte to go after the first, specifying "fancy non-stack instructions" or such.
There's just no way to handle this well on the CPU-side alone unless, as said, you could write a better version of IMUL in <= 16 μcode instructions, and then find a way to update it on the chip.
That being said, the 8086 was designed to allow interfacing with multiple other coprocessors like this – either closely or loosely-coupled. They appear to be using a form of the loosely-coupled model here.
If you really want to tinker with arbitrarily adding coprocessor instructions, you should tinker with MIPS. MIPS coprocessors generally have instructions to transfer between coprocessor and general purpose registers.
bit_user
Ameisenn said: If you really want to tinker with arbitrarily adding coprocessor instructions, you should tinker with MIPS. MIPS coprocessors generally have instructions to transfer between coprocessor and general purpose registers.
Oh, I just muse about this stuff. I like to think about computer architectures, but I limit all my real tinkering to just software. Mostly CPU and a tiny bit of GPU. About as adventurous as I'd likely get would be trying to program some NPU cores, if I could find enough tools and docs to make the exercise worthwhile.
As for hardware, I like to do a little bit of testing & tuning. Like empirical approaches to CPU cooling or SSD configurations. Sometimes, I use data to make quantitative arguments and speculation, but that's somewhat labor-intensive for the payoff.
Anyway, the main thing I got from this article was just how expensive imul on the original 8086 was. Given that it had only like 29k transistors, it's not surprising they couldn't afford to build a hard-wired multiplier.
I once programmed an embedded RISC core that didn't even have a microcoded multiply instruction, since it didn't even have microcode. Some of the first things I wrote for it were assembly language routines for doing multiplication and division. It also didn't have a stack, so stack macros came even before those.
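Software multiply routines of the kind mentioned above are commonly built from shifts and adds. The following is a minimal sketch in Python rather than assembly, assuming unsigned 16-bit operands. The name `mul16_shift_add` is hypothetical, and this is not the commenter's actual routine.

```python
def mul16_shift_add(a: int, b: int) -> int:
    """Multiply two unsigned 16-bit values using only shifts and adds."""
    a &= 0xFFFF                  # clamp both operands to 16 bits
    b &= 0xFFFF
    product = 0
    for _ in range(16):          # one pass per bit of the multiplier
        if b & 1:                # low bit set: add the shifted multiplicand
            product += a
        a <<= 1                  # multiplicand moves up one bit position
        b >>= 1                  # expose the next multiplier bit
    return product & 0xFFFFFFFF  # 32-bit result, like DX:AX after a 16-bit MUL


if __name__ == "__main__":
    print(mul16_shift_add(300, 200))
```

The loop makes one pass per multiplier bit, which gives a rough intuition for why a multiply done bit by bit, without a dedicated multiplier circuit, costs far more cycles than a single addition.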
bit_user
JeffreyP55 said: I am not going through that crap ever again! My grandkids are retro everything.
Retro can be a cool way for kids to learn stuff. With hardware, you can actually do retro things on a breadboard, whereas modern chips need everything to be done on printed circuit boards.
Retro-computing also still plays a role in recovering data from old computer systems or figuring out how to interpret data that people have already recovered. Furthermore, it keeps knowledge around that might one day be useful in a post-apocalyptic scenario.
I'm not personally engaged in it, but I can definitely see value in the endeavor.
JeffreyP55
bit_user said: Retro can be a cool way for kids to learn stuff. With hardware, you can actually do retro things on a breadboard, whereas modern chips need everything to be done on printed circuit boards. Retro-computing also still plays a role in recovering data from old computer systems or figuring out how to interpret data that people have already recovered. Furthermore, it keeps knowledge around that might one day be useful in a post-apocalyptic scenario. I'm not personally engaged in it, but I can definitely see value in the endeavor.
An Atari 2600+ won't be much help. Hopefully the grandkids will show some interest in more technical fields, like their granddad. I might know a little something. 🙂 I would be like a fish out of water in today's job market.