
Multipliers that can do a 64x64->128 bit multiplication every cycle at 5 GHz are large in area and power hungry. They are the most expensive traditional hardcoded integer unit (divide is even more expensive, but it is usually implemented as many operations that repeatedly use a divider unit and other units to do a few digits at a time: on essentially all current x86 chips large divides take 40-80 cycles!).
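
To make the 64x64->128 operation concrete, here is a rough sketch in C. It assumes a compiler with the `unsigned __int128` extension (GCC and Clang on 64-bit targets); the type and function names are made up for illustration. The second function builds the same result from four 32x32->64 partial products, which gives a feel for how much work the full-width hardware multiplier does in one go.

```c
#include <stdint.h>

/* Hypothetical helper type: a 128-bit product split into two 64-bit halves. */
typedef struct { uint64_t lo, hi; } u128_parts;

/* Full 64x64 -> 128-bit unsigned multiply using the compiler's 128-bit type.
   On x86-64 this typically compiles down to a single widening multiply. */
static u128_parts mul_64x64_128(uint64_t a, uint64_t b) {
    unsigned __int128 p = (unsigned __int128)a * b;
    u128_parts r = { (uint64_t)p, (uint64_t)(p >> 64) };
    return r;
}

/* Same result assembled from four 32x32 -> 64-bit partial products. */
static u128_parts mul_64x64_128_portable(uint64_t a, uint64_t b) {
    uint64_t al = (uint32_t)a, ah = a >> 32;
    uint64_t bl = (uint32_t)b, bh = b >> 32;

    uint64_t p0 = al * bl;  /* contributes to bits 0..63   */
    uint64_t p1 = al * bh;  /* contributes to bits 32..95  */
    uint64_t p2 = ah * bl;  /* contributes to bits 32..95  */
    uint64_t p3 = ah * bh;  /* contributes to bits 64..127 */

    /* Sum the three terms that land in bits 32..63; the sum cannot overflow
       64 bits because each term is below 2^32. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    u128_parts r;
    r.lo = (mid << 32) | (uint32_t)p0;
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}
```

For example, multiplying `UINT64_MAX` by itself gives a high half of `0xFFFFFFFFFFFFFFFE` and a low half of `1` with either function.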

The limitation isn't that bad in practice: it's hard to get more than 1 multiplication per cycle in non-trivial code since you have only 3 more ops per cycle to do all the other work - where are the multiplication inputs coming from, for example?

So more than one multiplier probably has a small payoff. Apple chips, for example, are wider, and IIRC they have two multipliers since they can take more advantage of them.

It's worth noting that 64-bit multipliers are expensive enough that even in AVX-512 Intel still offers zero 64-bit input multiplies in their SIMD unit (they offer a 52-bit one that reuses the FMA hardware though).
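
Assuming the narrower SIMD multiply mentioned above is the AVX-512 IFMA pair (VPMADD52LUQ / VPMADD52HUQ), this is roughly what one lane computes, written as a scalar model in C. The function names are made up, `unsigned __int128` is a GCC/Clang extension, and this is only a sketch of the semantics as I understand them from Intel's documentation, not the real intrinsics: each instruction takes the low 52 bits of two inputs, forms the 104-bit product, and adds either its low or high 52 bits to a 64-bit accumulator.

```c
#include <stdint.h>

/* Mask selecting the low 52 bits of a 64-bit lane. */
#define MASK52 ((UINT64_C(1) << 52) - 1)

/* Scalar model of one VPMADD52LUQ lane:
   acc + low 52 bits of (b[51:0] * c[51:0]). */
static uint64_t madd52lo_model(uint64_t acc, uint64_t b, uint64_t c) {
    unsigned __int128 p = (unsigned __int128)(b & MASK52) * (c & MASK52);
    return acc + ((uint64_t)p & MASK52);
}

/* Scalar model of one VPMADD52HUQ lane:
   acc + bits 103..52 of (b[51:0] * c[51:0]). */
static uint64_t madd52hi_model(uint64_t acc, uint64_t b, uint64_t c) {
    unsigned __int128 p = (unsigned __int128)(b & MASK52) * (c & MASK52);
    return acc + (uint64_t)(p >> 52);  /* product is at most 104 bits wide */
}
```

Note that bits 52..63 of the multiplicands are simply ignored, so this is not a drop-in replacement for a 64-bit multiply; wider products have to be built up from 52-bit limbs.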


