
Here's the code in godbolt:

gcc: https://godbolt.org/z/6xG5dKjj9

clang: https://godbolt.org/z/Mh9zozjvK

I'm no asm expert, but the gcc output doesn't seem to contain many vector instructions, while the clang output has more instructions that use the 128-bit xmm registers (at least on x86_64). You can also see at a glance how many more instructions the gcc version emits.



Thank you! Indeed, GCC does not use SIMD here unless you set -O3 (which, as far as I remember, enables some vectorization) or allow it to use AVX with -mavx or -march=x86-64-v3. For some reason I'm unable to get it to use plain SSE (always available on x86-64) with any -mtune setting, or even with -march=x86-64-v2.
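For anyone who wants to try the flag combinations locally, here is a minimal sketch. It is not the code from the godbolt links above; `add_arrays` is a made-up function, chosen only because an element-wise loop over `restrict` pointers is a shape that compilers can usually auto-vectorize.

```c
#include <stddef.h>

/* Hypothetical test function (not the code from the linked godbolt pages).
   dst, a and b are declared restrict, so the compiler may assume they do
   not overlap, which removes one common obstacle to vectorization. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Compiling this with `gcc -S` at -O2 and -O3, and again with -mavx or -march=x86-64-v3, then searching the assembly for packed instructions such as addps (SSE, xmm registers) or vaddps (AVX) should show which settings trigger vectorization. What you actually see will likely depend on the GCC version, so treat any one result as a data point rather than a rule.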



