Hacker News
ack_complete
on Aug 13, 2023
on:
Popcount CPU instruction (2019)
There actually is such an instruction: psadbw (_mm_sad_epu8). It isn't noticeably faster for the scenario you describe, since it only affects the outer loop, but it does avoid the need for the popcnt instruction, because psadbw requires only SSE2.
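For illustration, here is a minimal sketch of how psadbw can stand in for a horizontal byte sum, assuming an x86 target with SSE2. Computing the sum of absolute differences against an all-zero vector leaves the sum of each group of 8 bytes in the low 16 bits of each 64-bit lane. The helper names hsum_epu8 and popcnt_epu8 are hypothetical, and the shift-and-mask per-byte bit count is my guess at the kind of inner step being discussed, not necessarily the parent's exact code.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Sum the 16 unsigned bytes of v with psadbw (hypothetical helper). */
static uint32_t hsum_epu8(__m128i v) {
    /* |v[i] - 0| summed per 64-bit lane: two 16-bit partial sums. */
    __m128i sad = _mm_sad_epu8(v, _mm_setzero_si128());
    /* Add the low lane's sum to the high lane's sum (16-bit word 4). */
    return (uint32_t)_mm_cvtsi128_si32(sad) +
           (uint32_t)_mm_extract_epi16(sad, 4);
}

/* Per-byte bit counts using only SSE2 shifts and masks, no popcnt.
   Assumed inner step; each output byte holds a count in 0..8. */
static __m128i popcnt_epu8(__m128i v) {
    const __m128i m1 = _mm_set1_epi8(0x55);
    const __m128i m2 = _mm_set1_epi8(0x33);
    const __m128i m4 = _mm_set1_epi8(0x0f);
    /* 2-bit field sums; the mask discards bits shifted in from the
       neighbouring byte of the 16-bit lane. */
    v = _mm_sub_epi8(v, _mm_and_si128(_mm_srli_epi16(v, 1), m1));
    /* 4-bit field sums. */
    v = _mm_add_epi8(_mm_and_si128(v, m2),
                     _mm_and_si128(_mm_srli_epi16(v, 2), m2));
    /* 8-bit sums, kept in the low nibble of each byte. */
    v = _mm_and_si128(_mm_add_epi8(v, _mm_srli_epi16(v, 4)), m4);
    return v;
}
```

The bit count of a whole 16-byte vector is then hsum_epu8(popcnt_epu8(v)). Since each per-byte count is at most 8, the counts from several vectors can be added with _mm_add_epi8 before a single psadbw, which is why the reduction only shows up in the outer loop.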