Low Level Bit Hacks You Absolutely Must Know (catonmat.net)
342 points by wheresvic1 on May 10, 2018 | 171 comments



I was asked #1 "Check if the integer is even or odd" in an interview once. Not worded quite that way but that was the answer they were looking for.

12 years later not a single person I have hired could correctly answer that question[1] and most of them (almost all) have been great productive software developers. So "absolutely must know" is entirely dependent on the type of software you write.

[1] I don't ask it in the interview, I ask it after they are hired and have been with us a few weeks just for fun.


What did they usually say? I know the bitwise trick, but I would say "x % 2 == 0" if asked because it makes a lot more sense and will (almost) always compile to the exact same code.
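For what it's worth, a minimal sketch of the two forms (function names made up here; for unsigned, or non-negative signed, values an optimizing compiler will typically emit the same and-with-1 code for either):

    /* both report whether x is even */
    int is_even_mod(unsigned x) { return x % 2 == 0; }
    int is_even_and(unsigned x) { return (x & 1) == 0; }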


Exactly, readability by other engineers + compiler optimizations is usually better than some bithacking that may or may not help in terms of speed and definitely doesn't help with understanding. (In this case maybe simple enough, but still).


I am not trying to start an argument, but it is a bit sad that these things are called "hacks", in the article and by others.

Some would call it math.

But I guess that shows you the state of our art.


We live in an era where you can get a billion bytes for a dollar. We also live in an era where finding someone who can manipulate bits to improve performance is either 1) a micro-optimization, 2) extremely fringe work, or 3) prohibitively expensive to be worth it.


Don't get me wrong. I find most of these "bit hacks" to be quite simple individually. But they add a layer of necessary comprehension.

In most cases you won't see these alone. They will be surrounded by tens of thousands of other lines of code. And where there is one "hack" there are usually others. And at some point you end up with a bug somewhere because one of these smart little hacks is failing. You don't know which one.

This does create unnecessary problems for maintenance in the future. So use it, but use it only when absolutely necessary. Not because you think the code will run faster.

It's also important to note that many things people think make the code faster, don't. Like vector vs. linked list access. Intuitively the linked list seems much simpler for inserting an element. But in almost all cases the vector is faster, because of cache locality.

So what I'm saying is: do things that make the code probably faster, like using a vector, if they don't cloud understanding. If they do, leave them out until you have a reason to use them. The only reason to use micro-optimizations like these is if you measured execution time and know there is a performance bug in this place in the code which your "hack" could resolve.

No measurement -> no optimization. Because the performance problems are almost always in different places than the average developer assumes.

And then add a comment explaining what you did to a 5 year old.


> I know the bitwise trick

Only if you make certain assumptions about how the integer is represented, I reckon.


Indeed, a ones-complement negative number would not have this bit property.
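A small sketch of that pitfall. We can't actually run on one's complement hardware, so this just simulates the bit pattern of -2 with ~2:

    #include <stdio.h>

    int main(void)
    {
        /* On a one's complement machine -2 would be stored as ~2,
           i.e. ...11111101, whose low bit is 1, so (x & 1) would
           wrongly report this even number as odd. */
        unsigned ones_complement_minus_2 = ~2u;
        printf("low bit: %u\n", ones_complement_minus_2 & 1);  /* prints 1 */
        return 0;
    }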


In fact, newer gcc versions are unable to optimize (x & 1) == 0 at level -O1, and require -O2. I assume that the code emitted at -O2 and -O3 for both constructs is more efficient than the literal translation, even though it is the same number of instructions.


Apparently not true, at least in a simple function.

https://godbolt.org/g/hfpHz7

  int isEven(int number)
  {
      return number & 1;
  }
gcc 7.3, -O1, -O2 and -O3 (identical codegen for all optimization levels):

  isEven(int):
    mov eax, edi
    and eax, 1
    ret


that function checks if a number is odd, not even.

https://godbolt.org/g/7Khcg9


My bad. Didn't even check it at any point.

Anyway, you still prove my point. All three are pretty close.

The -O1 sete instruction is very slightly suboptimal, but even that costs just a fraction of a clock cycle.


I highly doubt it. That’s 1:1

  test eax, 1
  ; use result


gcc -O2, clang, and icc all produce the sequence:

    is_even(int):
      mov eax, edi
      not eax
      and eax, 1
      ret
I am far from an x86 expert, but I assume this version avoids contention on the flags register and increases pipelining.

edit: my mistake, icc and clang produce the equivalent sequence "not edi; and edi, 1; mov eax, edi; ret".


This is a question I ask people who have already been hired, out of curiosity, and I explicitly say not to use modulus. I ask a similar question in interviews but don't exclude modulus, and most answer using modulus.


Well, you need to use modulus if you want to be 100% portable.


always been fond of `even x = x == ((x >> 1) << 1)`


Come on, really? "Knowing the very basics of binary numbers" isn't some obscure super-technical skill when it comes to programming, it's fairly fundamental. If they don't know that, then how can they possibly know how to use bit-masks, for instance?

Like, I get it when people say "most of the stuff you learn getting a CS degree has very limited usefulness day to day". Sure, you're not going to be implementing red-black trees or skip lists in your day-to-day, but you can't possibly claim that having a basic idea of how the bitwise operators work (or the details on how numbers are represented in binary) is in the same category.


>how can they possibly know how to use bit-masks, for instance?

Given that most programming jobs are web development jobs (I think, I don't have numbers to back that up), what makes using bit-masks important? That is, why would someone working on a web application need to know how to use them?


So that they understand how a network mask works? I'm assuming if you're working on the web you ought to know something about networks, right?


Why would a regular web developer need to know anything about network masks? Those come up in administration, but generally not in development.


I might be asking a bit much here, but I generally expect someone who's developing something that's going to run on the Internet to have some understanding of how the Internet works. Like... you don't need to understand the intricacies of BGP, but some idea of how a request from your web browser leaves your home network and comes back is strongly appreciated.

My biggest reasoning for desiring that skill is debugging. If you build a system and it doesn't work right for whatever reason, it's good to understand why that might be. Also, having some appreciation for that stuff really helps you make good decisions (but why can't I make 200 HTTP requests while loading my page?!)


Most programmers nowadays just write frontend javascript and have no idea how anything actually works.


There are relatively few programming jobs that actually require bit manipulation at that level. I've been deep into bit manipulation twice in my career (well more than 10 years now): network programming for a real time system and game development.

For the machine learning and infrastructure work I've done the last 5 years or so I haven't had to do even a single bitmask, shift, etc. directly. Of course we use these things liberally (e.g. hash functions) but it's hardly necessary to write the code itself very often, even in very sophisticated applications.


Huh - I do ML and a lot of bit twiddling with trained feature detector outputs - intake goes into a stage, feature detectors light up, and feature combos get fed to subsequent input via lots of bit banging - can't afford for it to be slow in the middle of that pipeline!


That happens. Like with bit twiddling in general computing it's not necessary in many applications.


Programming used to be mostly bit twiddling. The universe has expanded considerably, but if you're still in that original corner it's really hard not to see it as a basic skill.

Javascript only has doubles so you can't even bit twiddle if you wanted to - C programmers can't even imagine.


Javascript stores integers in two's complement.


> JavaScript numbers are always stored as double precision floating point numbers, following the international IEEE 754 standard.

It doesn't have integers to store.

https://www.w3schools.com/js/js_numbers.asp


That quote describes a good mental model for someone learning JavaScript, but not necessarily the physical truth.

JavaScript's numbers can do everything doubles can; but if the compiler can prove that all values that a particular variable could possibly take on will fit in a smaller type, then it can use that. So a variable really can end up stored as an integer--or never stored anywhere for that matter, like if a loop gets unrolled and the counter is optimized out.

The bitwise operators behave as if integers are 32-bit two's complement. The typed arrays are also two's complement. A smart developer can write code that a smart compiler will compile to bitwise operations on actual physical 32-bit integers.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

https://v8project.blogspot.com/2017/09/elements-kinds-in-v8....


That's an extremely dangerous assumption because of the implicit rounding and precision loss inherent in the underlying double type. An interpreter "may" use an integer under the hood but it still has to behave as if it were an IEEE double, that includes all the nasty things you don't want in integer math.

Most C programmers would be too scared to perform bitwise operations on doubles and JS doesn't add anything that makes it safer.


Can you give an example of math that would be exact with 32-bit integers, but is inexact in JavaScript? Floating-point math on values that happen to be integers is exact, if an exact answer exists and fits within the mantissa.


I think we're approaching this from philosophically different directions. On your end, if the programmer controls everything correctly then it's safe, because doubles have a 53-bit significand.

The problem happens when you treat numbers one way and they are treated differently elsewhere. For example, -0 when undergoing a bitwise +/- check (val & (1U<<31)) will appear positive when it's negative by definition.

The cast to an integer prior to the bitwise operator loses information. You can use the operators safely if you know there is no information to be lost, but it is not type checked like with C. I will say at least JavaScript guarantees two's complement. You never know when your favourite compiler will decide to break your code in the name of a faster benchmark - correctness of existing programs be damned. Implementation defined used to mean do something sane, but I digress.


The trick is to let your "double" only take on values that are also valid 32-bit integers, like with lots of gratuitous (x|0) and stuff. That excludes your (indeed problematic) -0. JavaScript's semantics give you enough to implement 32-bit integer math this way pretty easily. The speed may vary dramatically with small changes to the compiler, but correctness won't.


I still stick by my original point that:

((int)dblval) & flag

would require serious justification in a C code review. In JS it's all you have. Under those restrictions I'd use it as rarely as possible. But yes, you can use it correctly if you try.


That's certainly true. The only reason to play these games is that JavaScript gives you no better option.


In my career I have never had to know any binary number. I am on the application side of things though and try to avoid working with infrastructure at all, so I don't know about them.


In the computer games industry, or at least at the companies I have worked at in the UK, these would be part of a test given either remotely or immediately on arrival, and failing to answer correctly would make it unlikely you'd proceed any further.

In the c/c++ game code bases I have worked on, it is very common to have statuses stored as a uint with each bit representing something and very common to set, unset, toggle, reverse bits etc.

I actually read the first part of the article thinking these aren't hacks, these are just things people use day in, day out. But the later section about setting the last 0 or 1 bit etc. suddenly became things I wouldn't typically see in the wild.
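A minimal C sketch of that pattern, for reference; the flag names here are made up for illustration:

    #include <stdint.h>

    enum {
        STATUS_VISIBLE = 1u << 0,
        STATUS_ON_FIRE = 1u << 1,
        STATUS_STUNNED = 1u << 2,
    };

    /* set, unset, toggle and test a flag in a status word */
    static inline void set_flag(uint32_t *s, uint32_t f)    { *s |= f;  }
    static inline void clear_flag(uint32_t *s, uint32_t f)  { *s &= ~f; }
    static inline void toggle_flag(uint32_t *s, uint32_t f) { *s ^= f;  }
    static inline int  has_flag(uint32_t s, uint32_t f)     { return (s & f) != 0; }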


At least in C++, wouldn't you just write these "hacks" only once, in inline methods of a BitField class? It's much more likely to mistype

  state = state & ~BIT7;
than to mistype

  state = state.without(BIT7);


As daemin says, no, I am used to working with people that find basic bit field manipulation easy to work with[1], so there isn't any confusion.

Plus there are performance concerns. I remember one graduate who introduced a C++ bit field class, tried to replace some core functionality with it, and killed performance. Developers prefer to debug non-optimized builds, and non-optimized builds can end up running slow when you replace simple one-liners with classes and inline methods that don't get optimized in debug builds.

[1] I hope this doesn't sound aggressive as it really isn't intended to be. It is just that everyone is so used to seeing bitwise operations that it is as straightforward as reading "set the third bit" or "toggle the last bit" when we see the C code.


Would you create a wrapper class and member function to make addition less likely to mistype?

num = num.plus(x);

might be less likely to mistype than

num = num + x;

but nobody would ever do that. Once you've done bitwise operators a few times they become just as comfortable and natural as addition or subtraction.


Nope, people actually just write the bit setting code. It's simpler and also I haven't found a good generic bit flags class that works as you intend.


What about C++ bitfields? Not a library but a language extension. A lot of code I used has migrated to that.


Normal C++ bitfields? Sometimes you get that inside classes in the form of "bool x : 1; bool y : 1;" etc. But that's much harder to use as far as function parameters go, so old school bitmasks are used for that.


I'd only do that if I could force every compiler to _always_ inline those methods. The cost of a method call for a simple bit operation is orders of magnitude greater than just doing the operation itself (tens to hundreds of ops versus 1-5).


Only a month or two ago I had to debug a data roundtripping issue with our Hadoop cluster. Decimals being sent in came back as apparently completely random numbers.

More experimentation revealed it was only negative numbers that were being messed up (the original tests passed with positive numbers).

Further bitwise investigation and manual twos-complement calculation, and I discovered the problem to be a mismatch in fixed point precision, where negative numbers weren't being sign extended correctly.

It's not every day you rely on your binary calculation skills, but it does come up.


How do you phrase the question? Because surely people you hired should have been able to answer it using modulus?


See my note. I don't ask the question in interviews. I think it is useless, they don't need it for their job. I do ask a variant of fizz buzz that requires using modulus. They don't get hired if they can't answer that one. So this subset of people I'm talking about are ones who do know modulus.


I ask it sometimes too - my favorite response opened with 'you want maintainability or speed?' So I picked speed. They immediately set the first branch at mod 3 with the obvious justification - hired on the spot :-)


How often are bloggers really aware of what we MUST know or do?


This was literally part of my first programming assignment in college. Can people really not do this? Isn't fizzbuzz considered too obvious these days? I feel like I must be misunderstanding...


You miss the point. I learned all these tricks in school as well. I loved them. I felt "smart". After working for so many years, I only remember the basic ones. If you ask me for the "smart" tricks, I can't get them right out of my head. I can sit down with a piece of paper and may get them after some trial and error. Guess what, the important thing is I know these "smart" tricks exist. If I need them in real work, I can google them easily. Why do I need to remember them? In my real work, what I need to get right out of my mind is how our systems work, how MySQL/Postgres/Hadoop/HBase/Kafka/Spark/Linux/AWS/... work, etc.


After working for so many years, I only remember the basic ones.

Several of the examples in the article are the basic ones, though: test, set, clear or toggle a bit. You might not remember the trick for finding the least significant 1 bit, but it's not as if testing whether a number is odd or even is some sort of arcane knowledge. Anyone who has to resort to Google or Stack Overflow for something so simple is going to struggle in large parts of the programming industry.


Agreed. I had to do this first semester freshman year and again in my embedded systems courses. But a very large portion of web developers never went to college or only have a 2-year degree (and, sorry, but most 2-year "Computer Science" degrees are glorified, really expensive bootcamps).


They couldn't offer an answer at all, or they offered an answer that didn't involve bitwise AND?


They couldn't offer an answer using bitwise operators. They knew mod 2.

Most web developers (front and backend) will never use bitwise and many don't have college degrees so they didn't use it in college. It is actually quite understandable that they won't.

I've used bitwise here and there in backend code in my 20+ years of web dev but never in a case where it couldn't be rewritten as slightly less efficient but more readable code.


I would think a "if (x % 2 == 1) //odd" would be within the realm of most CS applicants.


If x is a signed type, and negative, are you sure the modulus is positive 1?

Since ISO C99, C's division operator has been required to truncate toward zero. The % operator harmonizes with that and so it yields negative remainders south of zero.

In C90, it was implementation-defined, IIRC.

You really just want to test for zero/nonzero, rather than x % 2 == 1.
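A quick sketch of the pitfall and the robust zero/nonzero test:

    #include <stdio.h>

    int main(void)
    {
        int x = -3;
        /* Since C99, division truncates toward zero, so -3 % 2 == -1
           and the test (x % 2 == 1) wrongly reports -3 as not odd. */
        printf("%d %% 2 == %d\n", x, x % 2);   /* prints -3 % 2 == -1 */
        printf("odd? %d\n", x % 2 != 0);       /* robust: test nonzero */
        return 0;
    }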


Good point. Interestingly enough, Python seems to handle this appropriately:

    >>> -4%2
    0
    >>> -8%2
    0
    >>> -9%2
    1


Yeah, Python is one of the few languages which actually defines sane behavior for integer division and modulus rather than leaving it up to the hardware. I wish more languages would follow its example!


Prolog is also like this:

    ?- X is -4 mod 2.
    X = 0.
    ?- X is -8 mod 2.
    X = 0. 
    ?- X is -9 mod 2.
    X = 1.
Using declarative integer arithmetic, we can even express very generally that a number X is a multiple of 2:

    ?- X #= 2*_.
This is true iff there is any integer (indicated by an anonymous variable _) that, multiplied by 2, yields X. It works for positive and negative integers.

For example:

    ?- -4 #= 2*_.
    true.
    ?- -9 #= 2*_.
    false.
A very nice reference on the general topic of bitwise operations is Donald Knuth's Bitwise Tricks and Techniques in Volume 4 of The Art of Computer Programming.


ISO C doesn't leave it up to the hardware. Integer division is required to truncate toward zero; the remainder harmonizes with that. (If someone designs a new HLL which leaves some arithmetic to the hardware that C doesn't, that is indeed silly.)

The sane language here is Common Lisp: it does it in all the ways rather than picking one approach with some hand-waving justification about why it's the good choice.

It has truncate, round, ceiling and floor functions that all return a quotient and remainder.

There are also mod and rem functions that calculate just the remainder from a floor and truncate, respectively.


Guido has written a blog post explaining "Why Python's Integer Division Floors" [1].

Python also uses floor division, and the corresponding "sign-of-modulo matches sign-of-divisor", for floating-point numbers. The blog post claims that this can be problematic in certain cases (compared to C's "sign-of-modulo matches sign-of-dividend"), but does not give any concrete examples.

[1] http://python-history.blogspot.com.au/2010/08/why-pythons-in...


In C# this would be wrong

     (-123 % 2 == 1).Dump();
Prints False.

% in C# is the remainder operator, not canonical modulus.


Javascript as well:

    >> -2 % 2
    -0
    >> -3 % 2
    -1
Edit: well, at least in Firefox JS. I have no idea if that's actually standard-mandated behaviour or if it's an implementation quirk. The -0 is especially weird (I had written odd but...)


This is pretty interesting. I don't use js but I took a quick look and apparently (x % 2) == 0 as a test for even integers should still work properly there, since a comparison of -0 and 0 will consider them equivalent.


Well that's not confusing at all. Why would C# change the definition of a common operator character?


I think you would be surprised if you started looking around to see how % is handled.


> I would think a "if (x % 2 == 1) //odd" would be within the realm of most CS applicants.

Yeah, if they can pass a fizzbuzz question they can probably get that. At which point it behooves the interviewer to explain why (x % 2) == 0 to find an even number is a better test...


I've been finding more & more in new hires that what they know coming in sometimes isn't as important as their willingness to continue learning. If they admit they don't know how to answer a question but are willing to give it a try, that's usually a good sign.


If what you say is factual I'm astonished at the level of incompetence.


Using the word incompetence is ridiculous. An incompetent person can't do their job and I can tell you all these people were very competent at their job. They just didn't ever have the need to learn binary math. Bitwise math is not needed for a huge portion of software.

That's like saying someone is incompetent because their entire career they have only spoken English and someone writes "et al." to them and they don't know what it means. Sure, English is based on Latin. Do most English speakers know enough Latin to know what "et al." means? Probably. Does that make the person incompetent at their job where they only need to speak English? Definitely not.

A large portion of web developers don't have any formal education and the bootcamps and websites don't teach this stuff. As someone who has a computer science degree it seems absurd, one might even say it sounds like incompetence, but it is actually completely understandable.

Most of them can come up with the modulus solution but the bitwise operator solution is less common.

To the other comment in reply to you that talks about Amazon... yeah, that is Amazon... I work for a company with 1/100,000,000 the budget of Amazon (wild probably inaccurate guess). We don't get the people Amazon attracts. We get the people who all those other companies turned down. But you know what? With extremely few exceptions all of them have been great software developers despite not knowing how to tell if a number is even or odd using bitwise.


I interviewed many tens of candidates at Amazon and I never met one who didn't know the very basics of binary numbering.


This is useful to know for counting the bits set in a word.


Which isn't super helpful, because there is a GCC __builtin_popcount (https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html), a POPCNT intrinsic if you're developing for SSE4, and existing implementations (https://en.wikipedia.org/wiki/Hamming_weight#Efficient_imple...) you can look up at any time if you don't want to reinvent the wheel.


That's cool. I probably shouldn't have said useful and instead said interesting. My interest in these kind of operations is to learn lower level theory, not to produce production grade bit counting/twiddling code. In that sense, the operation is interesting (to me).


The “clear the bottom bit” hack is pretty cool. It’s sometimes faster to roll a loop that uses it than to invoke the GCC popcnt intrinsic. If you pass a -march parameter that lets GCC emit the intended Intel assembly instruction, then the intrinsic seems to always win (as expected!)
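A sketch of that loop (Kernighan's population count), which does one iteration per set bit and so can do well on sparse words; the function name is made up:

    static int popcount_loop(unsigned x)
    {
        int count = 0;
        while (x) {
            x &= x - 1;   /* clear the lowest set bit */
            ++count;
        }
        return count;
    }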


The following is my go-to for bit hacks

https://graphics.stanford.edu/~seander/bithacks.html


Yes! This and only this! The posted "bit hacks" are little more than bit level manipulations exactly as intended. The Stanford hacks are truly excellent. My favourite is: https://graphics.stanford.edu/~seander/bithacks.html#RoundUp...


That's really clever, but extremely inefficient. Something like 1 << (sizeof x - __builtin_clz(x)) should be way faster.

Edit: you obviously have to multiply by 8 to convert bytes to bits
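A hedged sketch of that idea, assuming GCC/Clang builtins and a 32-bit unsigned int; __builtin_clz(0) is undefined, so 0 and 1 are special-cased, and the x - 1 keeps exact powers of two unchanged:

    #include <stdint.h>

    static uint32_t next_pow2(uint32_t x)
    {
        if (x <= 1)
            return 1;
        /* note: overflows (shift by 32) if x > 1u << 31 */
        return 1u << (32 - __builtin_clz(x - 1));
    }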


By the unwritten conventions of bit hacks folks, 'builtin_clz' is not a primitive (and elsewhere in that same document you will find a brace of different ways of doing clz/ctz). Many of these documents are old and the techniques in them older still (dating back to HAKMEM).

You are definitely correct on most modern architectures which have a single operation count leading/trailing zeros that are typically no more expensive than an integer multiply. This has been latency = 3, throughput = 1 for most recent Intel arch.

I think a redo assuming modern architectures is long overdue; i.e. let's work off the basis that count leading/trailing zeros, PDEP/PEXT, etc are fast and cheap and start from there.

Bonus round: if you had a lot of these operations to do, then you still might be able to go faster on x86 on SSE2 or AVX2 by using this trick, as the penalty to pass a SIMD register full of integers to a floating point op or vice versa is just one cycle. This trick becomes obsolete yet again with AVX512, which introduces parallel count leading zeros.


And neither of the bit twiddling tricks is useful for ARM NEON, as bit operations in vector form are very limited... (plus there are pipeline stalls)

Also, multiply-adds can be fused, so if you're doing that it can be faster to just multiply by a different number instead of bit twiddling.


I'm not sure this is true. Many of the bit twiddling hacks can be used in NEON and they have a few unusual instructions I'm dying to play with.

I'm not sure which bit hack you're talking about that's done with a multiply or multiply-add. There's a nice use involving De Bruijn sequences for doing lg2 of a single bit that's very instructive - is that what you meant?


Of course, casting that int * to a float * is undefined behavior…


Of course.

When you first learn about 'cool hacks' like these, you'll want to use them everywhere.

Then you'll realize that they don't apply to certain data types in certain environments.

Then you'll come to realize that the rules governing which hacks to use are too complex to remember.

Then you'll want to write a program that figures it out for you.

Then you'll realize that those programs already exist, and they are called compilers.

So now you are back where you started. You have made no progress, but you have gained an important insight.


And from that, at some point you realize that the corollary to your #2 says that such hacks do apply to certain data types in certain environments. And then you've gained a very powerful tool to be used in specific circumstances.

There's a reason a lot of these potentially non-portable tricks show up in high-performance computing (particularly in videogames) and in the embedded space.


That reason being that sometimes spending 10+ hours on a tweak that gives a +0.1% performance improvement is worth it.

Otherwise, remember the first rule of code optimization for beginners: "Don't."

EDIT: And by "they don't apply", I mean they can crash your program or even silently corrupt your data.


Sometimes tweaks like these might only save 10-20 cycles, which in a vacuum doesn't sound like much, until you consider that before the tweak it started at 26 cycles and is now 6 cycles, and it's called 50 million times every second, which is a savings of a billion cycles per second.

For games, this can mean a higher frame rate. For data processing, it can mean the data gets processed significantly faster. It's especially important for things like video encoders, where even a 1% savings in CPU time can translate directly to 1% more potential revenue for the same operating cost at a company that relies heavily on video encoding.

Yeah, saving those cycles doesn't really mean anything to a beginner, but they can be a huge savings for high performance computation applications.


Yep the correct way is:

    union { int i; float f; } u;
    u.i = *i;

Now u.f contains the same bit pattern as the integer pointed to by "i" but interpreted as a float.

Compilers know how to generate code efficiently out of this.

IIRC this is implementation-defined behavior, e.g. things like endianness, alignment or padding, but it's not undefined behavior, whose evil effects can bubble up in unexpected parts of the program that apparently don't even touch the bad cast.


No, that's also incorrect. The correct way is:

    float f;
    memcpy(&f, i, sizeof f);
Don't worry about the implied overhead of the call to memcpy; the compiler will replace it with an intrinsic and emit the expected movss instruction.


Accessing through a union is legal in C, no? I thought that this was only disallowed in C++.


Maybe? There is wording in the C standard (since C99 TC3 I think) that allows reading from non-active members of the union, but the wording is unclear and implementations have taken a very narrow interpretation of it (i.e. the access path needs to be explicitly through a union type and the actual object needs to be a union). See the GCC documentation on unions and aliasing for example.


It's explicitly legal in both C99 and C11, but C99 forgot to remove it from its master (non-normative) list of undefined behavior. It's up for debate whether or not it's legal in C++ (ISTR a C++0x draft that copied the C99 changes to unions, but then they added unrestricted unions and completely changed the verbiage to have almost no relation to the C99 text).


It mostly isn't legal, though gcc explicitly promises to support it and Microsoft's compiler doesn't even do the optimizations that would cause trouble.


Wow, never realized the pointer cast version is undefined behaviour in C. Is C++ equivalent via reinterpret_cast undefined behaviour too?

Back in my C++ days, I would much prefer pointer-cast to union-struct, because the latter was subject to the alignment/padding issues you mention, while for the former, I know of no desktop architecture that could mess up an int* -> float* cast.

I'm also doubly surprised because being able to do that stuff is what I'd consider the selling point of C/C++ over higher-level languages.


Pretty much all pointer casts where you would use reinterpret_cast in C++ are undefined behavior, in both C and C++. The underlying rule for strict aliasing is essentially the same in both languages (you can only access an lvalue via a pointer of its type, with possibly differing signedness, const, volatile, or via a char pointer, or via a field of a struct that has the same initial layout if you're in that same initial layout portion).

Yeah, C and C++ aren't the language most people think they are, and the strict aliasing rules are the most clear manifestation of that.


Trouble is, C was that language. Lots of people came to depend on this. The world needs a language with this ability.

The replacement is often "gcc with inline assembly" or "gcc with lots of __attribute__ extensions such as __may_alias__" or "gcc with optimizations disabled".

The need doesn't go away, despite the wishes of a language committee now dominated by compiler suppliers. From day 1, long before the language was standardized and even before the original (not ANSI) book was published, C was being used with aliasing. It was, after all, developed to support the creation of Unics (before that became UNIX).


reinterpret_cast in C++ has essentially the same rules, with exceptions made for const qualifiers and such, as well as void*, char*, and std::byte*. Use a memcpy instead.


Nope, it's undefined:

> Accessing an object by means of any other lvalue expression (other than unsigned char) is undefined behavior

https://wiki.sei.cmu.edu/confluence/display/c/EXP39-C.+Do+no...


I don't understand your comment. The link you pasted explicitly mentions the union example as a "compliant solution".

That said, memcpy with a compiler that understands the memcpy intrinsic is probably safer, though I have a vague memory of hitting an embedded compiler where memcpy wasn't an intrinsic.


This is technically undefined in C, but most compilers define it to work as expected as an extension to the language.


While that is a clever hack if you're looking for performance (which is where I've used these before), the float to int conversion normally hoses your pipeline (depending on platform obviously) and the solution below is much faster on any system I've worked with.


The first one (check if an integer is odd or even) is a hack, sure. But I can't think of anywhere you'd use it in the real world over (n % 2), which almost everyone will understand much more quickly.

The rest aren't hacks, they're bit manipulations, which are crucial for those who need them, and just about useless to those who don't.


Strangely enough, I often find my self impulsively writing (n & 1) to check for even/odd numbers. It just feels natural to me -- maybe because the first C programs I wrote were for µCs, where this kind of bit-twiddling tends to be used everywhere, and, correspondingly, people know what (n & 1) does. Sure enough, even on microcontrollers (n % 2) has no overhead compared to (n & 1) when optimisations are turned on.

It only ever caught my attention that writing (n & 1) is weird when some TAs looking at my code didn't understand what I was doing during my first college year.


It feels natural to me...but I also like writing emulators, so I've probably got more experience in common with µC devs than most other software developers do.


It feels natural to anyone who understands binary numbers. Every odd number in binary will have the last bit set to 1. Just like in the decimal system, every number that is a multiple of 10 will have 0 as its last digit. If you want to see if a number is a multiple of ten, I guess you could do a mod 10 of the number and see if the remainder is 0. But if you understand the decimal system, you would just naturally check if the last digit is 0.


Strangely, clang (but not gcc 7.2) with -O0 generates different code for the two.

https://godbolt.org/g/N8mTL6


why is this strange? -O0 is no optimization at all, so it isn't strange that % yields an integer division, right?

On -O3 they compile to the same code, both signed and unsigned.


For some reason gcc is still optimizing it though!


Try using unsigned int. I believe they are equal only for positive numbers.


Perhaps in some strange numbering system but not with two's complement... (to be clear, the return type is cast to bool).


With many compilers (e.g. clang, msvc) this will emit an actual division without optimization (and if the number is signed it's quite a lot of code). If such code is in an inner loop it can hit performance, which is already not great without the compiler's optimization. And if one has ever looked for a bug that takes more than a few seconds to reproduce, then the debug build's performance cannot be overvalued.

Though if this is a hack then checking the last digit to see if a decimal number is even is also a hack :)


Which brings us to: knowing when to use bit hacks is as important as knowing how to implement them. And you usually want to know about compilers for that.


The and operator is faster than modulo in most processors. Although, the compiler will likely optimize it away.


> But I can't think of anywhere you'd use it in the real world over (n % 2)

When your processor doesn't have built-in integer division and the corresponding compiler doesn't optimize that and you need it to work fast.


“Bithacks”, as they’re called, are great for high-performance, low-level development. For C++ developers of my variety, it’s an indispensable tool for writing fast code. Outside of that, it’s primarily a curiosity, but I still think it’s a useful exercise that improves understanding of how a machine and its binary logic work.


Well, you could use it while competing in the IOCCC:

https://www.ioccc.org/


Wait, what? Checking (un-)evenness with (n & 1) is considered obfuscation now? I know no C programmer that would even bat an eye at that one.


This stuff is now second nature to me after a couple decades hacking C, but boy if it was hard to find information on this stuff as a teenager in the early 90's.

It is easy to forget how much the internet, and the WWW in particular, has changed programming.


> It is easy to forget how much the internet, and the WWW in particular, has changed programming.

It really hasn't. It's just that web programming gets all the attention. Those who write hardware controllers or do embedded use bit manipulation every day.

But, really, if web devs aren't familiar with this, why not? Do they not have settings to toggle in their apps?


I'm sure many will disagree, but bitwise operators and operations should be common knowledge for programmers. Why? Because it is linked to understanding how data is stored and manipulated by computers. Computer scientists and Engineers generally understand these operators, hackers and "coders" do not. With this opinion, the title does not match the content.


Is this not already common knowledge? It's taught pretty extensively in one of the required lower div CS classes at my university, and it's usually in other schools' curriculums from what I've seen.


Somewhere around half of the candidates I interview have a tough time with hexadecimal, and definitely can't do bit manipulation without help.

It's kind of depressing, but I've adjusted how I ask my problem since it's not actually needed to demonstrate the skills I want to see. It doesn't take long to learn bit manipulation when it's needed, even though it seems like a foundation of computing to me.


There are a lot of computer programmers who never went to school.


So is calculus, yet apparently most CS students can't really integrate shit by the time they get to graduation...


Learned to integrate in high school, never used it again for years. Used it in like 1 out of 5 math classes at the start of university. Haven't needed it for a good... 13 years I guess?

Indistinguishable from "can't really integrate shit" without some hours of relearning.


Check out Hacker's Delight (2ed) by Henry S. Warren for more of this kind of stuff.


A lot more, if the Amazon preview is any guide. It covers this ground with just the first couple of pages of chapter 2.


It slightly disappointed me in that much of it assumes 'big' processors (e.g. fast hardware multiply).


What is this now, junior programming lessons?

You absolutely must know these because they're dead obvious and if you're just starting out with learning programming these are some of the first things you will/should learn, right after how computers store data.


While yes, these are pretty simple statements, most of them only being 2-3 operations, I think you might be interested to know that not every programming curriculum even covers "how computers store data". I know a fair few programmers who came from backgrounds that either never touched the difference between an integer and a float, or if they did, it was only to the point of "this one stores fractions, this one doesn't".

For example, no Python course for scientists is going to go in-depth about the difference between big and little endian - it just doesn't matter when you're trying to make a simple graphical model, it's far too low level. Binary operators work and are useful in JavaScript, but how many 6-week frontend bootcamps are going to cover them?

What I'm getting to here is that I don't think you can just dismiss things like this because your coding background covered them - there's some pretty clear gaps in a lot of peoples' knowledge and educating them is likely only a good thing.


> "this one stores fractions, this one doesn't"

This sounds like a fast trip to the land of direct comparisons of floating point numbers.

Further info: https://randomascii.wordpress.com/2012/02/25/comparing-float...


Isn't it sad that most of today's junior programming curricula favour copy+pasting tens of main()-method-only "applications" over explaining data types and some basic bit twiddling?


Here I thought this was going to be something like finding the next power of two[1].

Like a few other posters have mentioned, this is all basic bit manipulation stuff that you'd see in any perf sensitive/memory constrained codebase.

[1] https://graphics.stanford.edu/~seander/bithacks.html#RoundUp...


Also useful to generate subsets, since the number of possible subsets of a set of n elements is 2^n.

   for (int i = 0; i < (1 << n); i++) {
      ...
   }
(won't work for a large n)

Also... dividing by 2 is (n >> 1) and multiplying by 2 is (n << 1).
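A small sketch of using each i as a membership bitmask (the item list here is just for illustration):

    #include <stdio.h>

    int main(void)
    {
        const char *items[] = { "a", "b", "c" };
        int n = 3;
        for (int i = 0; i < (1 << n); i++) {     /* 2^n subsets */
            printf("{");
            for (int j = 0; j < n; j++)
                if (i & (1 << j))                /* is element j in subset i? */
                    printf(" %s", items[j]);
            printf(" }\n");
        }
        return 0;
    }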


looping in order of subset size can be done easily too: https://en.wikipedia.org/wiki/Combinatorial_number_system#Ap...


This assumes negatives are represented as two's-complement, but that's implementation-defined. Testing whether a negative number is even by ANDing it with 1 won't work in a system that uses one's-complement.


Which systems don't use two's complement?


Unisys apparently ships some emulators of an old system that still use one's complement... much as I hesitate to ever link to any ESR material, there's a good article here:

http://esr.ibiblio.org/?p=7413



A handful of ancient legacy systems. Very few people today need to worry about their webapp being ported over to UNIVAC.


If you like this, you should check out https://graphics.stanford.edu/~seander/bithacks.html as mentioned elsewhere and also the book Hacker's Delight.

I would love to see these updated for SIMD and modern architectures (which change the equation quite drastically for bit hacking with 1 uop PEXT/PDEP/LZCNT/TZCNT), but that makes for a pretty huge area.


See also: The Aggregate Magic Algorithms page

http://aggregate.org/MAGIC/

For example, Population Count instructions exist, but may actually be microcoded to count one bit per cycle.

Vectorizing your tallies can yield log time speedups:

Faster Population Counts Using AVX2 Instructions

https://arxiv.org/pdf/1611.07612.pdf


I'd recommend anyone who's into bit-level hacking to have a copy of Henry Warren's Hacker's Delight [1]

Many of those nifty tricks are found in production systems today.

[1] https://www.amazon.ca/Hackers-Delight-2nd-Henry-Warren/dp/03...


Amateur k user.

  /k3
  bh01:{*-1#(2 _vs x)&1}
  bh02:{(|(2 _vs x))y}
  bh03:{n:|(2 _vs x);n[y]+:1;:[x<0;-8#|n;|n]}
  bh04:{n:|(2 _vs x);n[y]-:1;:[x<0;-8#|n;|n]}
  bh05:{n:|x;:[n[y]=1;n[y]-:1;n[y]+:1];|n}
  bh06:{n:-1#& x;x[n]-:1;x}
  bh07:{n:-1#& x;m:& 8;m[n]:+1;m}
  bh08:{n:(*|& x)+1;m:n _!#x;x[m]:1;x}
  bh09:{n:|x;m:*&0 = n;n:& 8;n[m]:1;|n}
  bh10:{n:|x;m:*&0 = n;n[m]:1;|n}

  /examples:
  bh01 -98
  bh02[-133;5] 
  bh03[120;2] / 120 set 2nd bit
  bh03[-120;6] / -120 set 6th bit
  bh04[127;4] / 127 unset 4th bit
  bh05[0 1 1 1 0 1 0 1;5]
  bh06[0 1 0 1 0 1 1 1]
  bh07[1 0 1 1 1 1 0 0]
  bh08[0 1 0 1 0 0 0 0]
  bh09[1 0 1 1 1 1 0 0]
  bh10[1 0 1 1 1 1 0 0]
Appreciated the help last time e.g. use of dyad # (take), indexing into list via juxtaposition (bh02) and use of dyad ? (find).


bh01:{*|2 _vs x=1}


  > if (x & (1<<n)) {
  >   n-th bit is set
Yeah, don't get a habit to do that. Instead:

  if ((x >> n) & 1)
If x is a wider type than int and n is large enough, the result of (1<<n) is UB, and in practice the generated code will be buggy.
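A sketch of the failure mode and the safe forms, assuming a 32-bit int and a 64-bit x; the helper name is made up:

    #include <stdint.h>

    static int bit_is_set(uint64_t x, unsigned n)
    {
        /* (x & (1 << n)) would be UB for n >= 31 because 1 is an int;
           shifting the wide operand (or writing 1ULL << n) is safe. */
        return (x >> n) & 1;
    }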


- A little off-tangent maybe, but are there other professions where this kind of once important but today less important (as some claim) knowledge exists? Is this kind of "knowledge degradation" common in other industries?

- On a different note: how much memory bloat can be attributed to this low-level neglect? Which of course is partly caused by modern languages / programming paradigms abstracting away these low-level details for the sake of higher-level abstractions (which only seldom break, right ;)

- honorable exception: erlang bit-level pattern matching and all the power that comes with it. Btw. did any other language/lib even try to imitate that feature?


i dunno. only like 2 of these are “hacks”. the rest are sooo basic. and he doesn’t even have the classic swap 2 bytes/words (3 xors).
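For completeness, a sketch of that classic XOR swap (a plain temporary is usually at least as fast, and the trick breaks if both pointers refer to the same object unless you guard for it):

    void xor_swap(unsigned *a, unsigned *b)
    {
        if (a != b) {         /* XOR-swapping a value with itself zeroes it */
            *a ^= *b;
            *b ^= *a;
            *a ^= *b;
        }
    }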


Surprised I didn’t see this one on the lists.

Shifting left by n bits on a signed or unsigned binary number has the effect of multiplying it by 2^n. Shifting right by n bits on a two's complement signed binary number has the effect of dividing it by 2^n, but it always rounds down (towards negative infinity).
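A tiny sketch of that rounding difference (right-shifting a negative value is strictly implementation-defined in C, but arithmetic shift is the norm):

    #include <stdio.h>

    int main(void)
    {
        printf("%d\n", -7 / 2);    /* -3: division truncates toward zero  */
        printf("%d\n", -7 >> 1);   /* -4: shift rounds toward -infinity   */
        return 0;
    }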


Years ago I recall reading a SIGPLAN paper where someone was trying to create a programming language where they represented bitwise operations and bitfields without primitive types.

For the purposes of doing bit packing and other manipulations, they treated the data as an array of booleans.


Hardware languages such as Verilog work this way natively.


Maybe I'm just off in my own little world but this seems a bit table stakes. Am I out of touch?


No.


While I am with you and the GP, other responses in this thread make it sound like we are in fact out of touch. Or grumpy old greybeards now... Kids these days...


I suspect part of this is that hardly anybody that gets into programming professionally has any experience with digital logic, whereas the two used to go hand-in-hand.

But the level of this post is so low as to be next to useless, I clicked fully expecting to be amazed. The title could definitely do with an upgrade.


This is the most clickbait programming article title I've ever seen.


Agreed. I've flagged it.


As a Python developer who recently wrote a pure Python ECDSA/bitcoin library, these are very helpful. However performance is not always better. For example I had this function:

    def point_mul(p, d):
        n = p
        q = None
    
        for i in range(256):
            if d & (1 << i):
                if q is None:
                    q = n
                else:
                    q = point_add(q, n)
    
            n = point_add(n, n)
        return q
but when I changed the loop to this:

    for i in reversed(format(d, 'b')):
        if i == '1':
I found out it performed much better.


I 100% expect you're correct in your assertion, but here's the big question: do you know why the latter is faster than the former?


Python uses arbitrary-precision integer representation, so you don't work with bits directly.


>y = x & (x-1)

Couldn't you also do: y = x & (~1) ? I'd think that would be slightly more efficient.


These aren't equivalent, though. The example from the linked article turns off the least significant 1 bit in x. Your version turns off the least significant bit in x, regardless of whether it is 1 or 0. They'll give the same result if x ends in a 1 or if x=0, but not in general.


No, that unsets bit #0, not the lowest set bit in x.

Aka. that brings 01101₂→01100₂ correctly, but 01100₂→01100₂ instead of 01100₂→01000₂.


Ah, I completely misunderstood what the author was doing there.


I can't imagine any use for #6 on.


It's useful when you want to find the next free slot in some contiguous store. Normally you'd use ctz, but sometimes you only want 1<<ctz. An example is n-queens, eg.

https://codereview.stackexchange.com/questions/75517/n-queen...


How often is this doing work that the compiler is going to do for you anyway?


I'm not sure I understand the question. The first one about detecting if x is odd or even is usually what the compiler will optimize the more common (x%2) == 0 to.

Other than that, these are things that you have to do yourself. You can't call compiler.setMySeventhBit();

If you need to be able to flip and check bits, you have to do it yourself.


How are these hacks? This is standard bit masking/manipulation. Every CS student learns it and The C Programming Language has examples of it.


Nice, but hard to read, and the compiler does it anyway.



