Can you point to a resource that explains what a carry propagation bug means? I ...

nickpsecurity · on Jan 14, 2016

When I see a question like this, I always type in obvious words like "carry propagation bug" into Google just to see if it's a lazy question or there's a failure by INFOSEC authors on a topic. I was shocked that all I got was garbage about all kinds of CVE's and such with no clear explanation of the problem. Not clear in terms of someone searching for an article rather than bug report. Only one I saw in a few pages of Google was:

http://blog.skylable.com/2014/05/tweetnacl-carrybit-bug/

StackOverflow had nothing useful with that phrase. Weird. I tried Modular and Montgomery Multiplication as they're important key words that should give you an idea about potential carry issues. I found these decent descriptions in my short Googling:

https://en.wikipedia.org/wiki/Kochanski_multiplication

http://www.hackersdelight.org/MontgomeryMultiplication.pdf

Best I can do with little time and the flood of irrelevant results I saw. Maybe need a thorough write-up on these sorts of things that shows up in Google. At least I serendipitously found a great paper on statically detecting flaws in error propagation:

http://pages.cs.wisc.edu/~liblit/dissertations/crubio.pdf

So, thanks for asking even if not an intended benefit. :)

pbsd · on Jan 14, 2016

Carry propagation bugs are more or less self-explanatory: the implementer failed to add every possible carry bit during a large integer addition or multiplication, and this turns into an incorrect result. This may only happen in rare edge cases, and therefore remain unnoticed for a long, long time. The sibling comment explains this better than I could.

A carry propagation bug is a special case of a fault attack. Fault attacks, in their most general form, are attacks where an adversary tries to induce an error during a cryptographic computation. The consequences of a fault during a computation are algorithm-specific; here are some examples:

- In RSA, a fault during a decryption or signature operation using a particular (CRT) implementation approach lets the attacker recover the private key [1]. You do this by obtaining an incorrect signature S for some message M you know about, and then recover one of the prime factors of the modulus with the easy computation gcd(S - M^e, N). This is often called the Bellcore attack.

- With elliptic curves, if you can convince a scalar multiplication to work on an incorrect point on the curve (either because the implementation doesn't check the point is valid, or via a hardware fault), you can likewise recover the secret key [2]. The process here is a little more complicated, but no less efficient.

- With AES, if you can trigger a fault in a state byte right before the eighth round, you can recover the full key very easily having both the valid ciphertext and the erroneous one [3]. This attack only makes sense for hardware implementations, since software implementations of symmetric ciphers do not generally leave much space for errors to occur.

In the Golang case, the carry bit failure leads to the Bellcore attack. All you have to do, in principle, is to harvest around 2^25 RSA signatures (and respctive messages) from a buggy target, compute a GCD, and you're done.

[1] https://doi.org/10.1007/3-540-69053-0_4

[2] https://doi.org/10.1007/3-540-44598-6_8

[3] https://eprint.iacr.org/2009/575

fryguy · on Jan 14, 2016

You can't store a 4096-bit number in 32-bit register, so if you are multiplying numbers together you need to do it a piece (limb) at a time. Given that the arithmetic is modular, you can get away with "leaving some room" and not using all 32-bits, but instead using more 32-bit registers and treating them as if they are say 26-bit registers that can overflow 6-bits. For instance, Curve25519 (255-bit modulus) can be stored with 8 32-bit limbs, or 10 26-bit limbs (among other combinations). So if you do repeated calculations, you don't need to carry all of the bits all of the time. I don't know if it's the case for this bug, but I know it's been a bug in the past that they waited too long to propagate the carry (the extra 6-bits) into the next limb and it's overflowed the 32-bit register. This results in an incorrect calculation.