Learn cryptographic engineering by example

tptacek · on May 30, 2018

This is more of a quick tour of the basic API for libsodium than a workshop for learning cryptographic engineering. Many of the exercises have no cryptographic component at all; the remainder basically exercise the most basic libsodium sign/verify/secretbox functionality. None of the exercises explain the rationale behind any of the libsodium constructions, and (because libsodium) nonce-based authenticated encryption is used without explaining any of the details of what nonces are and what the requirements are for generating and handling them. Finally, the service model for the crypto involved doesn't make a whole lot of sense; most of the exercises build a "tamper proof log" from a simple chained hash function, and ultimately encrypt that log using a key exposed to the server anyways.
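For readers who have not seen the construction, here is a minimal sketch of a log built from a chained hash. It is not taken from the tutorial; the names `GENESIS`, `entry_hash`, `build_log` and `verify_log` are made up for illustration, and SHA-256 is an assumed choice of hash.

```python
import hashlib

GENESIS = b"\x00" * 32  # hypothetical fixed starting value for the chain


def entry_hash(prev_hash: bytes, message: bytes) -> bytes:
    """Hash of one entry: commits to the previous hash and to the message."""
    return hashlib.sha256(prev_hash + message).digest()


def build_log(messages):
    """Return a list of (message, hash) pairs, each hash chained to the last."""
    log, prev = [], GENESIS
    for msg in messages:
        prev = entry_hash(prev, msg)
        log.append((msg, prev))
    return log


def verify_log(log) -> bool:
    """Recompute the chain from GENESIS and compare with every stored hash."""
    prev = GENESIS
    for msg, stored in log:
        prev = entry_hash(prev, msg)
        if prev != stored:
            return False
    return True


log = build_log([b"alice pays bob 10", b"bob pays carol 5"])

# Editing one entry in place breaks the chain...
tampered = [(b"alice pays bob 99", log[0][1])] + log[1:]

# ...but whoever can rewrite the whole log can simply recompute every hash.
rebuilt = build_log([b"alice pays bob 99", b"bob pays carol 5"])
```

Here `verify_log(log)` holds and `verify_log(tampered)` does not, but `verify_log(rebuilt)` holds as well. The chain on its own only detects edits made by someone who cannot recompute it, so what it buys you depends entirely on who holds the log and the keys.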

Respectfully, I don't think this is how you should learn cryptography (certainly: you shouldn't call this kind of work "cryptographic engineering").

I'm talking my own book here a little (but just a little, since it's not like I make a dime from this) when I say that the better way to learn and understand cryptography is stuff like the Matasano Cryptopals challenges:

   https://cryptopals.com/

These exercises will try to teach you crypto engineering by breaking cryptography, and without wasting much time structuring a trivial JSON interface. You'll understand what a nonce is by the end of set 3 because you'll have written exploits for otherwise sane cryptosystems that mishandle nonces. By the end of set 8 you'll have implemented invalid curve attacks and built and broken short-tag GCM AEAD encryption and, hopefully, be a little nauseous any time someone asks you to use crypto again, which is the way it should be.

Even after you've undergone our cryptographic Ludovico Process, you still won't be a "cryptographic engineer". I've been testing and building exploits for random cryptosystems for over a decade and I'm nowhere close. The simple, blunt reality of it is that if you're going to build anything close to new with cryptography, you really do have to understand the math, and anyone who claims you can get to "securing a banking interface" without a detour through abstract algebra is, I think, doing you a disservice.

Another good resource for this stuff is LVH's Crypto 101: https://www.crypto101.io/

arkadiyt · on May 30, 2018

For anyone interested in Cryptopals Set 8 (still not published on the cryptopals.com website), I've compiled them here:

https://gist.github.com/arkadiyt/5b33bed653ce1dc26e1df9c249d...

loup-vaillant · on May 30, 2018

> the better way to learn and understand cryptography is stuff like the Matasano Cryptopals challenges

Compared to OP's tutorial, this is certainly a better way. I very much doubt this is the best way however, unless of course you actually want to do cryptanalysis.

Breaking stuff for real takes time. Learning that stuff can be broken is much quicker. I don't need to forge messages to be afraid of ciphertext malleability. Once I understand how XOR works, of course I'll run away screaming into the night at the sight of unauthenticated encryption. That said, I reckon doing the challenges is very good for street cred.
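The malleability point above can be sketched in a few lines. This is an illustration only: random bytes stand in for a stream cipher's keystream, and the message format and offset are invented for the example.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


plaintext = b"pay bob 0010 USD"
keystream = os.urandom(len(plaintext))  # stand-in for a stream cipher's output
ciphertext = xor_bytes(plaintext, keystream)

# The attacker never sees the keystream. They only assume the message layout:
# the amount "0010" sits at byte offset 8 (hypothetical format).
offset, known, wanted = 8, b"0010", b"9999"
delta = (
    bytes(offset)
    + xor_bytes(known, wanted)
    + bytes(len(ciphertext) - offset - len(known))
)
forged = xor_bytes(ciphertext, delta)

# The legitimate receiver decrypts the forged ciphertext with the real keystream.
decrypted = xor_bytes(forged, keystream)
```

With no authentication tag to check, the receiver decrypts `forged` to `b"pay bob 9999 USD"` and has no way to tell it was modified in transit.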

Also, some things just have to be taught. Forward secrecy for instance, is either like "I don't have the key, can't break anything", or "duh, you leaked your long term key, of course I can read everything". Exploiting breaches can help someone plug the leaks, but they won't teach them to secure their users' messages after law enforcement went for their long term keys.

And dammit, I don't aspire to be a crypto engineer. I just want to build a secure system. That said:

> if you're going to build anything close to new with cryptography, you really do have to understand the math

Oh yes. That alone warrants my upvote.

tedunangst · on May 30, 2018

Breaking stuff is still a good way to viscerally understand that tiny mistakes are fatal. Crypto that looks good turns out to be useless. The best way to really internalize that lesson is to see the crypto die. (I don't think it's strictly necessary to work through every example; looking at the problem, then reading the solution is probably good too, but one should spend some time looking at the problem, enough to say "yeah, this looks good" before watching it burn.)

tptacek · on June 1, 2018

It's interesting to hear that. Out of curiosity: have you implemented an invalid curve attack, or an attack against a GCM nonce repeat? Did you feel like that was instructive?

loup-vaillant · on June 1, 2018

Of course not. While I do expect it would be somewhat instructive, I don't expect it would help me build a secure system. I'm generally curious, but I do have priorities. Cryptanalysis is not one of them.

Monocypher uses DJB's curves, which are naturally immune against pretty much anything (assuming constant time primitive operations). Invalid curves don't reveal anything (though one needs to check against non contributory behaviour in key exchange), and the whole thing is constant time whether the public key/point was on the curve or not.

Maybe I would take a look at invalid curve attacks if I ever try my hand at ECDSA or something, but (i) I don't plan to in the first place, and (ii) even if I did, it would be quicker to just learn about those attacks and how to avoid them.

Same about GCM nonce repeat. Useless. I have read that nonce repeat is catastrophic for GCM, and I trust that. Chacha20/Poly1305 is also vulnerable to nonce reuse (reveals the authentication key and the XOR of 2 plaintext messages), but I don't need to mount an attack against it to learn anything useful. Sure, this might give me further insight about how Wegman-Carter hashes work, but I'm not inventing anything here, so I don't need to understand that part in depth.
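The "XOR of 2 plaintext messages" part of the nonce reuse claim above can be sketched as follows. The keystream here is a toy built from SHA-256 in counter mode, an assumption made so the example runs with the standard library only. It is not ChaCha20, and the sketch does not cover recovery of the Poly1305 authentication key.

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a block counter (NOT ChaCha20)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "little")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """XOR the plaintext with the keystream derived from (key, nonce)."""
    ks = toy_keystream(key, nonce, len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, ks))


key = b"k" * 32
nonce = b"n" * 12  # reused for both messages: this is the mistake

p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"
c1 = toy_encrypt(key, nonce, p1)
c2 = toy_encrypt(key, nonce, p2)

# Without the key, XORing the two ciphertexts cancels the shared keystream.
leak = bytes(a ^ b for a, b in zip(c1, c2))
xor_of_plaintexts = bytes(a ^ b for a, b in zip(p1, p2))

# If the attacker knows (or guesses) p1, p2 follows directly.
recovered_p2 = bytes(x ^ a for x, a in zip(leak, p1))
```

`leak` equals `xor_of_plaintexts`, and `recovered_p2` equals `p2`, all without touching `key`.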

Even if I were inventing something, I believe being able to mount relevant attacks would still be mostly useless. I don't want to attack flawed systems, I want to build a flawless system. I would have to prove the system is flawless. Making sure the proof doesn't have an error is different from mounting an attack if there is one.

---

Some people may need to perform the attack to really believe in their core that it is possible after all. I don't. Seeing the math is enough to send shivers down my spine.

I see one thing for which I expect cryptanalysis is genuinely useful: inventing new primitives that we cannot prove secure. Symmetric ciphers and hashes, elliptic curves… Those require a deeper understanding, and I do expect one has to know how to break the bad stuff to come up with good stuff. There's just no way I'll try to elevate myself up to that level. I have no comparative advantage, and I'm not going to spend the 10 years required to have one.

pvg · on June 2, 2018

The cryptopals things aren't really about cryptanalysis. And it's, in hindsight, easy to convince yourself the problems of cryptography engineering obviously follow from the properties of the cryptography maths. But, historically and empirically, that's exactly how it hasn't worked out.

loup-vaillant · on June 2, 2018

Wait a minute, that depends on what level you're at. Sure, you have to devise attacks to come up with the relevant mathematical properties. For instance, to protect against snooping, you encrypt, and you formalise the chosen plaintext attack. To protect against forgery, you formalise the chosen ciphertext attack, and we protect against it with authentication. I expect the same happened with man in the middle, forward secrecy, and others.

Me, I don't try to push the state of the art. I just try to protect against known attacks, and I trust we won't come up with new attacks too quickly, the same way I trust we won't break existing crypto too quickly.

From there, I just have to make sure a number of mathematical properties are followed, and voilà I have a secure system according to current standards. It will be guaranteed to hold out as long as no one comes up with some new unforeseen attack. And even then, I suspect everything has been pretty much worked out. The primitives themselves, with few exceptions, are still not proven secure, but the constructions have sound security models.

Which is why now, we don't need to do stuff like the Cryptopals challenges to build secure systems. We just need to avoid the relevant pitfalls, which have already been figured out by smarter people.

---

Then there are side channels, but those are a whole 'nother can of worms (except maybe timings, which are pretty well understood by now).

pvg · on June 2, 2018

Maybe I'm not following something but it seems to me you're saying an understanding of the mathematical properties of the primitives invariably leads to their secure implementation, combination and deployment. On its face, this is demonstrably untrue.

> From there, I just have to make sure a number of mathematical properties are followed, and voilà I have a secure system

Or, et voilà, you have SSL.

loup-vaillant · on June 2, 2018

This was an oversimplification. First, take a look at my crypto library, Monocypher [1]. I went for the simplest things I could find. The AEAD construction is a simple XChacha20/Poly1305. I didn't invent a single thing. So on that front, we're safe.

Then all I have to do is weed out the bugs. It's not trivial, but it is pretty simple. Thanks to all primitives being constant time, the memory access patterns are identical for every input of the same length. Test all the lengths from zero to some threshold, and you pretty much test all the code paths. Then you sanitise the hell out of the code, because this is Unsafe™ C we're talking about, and then you have a secure crypto library. Extend that philosophy to the entire system, and you're good to go.
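The "test all the lengths from zero to some threshold" idea can be sketched like this. It is not Monocypher's actual test suite: `hash_in_chunks` is a hypothetical stand-in for the code under test, and Python's `hashlib.blake2b` plays the role of the trusted reference.

```python
import hashlib


def hash_in_chunks(data: bytes, chunk_size: int) -> bytes:
    """Hypothetical code under test: incremental hashing in fixed-size chunks."""
    h = hashlib.blake2b()
    for i in range(0, len(data), chunk_size):
        h.update(data[i:i + chunk_size])
    return h.digest()


def check_all_lengths(max_len: int = 300, chunk_size: int = 7) -> int:
    """Compare against the one-shot reference for every length in [0, max_len]."""
    checked = 0
    for n in range(max_len + 1):
        data = bytes(i % 251 for i in range(n))  # deterministic test input
        expected = hashlib.blake2b(data).digest()
        assert hash_in_chunks(data, chunk_size) == expected, f"mismatch at length {n}"
        checked += 1
    return checked


lengths_checked = check_all_lengths()
```

The threshold of 300 is an arbitrary choice that crosses several of BLAKE2b's 128-byte block boundaries. The argument in the comment is that, for constant-time code, the path taken depends only on the input length, so sweeping the lengths sweeps the paths.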

Now let's say I did invent something. I was close to such invention when I implemented XChacha20 from first principles (that is, from reading the XSalsa20 paper, and doing the same to Chacha20). Making sure the "invention" was sound just required I understood the relevant maths, which in this case wasn't complicated at all (it's basically "reveal only the bytes the attacker could have guessed anyway").

The same applies when one designs a protocol. One must be aware of the relevant properties the protocol must achieve, and understand the maths required to ensure those properties. Once that's figured out, it's just a matter of not screwing up the implementation.

Crypto is often underestimated, leading to horrible security issues. But I think it is also often overestimated, what with "don't touch it unless Bruce Schneier said you could" or something. Really, applying crypto is not hard. A couple weeks of full time learning is more than enough.

[1]: https://monocypher.org

loup-vaillant · on June 1, 2018

Hmm, I realise I got a little defensive there. My point remains, but still.

jonbtow · on May 30, 2018

Really enjoyed going through the cryptopals sets. Thanks for gifting this to the community!

alistproducer2 · on May 30, 2018

I would like more description in the main readme. Tell me a little about what kind of skills and knowledge will be gained by going through the exercises.

apo · on May 30, 2018

As I write, this article is above another on the front page titled "A security vulnerability in Git that can lead to arbitrary code execution".

Makes me wonder if the "example" is getting pwned by a Git repo and the lesson I'll learn is to keep my system updated.

That said, browsing the repo online, it looks like a pretty useful tutorial.

IloveHN84 · on May 30, 2018

It would also be good to have examples of crypto over SOAP, a protocol which is vastly ignored in the wild (REST is preferred for its simplicity) but massively used in enterprise applications.

cweagans · on May 30, 2018

In my experience, SOAP is dying off pretty much everywhere. It's ignored for the same reason that we don't put steam engines in cars: it's unwieldy, slow, and wasteful.