The World’s Fastest DES Cracker

freefal · on Feb 2, 2021

Back in '07, my school still used a custom email protocol called Blitzmail that verified the user by sending a random number, having the user encrypt the random number via DES using his password as the key and then send the result back. One problem (there were many) was that passwords were 8 characters and DES only wanted a 56-bit key so the protocol just dropped the least significant bit of each character. So while everyone thought they had one password, they really had 2^8 = 256 passwords, where 'b' and 'c' were interchangeable as were 'd' and 'e' and so on...

Now that was easy to crack.

TwoBit · on Feb 2, 2021

Surely they meant to drop the high bit instead of the low bit... ?

droopyEyelids · on Feb 2, 2021

There must have been something like this going on with the BIOS password of my father's 386 when I was growing up.

He password protected it so we could only play for controlled periods of time, and chose "tricycle" as the password, but my sister and I guessed "bicycle" in the hours of typing every word we knew into the prompt. "bicycle" worked!

dogma1138 · on Feb 2, 2021

Dartmouth, I remember a talk about it not related to security it was one of the most late 80’s early 90’s “internet revolution” things I’ve seen in my life.

freefal · on Feb 2, 2021

Indeed. It was actually a great system that sat somewhere between email and IM in its usage on campus. The Windows and Mac clients had notifier processes that would signal the client to poll if the server alerted them to new messages. It made messages appear nearly instantly which they did not back in the POP/IMAP time-based polling days.

schoen · on Feb 2, 2021

I'm sad to think that it's probably not the world's fastest DES cracker. The owners of the world's fastest DES cracker have probably neglected to write a public web site about it. :-(

ai_ja_nai · on Feb 2, 2021

Sure, you can invest more money to bring that number down. But, historically, we have this now.

gnu8 · on Feb 2, 2021

They probably don't even use it anymore. It would be good of them to at least write something up, if not transfer it to a museum.

tasty_freeze · on Feb 2, 2021

> Today, with the advent of Field Programmable Gate Arrays (FPGAs)

Xilinx was founded in 1984, so the advent of FPGAs is at least that old.

The linked page estimates the price to crack a key via a CPU farm is $125K, and cracking one via a GPU farm is $20K. But they don't say what it costs via their system. The FAQ answers the question, "Why are you charging so much for this", but again, without referencing that price. I don't feel like submitting a credit card and generating a token to find out.

marcolussetti · on Feb 2, 2021

There is a price list on the website: https://crack.sh/get-cracking/

Toward the end of the page. Looks like $20 except for rush orders (for most types)

tasty_freeze · on Feb 2, 2021

Thanks! I visited the page but the table is off the bottom of my screen and I didn't think to scroll. I immediately saw the "submit a job" form and got put off.

All the same, it would be better to also state their range of prices on the page where they estimate CPU and GPU-based cracking costs.

wmf · on Feb 2, 2021

It's $20-$100.

fortran77 · on Feb 2, 2021

I'm surprised that from 1998 to 2020 they only got ~ 10x improvement in speed.

wmf · on Feb 2, 2021

You lose ~10x going from ASIC to FPGA but FPGAs probably got 100x better which nets only 10x overall improvement.

Palomides · on Feb 2, 2021

going from ~2000 ASICs to ~50 FPGAs and still being much faster is pretty good imo

tyingq · on Feb 2, 2021

The Deep Crack ASICs had 25 processing units per chip, so 44,544 "cores" total for the EFF DES cracker. The crack.sh FPGA setup has 1920 "cores" (48 boards, 40 "cores" per board).

So it's 10x faster with 23x fewer cores.

abiogenesis · on Feb 2, 2021

True, but the cost hasn't changed by the same magnitude:

> In 1998 the Electronic Frontier Foundation built the EFF DES Cracker. It cost around $250,000

> The system that you're renting time on is over $100k in specialized FPGA hardware

ARandomerDude · on Feb 2, 2021

$250K in 1998 is $400K today, if the inflation calculator I found on Google is correct. So there's a 4X decrease in inflation adjusted cost. Not an order of magnitude, but it's something.

hedgehog · on Feb 2, 2021

Those are also older FPGAs, price for the same performance today would be lower.

femto113 · on Feb 2, 2021

Especially when you consider a state of the art processor in 1998 was doing < 1,000 MIPS and state of the art processor now can do > 1,000,000.

stavros · on Feb 2, 2021

Per dollar? Or per watt?

abiogenesis · on Feb 2, 2021

It's only 25x per dollar (initial investment; not running costs):

> In 1998 the Electronic Frontier Foundation built the EFF DES Cracker. It cost around $250,000

> The system that you're renting time on is over $100k in specialized FPGA hardware

lmilcin · on Feb 2, 2021

Years ago I worked on my own HSM which used FPGA to implement DES. It had 3 copies of DES IP which allowed streaming 3DES. I could have put more copies of this on the FPGA but the single triplet was enough to stream data through it at astronomical speed. I think it was enough to saturate 1Gb/s and so I did not need more.

Compared to this, AES was much more difficult and it did not fit the FPGA I had and I eventually gave up.

xref · on Feb 2, 2021

If this can exhaust the DES keyspace in 26hrs, what’s the time to exhaust triple-DES?

dgacmu · on Feb 2, 2021

The triple-DES keyspace is 168 bits, vs 56 bits, so 2^112 times longer to do a complete scan of the keyspace. If you have a known plaintext and can do a meet-in-the-middle attack, it's approximately "only" 2^56 times longer. I'm not aware of words for the length of time this would take other than "past the heat death of the sun".

segfaultbuserr · on Feb 2, 2021

Yes. This is completely true if we assume there are no large quantum computers with practical error correction - we probably won't have one in the foreseeable future. But beware that a sufficiently large quantum computer will be able to run Glover's algorithm and reducing the brute-force attempts required from N to sqrt(N), making all 128-bit ciphers be 64-bit and turning 3DES back to DES. And I won't be surprised if it happens within my lifetime. But the solution for symmetric encryption is simple, just use 256-bit ciphers, fast and effective, and many are already in use today. Public-key encryption is the real trouble, many candidates, much higher cost, somewhat untested constructions.

kevin_thibedeau · on Feb 2, 2021

AES-256 has issues if you're planning for long term security:

https://soatok.blog/2020/05/13/why-aes-gcm-sucks/

tptacek · on Feb 2, 2021

I don't think the post-quantum block collision scenario is very plausible, and, at any rate, it's effectively an online attack (sweet32 follows the attack model Thai Duong and Juliano Rizzo came up with, where you're relying on malicious Javascript to trigger it). If we get to plausible quantum attacks of any sort, algorithms will get swapped out to regain security margin.

It's not like other quantum issues, where you're worried about transcripts collected today being broken 15 years from now.

gerdesj · on Feb 2, 2021

The author of that blog has this to say early on:

"This post is about why I dislike AES-GCM’s design, not “why AES-GCM is insecure and should be avoided”. AES-GCM is still miles above what most developers reach for when they want to encrypt (e.g. ECB mode or CBC mode)."

some_furry · on Feb 2, 2021

Hi, author here. This is precisely the takeaway of the post.

It's more of a collection of pet peeves about AES (in general) and AES-GCM (in particular).

gerdesj · on Feb 2, 2021

Thank you for your efforts and sharing.

I like to read opinion pieces that are pithy and well presented. I may or may not agree but I will think about the issue. In this case I know very little about the intricacies involved and its good to know that some people do. I've just recently converted about 45 Phase I IPSEC connections to AES128-GCM, AES-XCBC, DH Group 31 and similar Phase II. Most of them were 3DES MD5 G2, without PFS at Phase2 so I think it's an improvement! A big surprise to me was a major drop in CPU usage at the "hub" end. The hardware doesn't have AES-NI yet but it will soon be replaced and your notes have accelerated that change up my stack of stuff to do.

Sadly some of our peers seem to have forgotten to actually read the thesis first before shootin' wildly. I have quite a low Slashdot number and used to hang out there back in the day - nothing is new in the world!

Again, thank you for sharing.

dogma1138 · on Feb 2, 2021

This blog post isn’t particularly good more like rehashed rambling.

AES-GCM can be tricky to use if you implement it yourself, same goes for all other ciphers it’s perfectly fine when you use a proven implementation of it that ensures that it doesn’t go past its best before date for a given secret key.

I don’t understand why they made such a big deal of cache timing attacks, yes AES and any other cipher/cryptographic system that have time variable operation may be susceptible to timing attacks, and whilst ChaCha doesn’t have time variable operations (tho it doesn’t guarantee that a given cryptographic system implementation won’t introduce them) it’s still quite susceptible to fault injection attacks which are arguably much easier to execute on the CPUs that most phones and small devices run than cache timing attacks.

Overall when your threat model for a personal device includes a compromised application a lot of security assumptions should be thrown out of the window, it’s far more likely to compromise the OS itself than try to compromise the hardware on a phone specifically you’ll run out of battery before you’ll be able to do a successful cache timing attack anyhow.

If the blog was AES-GCM is easy to cock up so remember that nonce is a nonce for a reason and you should be using improvements over AES-GCM like AES-GCM-SIV it would’ve been fine.

How it is it reads like not particularly well constructed I R VERY SMART post.

Especially the quantum issues that primarily affect real time attacks, not offline ones which would allow you to decrypt historic data with ease.

If you have a drive encrypted with AES even “only” 128bit it’s probably going to be safe for a long while even in a post quantum world.

todd8 · on Feb 2, 2021

Back in the 80's I worked for IBM and invented a counter-mode based stream cipher. I did it to speed up overall network throughput in the presence of bursty traffic. The lawyers wanted it patented, so I agreed to meet with them and in their office, while working on the wording and patent claims they wanted me to add a bunch of features.

I just wanted to patent the basic idea of using counter-mode to generate a stream cipher from a block cipher. They said that it seemed like such a small idea, couldn't we allow it to generate simultaneous multiple streams, etc. So it was patented, but the patent is a bit of a mess with some half-baked designs for multiple streams. This was absolutely the worst way to design a crypto system: design one yourself while in the patent lawyer's office.

It still covered the basic idea of using a block cipher and a counter to generate a stream cipher, but I think the good stuff was buried in just a few of the claims. That's all ancient history now since the patent would have expired decades ago.

dogma1138 · on Feb 2, 2021

You can run any block cipher in CTR, there are 3DES CTR implementations out there, same goes for 3DES CFB/OFB modes. The only one that got wide adoption I think is CBC since ECB was always known to be well a problem.

Counter modes were petty well known in the 80's already I'm not sure what the patent could've been applied for but then it's IBM and their lawyers can probably find a way to patent waking up in the morning.

todd8 · on Feb 2, 2021

Yes, counter mode wasn’t novel at the time. The actual idea was related to encryption of network traffic. Back then, networks ran faster than encryption (processors didn’t have encryption hardware support). Since many networks have bursts of transmission the idea was to encrypt a counter with buffering of the results. This buffered stream of crypto values would then be XOR’d with the data when it arrived. The xor of course can run at full network speed since the encryption was done while during an idle time of the network.

The method takes advantage of network idle time so it doesn’t improve maximum bandwidth but instead reduces the average latency of a network not running at full bandwidth.

segfaultbuserr · on Feb 2, 2021

It's indeed a potential concern, basically the quantum edition of the Sweet32 attack. History will repeat itself (Although I wonder how much pre-quantum AES-256 traffic will really be vulnerable in practice. I think an interactive attack will be very unlikely, the concern is mostly pre-existing ciphertext. I guess most OpenPGP traffic will probably be secure? And even HTTPS will be okayish as long as you don't send too much data in a single connection? It would be an interesting study...) Fortunately, start using ChaCha20 or XChaCha20 is easy, unlike inventing and deploying a new public-key cryptosystem. So symmetric ciphers are still an "easy" problem.

[0] https://sweet32.info/

dogma1138 · on Feb 2, 2021

Historic traffic isn’t going to be much of an issue at all, as for AES-GCM specifically then nonce reuse will still likely be an issue well before a QSweet32 attack will be.

As for ChaCha et al, it’s still not that easy to start using it.

Not that many proven standardized implementations exist, hardware acceleration and performance issue are still there, the amount of research on these ciphers is still dwarfed by the amount of research done against AES and since these aren’t industry standards it’s harder to implement them in systems that require wide interoperability and are pretty much a no go in any industry that has its own standards.

Using extended ChaCha is cool for your side project, but try to implement it to say protect card payments in a system that needs to go through a PCI certification for example.

centimeter · on Feb 2, 2021

The factual content of this blog is fine but the delivery is horrible. Why would someone randomly intersperse drawings of their character in the middle of an article about AEAD constructions?

paul_f · on Feb 2, 2021

Maybe a dumb question but wouldn't a large enough key length be sufficient even in a quantum computing era?

tialaramex · on Feb 2, 2021

For symmetric encryption, as your parent explained, yes, simply double the key length and you're fine. (In practice you might get away with less than double because it seems unlikely such quantum computers, even if eventually constructed, would be similarly cheap to conventional computers)

But for asymmetric (public key) encryption the larger key lengths barely help, so you'd need infeasibly enormous keys. e.g. terabyte RSA keys have been "proposed" facetiously for this purpose.

Thus, Post Quantum Encryption research is mostly interested in finding solutions for two asymmetric problems: Key Agreement (a replacement for today's Elliptic Curve Diffie Hellman) and Signatures (a replacement for things like ECDSA) that would survive an attack by a larger quantum computer able to execute Shor's algorithm.

jgalt212 · on Feb 3, 2021

hence, 3DES is still in active use in industry.

RcouF1uZ4gsC · on Feb 2, 2021

triple-DES has a 112bit key space.

So it would take 26 hours * 2^56 which is a very long time.

UPDATE: A quick calculation on Google yields:

(2^56) * 26 hours = 2.13727751 × 10^14 years

LeoPanthera · on Feb 2, 2021

2̵6̵^̵3̵ ̵=̵ ̵1̵7̵,̵5̵7̵6̵ ̵h̵o̵u̵r̵s̵ ̵=̵ ̵2̵ ̵y̵e̵a̵r̵s̵

I am wrong.

segfaultbuserr · on Feb 2, 2021

It's a great underestimation. Breaking 3DES is not 3 times more complex than DES, but 2^56 times more complex. 26 x 2^56 = 1873497444986126336 hours. It's why we still trust 128-bit encryption when the total computing power of humanity already exceeded 2^80 operations per seconds.

It's frustrating that I has to refute the argument many times: "If 80-bit is insecure today, 128-bit will be insecure soon any cryptosystem can only guarantee a few decades of security because Moore's Law..." No, it's not how it works (although quantum computers will make 128-bit insecure, but the solution is already available today: 256-bit). The human brain is not wired to understand exponential growth.

chmod775 · on Feb 2, 2021

>It's frustrating that I has to refute the argument many times: "If 80-bit is insecure today, 128-bit will be insecure soon any cryptosystem can only guarantee a few decades of security because Moore's Law..."

I believe you are doing the people you are talking about an injustice here, because assuming Moore's Law held, we would be doubling computing power every two years (well, transistors really).

Doubling... what comes to mind... Ah yes! Bits! Because with every bit you are doubling the number of keys, meaning that you could crack an 81 bit key today in the same it would have taken you to crack an 80 bit key two years ago. So that gives you 100 years, or about 10 decades, to go from 80 bit to 128 bits.

Or, in other words, Moore's law is also about exponential growth.

So if those people you quoted did indeed say "a few decades" they were right on.

tialaramex · on Feb 2, 2021

> assuming Moore's Law held, we would be doubling computing power every two years

So the correct realisation here is that Moore's law (an observation by an engineer) doesn't trump laws of physics.

The transistors Moore was talking about have to be made from something. When they're made of a lump of material you can actually see under a microscope this feels both very real and as if it could be shrunk indefinitely. Just keep cutting that material in half!

But it can't. Matter is made of atoms. If you double the number of transistors you must halve the number of atoms in each transistor. Today's transistors have a few hundred atoms in them. Guess what happens when there's one atom in each transistor and you try to halve that? There's no such thing as "half" an atom, what you've got there isn't an atom any more, and so what you're making isn't a transistor.

The argument that we will never need larger symmetric encryption isn't based on ignorance of Moore's law, it's based on knowledge of the laws of physics. You can't make a 256-bit AES cracker by "just" converting the entire planet into Computronium, that's not enough compute power.

segfaultbuserr · on Feb 2, 2021

I don't agree that a century is "a few decades", my understanding of a "few decades" (and I believe, the person who was discussing this problem with me) is no more than 50 years. If you are clearly talking at the time scale of a century, I'll have no problem with this statement.

29athrowaway · on Feb 2, 2021

Is anyone still using DES or Triple DES?

dogma1138 · on Feb 2, 2021

Plenty of legacy systems still do, also it might be useful to recover historic data or credentials.

dragontamer · on Feb 2, 2021

Hmmm...

The paper "A Fast New DES Implementation in Software" calculated 64-parallel DES Round across a 64-bit register in 1,040 DEC Alpha instructions. There are 16 rounds per DES, and then another 2,500 clock ticks to convert the bitwise representation back into normal form.

The key to understanding things:

> The full circuit of DES contains about 16000 gates (including the key scheduling, which costs nothing), and thus we can compute DES 64 times in about 16000 instructions on 64-bit processors.

Each logic-gate can be seen as an AND/OR/XOR instruction on a 1-bit computer. Use 64-bit wide AND/OR/XOR to do 64x calculations in parallel. The modern equivalent would be 256x in parallel (AVX2 for Threadripper), AVX512 for Skylake-X, and 1024x in parallel for NVidia Ampere.

I dunno what kind of post-processing would need to be done, so I'll estimate an even 20,000 clock ticks per 64x DES result on the old DEC Alpha.

A modern GPU can probably execute 1024-bit per clock tick (AMD RDNA / NVidia Ampere 32-way x 32-bit), while a modern AVX512 CPU can execute 512-bit at a time. And AVX2 is 256-bit at a time (but probably faster: AVX2 / AVX512 execute XOR / ADDs at a rate of MORE than one clock tick per clock cycle).

The paper was very innovative: the SBox was a 100-instruction long sequence of AND / OR / XOR instructions. Avoiding the cache probably is a good thing, but I do wonder if there's a way to benefit from memory somehow.

* ~80 AVX2 clock ticks per result (20k clock ticks / 256-bits)

* ~40 AVX512 clock ticks per result (20k clocks / 512 bits)

* ~20 GPU clock ticks per result (20k clocks / 1024 bits)

So a 64-core AMD at 4GHz probably could do ~3.2-billion DES attempts per second.

A 216 SM NVidia A100 at 1.4 GHz probably can do ~15-billion DES attempts per second.

---------------

Hmmmmmm.... the results this page claims is ~16-billion DES attempts per second. Which does seem to be approximately what I'd expect a concerted effort on an NVidia A100 would get. I'm doing very "napkin-level" math here with tons of assumptions.

The SBox-lookup (100-AND/OR/XOR gates) might be better run on AVX / AVX512 however, because AMD Threadripper can do 4x simple AND/OR/XOR instructions per clock tick, and IIRC Intel AVX512 can do 3x simple bitwise instructions per clock tick. With the proper "cut dependencies", an SBox probably can be really executed in under 40 clock ticks instead of 100. CPUs could very well cut out hundreds of cycles from the above estimate.

The S-Box is very curious for DES: it is a non-reversible 6-bit lookup with a 4-bit result.

tait · on Feb 2, 2021

ir-Relevant XKCD: https://xkcd.com/538/