I mean, why not tell everyone our password hashes? (theobsidiantower.com)
296 points by jorkro on July 14, 2017 | 151 comments



PSA to everyone responding to the title: please RTFA, it's sarcasm.


No, by all means share the MD5 hash of your passwords. After all it's a one way hash. /S


It's my understanding that even an MD5 hash of a not-terrible password is still virtually impossible to crack, is that wrong?

Here's an md5 sum of a not-that-great password I just made up. It's 14 characters long, but has plenty of guessable features. Is it crackable?

1cf016ea3cb1f2aa2ccb59c196d0e704
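
For anyone who wants to test guesses against that hash, a minimal sketch using Python's standard `hashlib` could look like this. The helper names `md5_hex` and `matches` are made up for illustration, and UTF-8 encoding of the password is an assumption.

```python
import hashlib

def md5_hex(candidate: str) -> str:
    """Return the hex MD5 digest of a UTF-8 encoded candidate password."""
    return hashlib.md5(candidate.encode("utf-8")).hexdigest()

def matches(candidate: str, target_hex: str) -> bool:
    """True if the candidate hashes to the target digest."""
    return md5_hex(candidate) == target_hex.lower()

TARGET = "1cf016ea3cb1f2aa2ccb59c196d0e704"  # the hash posted above
print(matches("password", TARGET))
```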


In reality, that would be really easy to crack (measured in minutes, as others pointed out).

However, any possible password with a standard printable ASCII character set will typically be found in Rainbow tables up to 10 characters long making expensive cracking unnecessary. [not quite right see edit]

Rainbow tables are just giant tables where the key is the hash and the value is the string that generated it.

However, your example being 14 characters long is a bit long to be in most readily available rainbow tables.

This is why using salts and peppers is incredibly important regardless of what hash you use.

Edit: minor(ish) correction to the previous sentence. Full alphanumeric with punctuation and digits is available readily in smaller password lengths but the 10 character long datasets seem to be mostly only lower case characters and digits.


> However, any possible password with a standard printable ASCII character set will typically be found in Rainbow tables up to 10 characters long making expensive cracking unnecessary.

Really? Storing every possible 10 character long printable ASCII password plus its MD5 hash would require approximately 1.5 zettabytes[1].

[1] 95^10 * (16+10)
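
The footnote can be checked with a few lines of Python. This is only the arithmetic from the comment (95 printable characters, 10-character passwords, 16 bytes of digest plus 10 bytes of password per entry); the variable names are illustrative.

```python
CHARSET_SIZE = 95            # printable ASCII characters
LENGTH = 10                  # password length considered in the comment
BYTES_PER_ENTRY = 16 + 10    # raw MD5 digest plus the 10-byte password

entries = CHARSET_SIZE ** LENGTH
total_bytes = entries * BYTES_PER_ENTRY
zettabytes = total_bytes / 10**21   # decimal zettabytes
print(f"{entries:.3e} entries, {zettabytes:.2f} ZB")
```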


Rainbow tables are formed by chains of passwords and their hashes. The rainbow table only includes the ends of the chains, so you can throw away the middle of the chain.

Rainbow tables are a tradeoff between storing every hash, and generating them during cracking. You get to pick how much space you want to spend to speed up cracking.


>However, any possible password with a standard printable ASCII character set will typically be found in Rainbow tables up to 10 characters long making expensive cracking unnecessary.

Umm what? Even assuming a limited set of ASCII i.e. Base64, on what magical medium do you suppose a 64^10 rainbow table is stored?


Any medium really. Rainbow tables are compressed (by throwing away most of the hashes). The amount you throw away determines how long it takes to crack.

For example, a rainbow table might use chain lengths of 10,000. This means that for every 10,000 hashes calculated, only 1 (really 2) are kept. Each chain ends up as a row in the table, which is then sorted. When cracking, the target hash is hashed and reversed up to 10,000 times looking through the table.

The more compression, the less space needed, but the longer the lookup. The original Windows XP rainbow table cracking CD published along with the rainbow table paper was only ~500 MB, but was able to crack pretty much every Windows password.
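
To make the chain idea concrete, here is a toy sketch over a deliberately tiny keyspace (three lowercase letters). The reduction function, the chain length of 50, and all names are assumptions chosen for illustration; real tables use far larger keyspaces and tuned reduction functions.

```python
import hashlib
import string

CHARSET = string.ascii_lowercase   # toy keyspace: lowercase only
PW_LEN = 3                          # 26**3 = 17,576 possible passwords
CHAIN_LEN = 50                      # hashes computed per stored row

def md5_bytes(password: str) -> bytes:
    return hashlib.md5(password.encode()).digest()

def reduce_hash(digest: bytes, position: int) -> str:
    """Map a digest back into the keyspace; varies with chain position."""
    n = int.from_bytes(digest, "big") + position
    chars = []
    for _ in range(PW_LEN):
        n, idx = divmod(n, len(CHARSET))
        chars.append(CHARSET[idx])
    return "".join(chars)

def chain_end(start: str) -> str:
    """Walk one chain and return only its final password."""
    pw = start
    for pos in range(CHAIN_LEN):
        pw = reduce_hash(md5_bytes(pw), pos)
    return pw

def build_table(starts):
    """Keep only end -> start; the middle of each chain is discarded."""
    return {chain_end(s): s for s in starts}

def lookup(table, target: bytes):
    """Try every position the target hash could occupy in a chain."""
    for pos in range(CHAIN_LEN - 1, -1, -1):
        # Assume the target is the hash at `pos`; walk forward to the chain end.
        pw = reduce_hash(target, pos)
        for later in range(pos + 1, CHAIN_LEN):
            pw = reduce_hash(md5_bytes(pw), later)
        start = table.get(pw)
        if start is None:
            continue
        # Regenerate the chain from its start to recover the preimage.
        candidate = start
        for step in range(CHAIN_LEN):
            if md5_bytes(candidate) == target:
                return candidate
            candidate = reduce_hash(md5_bytes(candidate), step)
    return None

table = build_table(["abc", "xyz", "qqq"])
print(lookup(table, md5_bytes("abc")))
```

Raising `CHAIN_LEN` shrinks the table for the same coverage but makes each lookup do more hashing, which is the trade-off described above.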


An MD5 rainbow table for lower alphanumeric which covers passwords of length 9 is 63 GB. Length 10 is 316 GB. You can see where this is going. It's important to note the caveat upfront: lower case only, plus numbers. No upper case, no symbols.

http://project-rainbowcrack.com/table.htm


That is just one rainbow table, but there are many others. By modifying the chain length, you can make the table arbitrarily small. The example commands on that site use a chain length of 3800, but it could be raised to 1 million.


If you know the md5 hash, why not log in with another password that generates the same hash value? Does it actually need to be the same password?


It will work, but a common reason to crack passwords is that people use the same password on multiple sites. Getting the wrong value may work to log into that specific site but will not work where the user reuses the real password elsewhere (unless that other place is using the same hash algorithm).


Yes, that will also work, hence why rainbow tables don't need to contain more entries than the possible set of hashes.


That's called a collision


Yes, that is incorrect. A GPU accelerated tool like HashCat can crack that password with a fairly small hardware footprint. Here's an article involving a 25 machine cluster which would reverse your hash in about 12 minutes -- regardless of your password features. http://www.zdnet.com/article/25-gpus-devour-password-hashes-...

This isn't nation-state level cost. Individuals could afford this level of hardware. Many individuals have access to systems of this size, for example through botnets, schools, spare junk in the local IT department closet, etc.

It's very reversible.


Don't forget spinning up an AWS cluster for 12 mins would not cost too much.


Well, you would pay for the full hour regardless of how long the machines were up. GCP would give you to-the-minute pricing, however. But you're right, even a full hour is really cheap.


Uhh, 14 characters long.

Call it even ~30^14 / 348 billion per second = 1,374,416,379 seconds. So, they can break passwords with some pattern to them, but not really brute force em.


That's only 43 years and it was only 25 GPUs. Bump that up to 12000 GPUs and you could do it in about a month.

It's also an unsalted hash, so you could brute force an unlimited number of passwords at the same time without additional resources. Someone with a budget of a few million dollars could break every password in the world in a month.

So in other words, definitely don't publicize unsalted MD5 hashes of your passwords.
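
The numbers in this subthread can be reproduced with a short calculation. It assumes the 348 billion guesses per second quoted for the 25-GPU rig and perfectly linear scaling with GPU count, which is an idealization.

```python
KEYSPACE = 30 ** 14                 # ~30 symbols, 14 characters
RATE_25_GPUS = 348_000_000_000      # MD5 guesses/second quoted for the 25-GPU rig

seconds = KEYSPACE // RATE_25_GPUS
years = seconds / (365 * 24 * 3600)

# Assume throughput scales linearly with GPU count (an idealization).
days_12000_gpus = seconds * 25 / 12_000 / 86_400
print(seconds, round(years, 1), round(days_12000_gpus, 1))
```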


We can't really store that many passwords. Even just 30^14 is ~500 exabytes at one byte per entry, and MD5 is 16 bytes at a minimum, so you need 8000+ exabytes = 8,000,000,000+ terabytes.

Note: only "In the third quarter of 2016, approximately 144.6 million hard disk drives were shipped worldwide" aka something like all HDD ever produced might fit that much data.

PS: Plus that 30 was low balling for a full search space it's 26 (lower case letters) + 26 (upper case letters) + 10 (numbers) + some number of special characters. So, ~100^14 or ~20,907,515x as large aka 10^17 TB.


You don't have to store all of the possible passwords, just the hashes for all of the real passwords that you are trying to crack.


If you mean after generating a hash you can compare it to your list of hashes then that saves time for cracking every password, but it's slower than cracking one password. Assuming you wanted to crack every Facebook password this way, that's not going to fit into cache and a binary search into RAM is actually rather slow. Yea, you could have more computers doing just this, but it's much slower than computing the hash in the first place.


You wouldn't need to do a binary search. The values are already MD5 hashes. You could do a hash table lookup in just a few CPU cycles since computing the hash is free.


Good point, though at a minimum you need to touch main memory twice, which is 200+ clock cycles. And in practice that O(1) has a much worse constant factor.


There's no need to store anything. You have a list of hashes you'd like to reverse. You compute the entire search space and compare each.

No storage, other than the hashes you're attempting to reverse -- which for a data dump from even a large site like Yahoo wouldn't be very large at all. Megabytes.
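
A rough sketch of that approach: keep only the target hashes in a set, enumerate candidates once, and test each digest for membership. The function name, the tiny charset, and the sample targets are made up for illustration; a real attack would run on GPUs rather than in Python.

```python
import hashlib
from itertools import product

def crack_many(target_hexes, charset, max_len):
    """Enumerate the keyspace once and test each digest against every target."""
    remaining = set(target_hexes)          # only the leaked hashes are stored
    found = {}
    for length in range(1, max_len + 1):
        for combo in product(charset, repeat=length):
            candidate = "".join(combo)
            digest = hashlib.md5(candidate.encode()).hexdigest()
            if digest in remaining:        # average O(1) set membership test
                found[digest] = candidate
                remaining.discard(digest)
                if not remaining:
                    return found
    return found

targets = [hashlib.md5(p.encode()).hexdigest() for p in ("cab", "bad", "zzzz")]
print(crack_many(targets, "abcd", 3))
```

Because the hashes are unsalted, each computed digest is checked against all targets at once; with per-password salts the enumeration would have to be repeated for every salt.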


Not at the same hash rate. Sure you can do binary search, but 100^14 or even 30^14 such searches is not fast. So, it's about as fast if you want 2 passwords but not 2 billion passwords.


It wouldn't be a binary search, the values are already hashes, so you could do a hash table lookup very cheaply. MD5 isn't great by cryptological standards, but it is extremely robust by hash table standards.

While "no extra resources" might not have been strictly accurate, the lookup would be practically free compared to the time it takes to compute the hash.


In the example I gave, the hashes don't fit on GPUs, and you would need ~64GB of RAM to do the hash table lookups. You can scale this across multiple machines, but even the ideal case of 2 lookups to main memory * 100^14 is slow and thus expensive.


Keep in mind the article is from 5 years ago.


Sure, but "25 machine cluster which would reverse your hash in about 12 minutes -- regardless of your password features." is clearly wrong, as GPUs are not that much faster and the article is talking about 25 GPUs.


You're right, that should read "25 machine cluster which would reverse your hash in about 12 minutes -- assuming similar password features"


I see a lot of people saying how easy it would be to crack this, but I don't see it cracked...


"Very easy to crack this! You just need a bajillion dollars, 5000 AWS instances and a couple minutes!"

Not surprised why no one tried yet.


I think it's a reasonable point. There are lots of armchair experts saying that md5 is broken, unusable, and anyone can reverse it, and here we are 14 hours later and nobody has proven it. Given that the claim was 12 minutes on a 25 machine cluster, that would imply 300 minutes of compute time, which is 5 hours. This is Hacker News; if it's not going to be done here, then no armchair enthusiasts are going to do it.

If someone can point me towards the tools and how to set it up, I'll leave my gtx1070 at it overnight and see.


https://hashcat.net/hashcat/

I haven't used it, but FAQs, Forums, wikis, and tutorials are all out there.


I will give it a try this weekend, thanks!


That's pretty much correct, yeah. Due to exponentiation, length is almost everything in password security. Which means there's going to be a bunch of lengths at which brute force cracking is trivial, and then a very sharp rise in complexity, after which brute force cracking quickly becomes astronomical, and then absolutely impossible.

If you look at the current cracking benchmarks of GPUs (https://gist.github.com/epixoip/a83d38f412b4737e99bbef804a27...), there is an easily quantifiable difference between bcrypt and MD5: 21 bits. (https://www.wolframalpha.com/input/?i=log2(200*%5E9)-log2(10...)

That means under current GPU architecture, bcrypt is basically like "adding 3-4 characters (or 1.5 diceware words)" for free to your password. Can you basically just add 3-4 characters to your password? Sure, but not without user friction, and certainly you can't think that way as the developer of the system, because you're trying to give a small leg up to even the most vulnerable by salting and bcrypt/PBKDF2/Argon hashing.

What about theoretical limits? Well, there is another way to approach this: Landauer's principle (https://en.wikipedia.org/wiki/Landauer%27s_principle), which considers the theoretical minimum energy of a bit flip of information - so this even covers future computing technologies. Even if you used up all available mass-energy in the entire sun, it is only theoretically possible to perform 2^225.2 operations (https://security.stackexchange.com/questions/6141/amount-of-...). 225 bits of entropy is roughly a 35-character (printable ASCII) password.

(Note that you can't do this with MD5 - it has only a 128-bit hash space, before preimage attacks, the best of which lowers it to 123 bits).

So the lesson is: use slow hashes to give some protection to the vulnerable and people whose password complexity is "on the edge". Use a password manager so that the rest of your passwords can be comfortably > 128 bits in complexity, without reuse. And then forget about passwords because after that, every other part of the security system becomes more important.
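
The length-to-entropy conversions used above can be sketched as follows, assuming passwords drawn uniformly at random from the 95 printable ASCII characters. The function names are illustrative.

```python
import math

def bits_of_entropy(length: int, charset_size: int = 95) -> float:
    """Entropy of a uniformly random password over the given charset."""
    return length * math.log2(charset_size)

def chars_needed(bits: float, charset_size: int = 95) -> int:
    """Smallest length whose entropy reaches the requested number of bits."""
    return math.ceil(bits / math.log2(charset_size))

print(chars_needed(225.2))             # length matching the Landauer estimate
print(round(bits_of_entropy(14), 1))   # a 14-character random password
```

Human-chosen passwords have far less entropy per character than this uniform model suggests, so these figures are upper bounds.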


A fantastic overview - clear and informed. Thanks very much for this.


hunter2hunter2


No. Just for reference. :)


Even SHA512 and bcrypt, totally uncrackable! /S


Every time someone says or writes "bcrypt", the GPU prices go up $10.


Gotta keep those hashcat farms in business :-)


One reason: you'd be surprised how many companies allow entering the hash as an alternative password to login to customers' accounts in production. Lazy method for customer support teams who don't have support tools to access customer information. Also frequently done to allow developers to debug problems on a customer's account when a bug cannot be reproduced elsewhere.

If such a company's database of hashed passwords is leaked, then an attacker doesn't even have to crack the hashes - the hash itself is a valid version of the password. Yet I've seen this behavior at multiple companies; only one of them pushed back against my request to remove that "feature", and I didn't stay with them much longer after that.
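
A minimal sketch of why that shortcut is equivalent to storing plaintext. Both login functions are hypothetical, and unsalted SHA-256 is used only to keep the example short, not as a recommendation.

```python
import hashlib
import hmac

def stored_hash(password: str) -> str:
    # Unsalted SHA-256 only to keep the sketch short; not a recommendation.
    return hashlib.sha256(password.encode()).hexdigest()

def login_with_backdoor(submitted: str, stored: str) -> bool:
    """The anti-pattern: the stored hash itself is accepted as a password."""
    hashed_ok = hmac.compare_digest(stored_hash(submitted).encode(), stored.encode())
    raw_ok = hmac.compare_digest(submitted.encode(), stored.encode())
    return hashed_ok or raw_ok

def login(submitted: str, stored: str) -> bool:
    """Only a value that hashes to the stored digest is accepted."""
    return hmac.compare_digest(stored_hash(submitted).encode(), stored.encode())

leaked = stored_hash("correct horse")
print(login_with_backdoor(leaked, leaked), login(leaked, leaked))
```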


What would it take to get you to name and shame? That whistle pretty likely needs to be blown on the one that didn't agree to abandon such a policy.


Small private company, nobody's ever heard of it. There are a lot of shady ones out there.


Agree. MANY small development shops will build these kinds of backdoors into systems because they don't have the skill or the resources to build proper customer support features.


Remember "Chuck Norris"...


Microsoft Windows does this. NT hashes are password equivalent:

https://en.wikipedia.org/wiki/Pass_the_hash


I....have no words. Effectively clear-text password storage. That should be criminal by now.


I wonder about using HMAC with a secret key that can be changed for things like this.


Even some software does that. You can connect to a SAS server using your hashed password, which is stored in an XML config file on your computer when using EG.


That inspired this idea: make all password databases public, in an encrypted form. Just post them in a standard location. This is to get rid of the fiction that these are ever private and to eliminate an incentive to break in.


> make all password databases public, in an encrypted form

That is a terrible idea because agencies like the NSA or GCHQ with unfathomable resources and techniques will crack them and never tell anyone. Then you'll have a compromised account, the provider won't know, the user won't know. Then the agency would be able to compromise the account and publish whatever they wanted as that identity.

Given there are tricks to mask an IP address, or they straight up tap the wires, that's a #1 way to character assassinate any dissident or someone who they dislike.

> This is to get rid of the fiction that these are ever private and to eliminate an incentive to break in.

And why do you assume criminals wouldn't also try to gain access to the systems? Passwords aren't typically the valuable information in a system, they're there to protect the more valuable data.


>>That is a terrible idea because agencies like the NSA or GCHQ with unfathomable resources and techniques will crack them and never tell anyone.

As opposed to the current situation where they can just get the info from Facebook/Google/etc. directly? At this point you may as well assume state actors have access to anything you put on the internet.


> That is a terrible idea because agencies like the NSA or GCHQ with unfathomable resources and techniques will crack them and never tell anyone.

One, an agency with truly unfathomable resources and techniques is going to be able to get into your network even if you don't post the hashes publicly.

Two, all information we have (e.g., the Snowden leaks) implies that NSA/GCHQ/etc. are at best only slightly ahead of academia in terms of cryptanalysis. The only real mathematical revelation we had is that they did in fact deliberately compromise Dual_EC_DRBG, which the academic community had suspected almost since the standard was introduced, and which didn't even use any mathematics unknown to the public (the academic community knew how to build similarly back-doored systems, which is how they recognized such a system). It turned out that they had focused more on identifying and exploiting operational weaknesses (see also, "I Hunt Sys Admins") and not on discovering cryptographic attacks that the public didn't know about - so, again, they're already on your network.

Three, and most importantly, I'm in the US. I'm subject to the laws of the US. The US government is outside of my threat model, because they can just send me a national security letter whenever they want, and I can't tell my users. Or if they don't want to do that, they can just plant a mole. I certainly neither interview sysadmins well enough to tell if they're secretly working for the government, nor have I been interviewed as a sysadmin well enough for anyone to tell, either. (Remember that the mole could be an actual government employee who believes what they're doing is right, or just a smart kid who took a plea deal for buying some nootropic on the dark web.)

My threat model is everyone else. If the government wants to ruin one of my customers' lives, they can already do that, they don't need to hack me. My threat model is the mass media, my customers' abusive exes, random extortionists in Eastern Europe or somewhere paid by cryptocurrency, bored teenagers whose sense of morality hasn't yet developed to realize that SWATting people is a problem, etc.

Designing secure systems to be secure against the NSA is an extremely hard problem, and if you focus on solving it, you're very likely not to design systems that are secure against the actual attacks your users are at risk from.


There is a huge difference between actively breaking into every network and accessing all password data, and downloading some public file somewhere, with everybody's passwords in it.

The NSA is going to avoid the former as much as it can, because there is a huge chance they get burned in some way. Anything that they can passively slurp is a huge win for them.


> That is a terrible idea because agencies like the NSA or GCHQ with unfathomable resources and techniques will crack them and never tell anyone.

Chances are they already have 'em, from a compromised employee, a zero-day exploit, or a SQL injection hole. Far more likely than them having cracked bcrypt.


I doubt they have exploited every single password DB in existence.


Given what we know about older techniques, it's safe to assume that many intelligence agencies hold zero-days for most popular network and server gear. From my personal experience interacting with some of the people who use these tools, exploiting networks is neither free nor particularly difficult.


If they want it, they can get it.


That's still a better status quo than passively having access to all of it.


Make it blockchain-based, and you'll likely have some VC funding by tomorrow morning.


Who wants VC funding when you can raise 8 figures in an ICO! /s


It's only 1 PM in California. This will be funded before sundown.


wait ...


Instead of password hashes, why don't we just use Argon2id as a KDF to produce an Ed25519 keypair, and then publish the (salt, memcost, opscost, Ed25519 public key)?

I can throw this into a structure indistinguishable from a blockchain if any VCs want to invest ;)


As long as you call it a blockchain, I'm in!


Needs more FIPS to be enterprise ready.


Okay, let's throw in an invalid curve attack vulnerability and call it even. I'll contact NIST for a grant. Let's get this ball rolling!


Needs more IPFS to board the hype train.


Why Argon2id? Isn't Argon2i what the creators suggest?


Sufficient side-channel resistance for real world uses, sufficient GPU resistance. It's the best of both worlds. It's also going to be the libsodium default in the next release.

It's literally two passes that are memory independent, then two that are memory dependent, when r = 4.


Yes, KDF->pubkey seems like only sane way forward. Any discussion over old school passwords is a waste of time.


You've just described SQRL


No I haven't. I didn't invent PBKDF2-Scrypt along the way.


That's what I thought when I read the title.

There's probably some reason it wouldn't work. Dictionary attacks are an obvious possibility; if your password is "password" the only thing you're depending on is nobody being able to get at the hashes. It might also expose password reuse, though nonces/salts might solve that. Hrm.

This smells a bit like public crypto - public database of public keys (hashes), on login you're challenged to produce proof that you have the private key (the password), and the transformation provides you a means to do that without exposing the private key itself.


If you're using a keyed hash, then dictionary attacks can't be parallelized.


Wouldn't you want to use a salt instead?


You use the 'salt' as the key in the keyed hash.

The difference occurs mostly when you start chaining hashes. In that case, a salt is only relevant in the first hash, whereas the keyed hash needs the key at every hash round.


> You use the 'salt' as the key in the keyed hash.

I thought the two schemes were conceptually different, leading to different engineering tradeoffs: With salts, you assume the attacker can gain access to it. With keyed-hashing, you simply have a second piece of equally-secret information, and you hope it doesn't get leaked.
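
The distinction might be sketched like this with the standard library. A single SHA-256 or HMAC call is used purely to show where the salt and the key enter; it is not a slow password hash, and the function names are made up.

```python
import hashlib
import hmac
import secrets

def salted_hash(password: str, salt: bytes) -> bytes:
    """Salt is stored next to the digest and assumed known to an attacker."""
    return hashlib.sha256(salt + password.encode()).digest()

def keyed_hash(password: str, key: bytes) -> bytes:
    """Key (a "pepper") is kept outside the database and must stay secret."""
    return hmac.new(key, password.encode(), hashlib.sha256).digest()

salt = secrets.token_bytes(16)      # unique per user, public once the DB leaks
pepper = secrets.token_bytes(32)    # shared secret, e.g. held in app config

record = {"salt": salt, "digest": salted_hash("hunter2", salt)}
tag = keyed_hash("hunter2", pepper)
print(len(record["digest"]), len(tag))
```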


That's essentially what brainwallet did. Your bitcoin private key was deterministically generated by your password. So using a list of common passwords, private keys could be pregenerated by attackers. So anyone who "created" a brainwallet with a weak password would get their money stolen the second they deposited it. Brainwallet got so much hate from people that used weak passwords and got their money stolen that it was shut down.


The foundation of modern ecommerce rests on the belief that your credit card information won't be stolen and used to cause great harm. Similar belief systems exist for online dating, social media, etc.


Isn't that more or less what blockchain-based encrypted storage is? I feel like I saw an HN post on something like that within the last couple months.


Isn't that more of public key crypto? So the secret isn't just a password, but also a key. I think it'd be a lot harder to crack.

Really it points to the idea that we should be moving in that direction for auth. Here's one project I've heard about: https://www.grc.com/sqrl/sqrl.htm


SQRL ftw! It's gonna take off any day now. That guy is prolific!

Seriously though, that's the first un-ironic reference to SQRL that I've ever seen.


They were being sarcastic.


Collisions?


In theory it could happen..

but something like this is strengthened through password stretching. I think this is good practice anyway, as it makes them much harder to brute force or dictionary attack if the data is compromised.


A non-issue with salts.


What exactly are these passwords used for? The post mentions "controlling this object in the RIPE database" but I'm missing some context necessary to understand that.


Basically you can take ownership of their IP ranges, modify routing information, etc.

Even if you took a small percentage of the IP addresses in Europe, this could have a snowball effect. You take the IP addresses belonging to a popular mail service used by other domains, then you use admin email addresses to reset passwords, and eventually Europe's internet is stolen.


It's not quite that simple. The RIPE database stores mostly administrative information, and doesn't _directly_ affect Internet routing.

In order to "steal" IP addresses (get them routed to you) you would need to buy a connection to at least one exchange point, probably several if you want all the traffic for the target to route to you and not just some traffic from some networks. You'd need to buy rackspace somewhere with a connection to the exchange point, install routers, establish BGP peerings with the exchange point (if they're doing route reflection) or with all the other major networks at the exchange.

There are multiple steps along the way where humans would look at the prefixes you were going to be announcing. This would include looking them up in RIPE, but anything more than a cursory inspection would likely reveal your ruse.

At this point it becomes more of a social engineering attack, and even if you got as far as announcing it, there are things like BGPMon that would pick up the fraudulent announcement pretty quickly and you'd likely find that the cable was pulled out of your router pretty fast.


Thank you!


Shots fired.

That ending was an incredibly well delivered stab at Deutsche Telekom. This is why I love vigilante security.


It depends on the hash type. Cryptographic hashes (MD4, SHA1, SHA256, etc.) are made to be efficient and fast to compute while password hashes (bcrypt, scrypt, etc.) are much more difficult to compute. The difference is staggering.

  john --test --format=nt
  Benchmarking: NT [MD4 128/128 X2 SSE2-16]... DONE
  Raw:	29037K c/s real, 29037K c/s virtual

  john --test --format=bcrypt
  Will run 16 OpenMP threads
  Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X3]... (16xOMP) DONE
  Raw:	5472 c/s real, 490 c/s virtual


Edit: NT hashes are one round of MD4. These are Microsoft Active Directory hashes. OpenBSD uses Blowfish hashes by default.
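
A similar comparison can be run without John the Ripper. This sketch times MD5 against PBKDF2-HMAC-SHA256 from Python's standard library; PBKDF2 stands in for bcrypt here only because bcrypt is not in the standard library, and the 100,000 iteration count is an arbitrary assumption.

```python
import hashlib
import time

def seconds_per_call(fn, repeats: int) -> float:
    """Average wall-clock time of fn() over the given number of calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats

password, salt = b"example password", b"0123456789abcdef"

fast = seconds_per_call(lambda: hashlib.md5(password).digest(), 10_000)
slow = seconds_per_call(
    lambda: hashlib.pbkdf2_hmac("sha256", password, salt, 100_000), 3)

print(f"MD5: {fast * 1e6:.2f} us/hash, PBKDF2 (100k iters): {slow * 1e3:.1f} ms/hash")
```

Absolute numbers depend on the machine; the point is the ratio between the two.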


What numbers does your rig give using hashcat and your video card?


Password "hashes" are generally just cryptographic hashes run multiple times (known as key stretching). Cryptographic hashes are also designed to be slow. A key stretching algorithm will only be slow if the underlying cryptographic hashes are sufficiently slow.


> Password "hashes" are generally just cryptographic hashes run multiple times (known as key stretching)

1. Password hashing functions are not regular hash functions run multiple times. This is not only false at a macro level (i.e. we don't just run SHA-2 several times to get something resembling PBKDF2), it's false in terms of core construction. Password hashing functions rely on fundamentally different mathematical properties than regular hash functions. It's not like 3DES and DES: a secure password hashing function requires more than just a higher iteration count.

2. "Key stretching" does not refer to running cryptographic hash functions multiple times. Key stretching refers to the act of generating a secret key from an otherwise weak passphrase or input, generally supplied by a user. You use the user's passphrase to (in effect) seed a function that outputs something much more resistant to brute-forcing. Key stretching is used in key derivation functions, but what you described is not key stretching.

3. General purpose (as opposed to password) hashing functions are designed to be fast, not slow. Take a look at BLAKE2's homepage for speed comparisons - speed is a selling point: https://blake2.net/. In addition you can read the following from the handy FAQ:

You want your hash function to be fast if you are using it to compute the secure hash of a large amount of data, such as in distributed filesystems (e.g. Tahoe-LAFS), cloud storage systems (e.g. OpenStack Swift), intrusion detection systems (e.g. Samhain), integrity-checking local filesystems (e.g. ZFS), peer-to-peer file-sharing tools (e.g. BitTorrent), or version control systems (e.g. git). You only want your hash function to be slow if you're using it to "stretch" user-supplied passwords, in which case see the next question.


Well that was embarrassing. Thanks for the corrections!

There's a Wikipedia line that could use your input (unless I'm missing why this would still be accurate).

> Key stretching functions, such as PBKDF2, Bcrypt or Scrypt, typically use repeated invocations of a cryptographic hash to increase the time required to perform brute force attacks on stored password digests.

https://en.wikipedia.org/wiki/Cryptographic_hash_function#Pa...


> > Key stretching functions, such as PBKDF2, Bcrypt or Scrypt, typically use repeated invocations of a cryptographic hash to increase the time required to perform brute force attacks on stored password digests.

This is not false, it is indeed one of the techniques that they use.
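
For PBKDF2 specifically, the "repeated invocations" are easy to see. The sketch below hand-rolls the first output block of PBKDF2-HMAC-SHA256 and checks it against `hashlib.pbkdf2_hmac`; it covers PBKDF2 only, since bcrypt and scrypt are built differently. The function name is made up for illustration.

```python
import hashlib
import hmac

def pbkdf2_sha256_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """First 32-byte output block of PBKDF2-HMAC-SHA256 (RFC 8018)."""
    # U_1 = HMAC(password, salt || INT(1)); U_i = HMAC(password, U_{i-1})
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
    result = int.from_bytes(u, "big")
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        result ^= int.from_bytes(u, "big")   # XOR all U_i together
    return result.to_bytes(32, "big")

ours = pbkdf2_sha256_block(b"password", b"salt", 1000)
reference = hashlib.pbkdf2_hmac("sha256", b"password", b"salt", 1000, dklen=32)
print(ours == reference)
```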


I'm not familiar with the term "key stretching". Is this the same thing as "work factor" in bcrypt/blowfish?


No, "work factor" is a term that (roughly) describes how computationally expensive brute-forcing the digest will be. They're associated terms, but it would be more accurate to think of the work factor as the final result of the key stretching process.


Ah I see, it's concerned with low entropy.

I believe this paper might be the origin of the term (correct me if I'm wrong).

I thought it was a good read if anyone else is interested:

https://www.schneier.com/academic/paperfiles/paper-low-entro...


It's also nice enough to mention the hashing algorithm used - MD5, just so you don't have to waste time guessing.


Passwords are broken for precisely this reason. You are operating under the fiction that permanently handing over entropy from a limited source to an untrusted party, even through a (for a time) one-way function, is ever a good idea. Please do make all password hashes public. It will finally force the move away from passwords.


Nice, they even have a REST API and web form to update the information:

https://www.ripe.net/manage-ips-and-asns/db/support/updating...


Stupid question, but what does this particular password hash unlock?


I was wondering, too. From the article,

>"I hope like me you were immediately drawn to the ‘auth’ fields. As the name implies this field contains authentication information for controlling this object in the RIPE database. RIPE supports a couple of different auth types like Single Sign On (SSO), public key cryptography, and of course md5."

It's authentication to manage that entry in the RIPE database.


It's hard to imagine worse, except maybe putting the passwords in the clear...

Single unsalted broken MD5 is a far cry from scrypt... and even scrypt is probably a bad idea with all this crypto currency hashing hardware out there, unless you have a seriously strong password.

Just don't publish hashes.


The PGP option has been the preferred option for as long as I can remember (circa 2000).


Threads discussing rainbow tables are not applicable.

These hashes are not unsalted MD5. They are md5crypt ($1$[salt]$[hash]), as found in many Unix-likes and some Cisco IOS.
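
To illustrate the layout, here is a small parser for the `$1$[salt]$[hash]` form. The example string is a made-up placeholder, not a real entry from the RIPE database, and the names are hypothetical.

```python
from typing import NamedTuple

class Md5Crypt(NamedTuple):
    salt: str
    digest: str

def parse_md5crypt(entry: str) -> Md5Crypt:
    """Split a '$1$salt$hash' string into its salt and encoded digest."""
    parts = entry.split("$")
    # Expected layout: ['', '1', salt, digest]
    if len(parts) != 4 or parts[0] != "" or parts[1] != "1":
        raise ValueError("not an md5crypt entry")
    return Md5Crypt(salt=parts[2], digest=parts[3])

# Made-up placeholder, not a real hash from the RIPE database.
example = "$1$abcdefgh$" + "x" * 22
print(parse_md5crypt(example))
```

Since the salt is part of each entry, a precomputed table for plain MD5 does not apply; every entry has to be attacked with its own salt.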


As you might have guessed, my password hash is password.


Were this true, you would have achieved a pre-image attack.


I like to think that if I ever gained the ability to execute preimage attacks, solve P=NP, etc. I would use it to troll everyone by making my passwords hash into "password", grabbing heylookicrackedpublickeyencryption.onion, and emailing prominent security researchers messages signed with their own keys.


A simple unsalted hash wouldn't work due to rainbow-tabling, and even a salted hash would be vulnerable to someone gaining unauthorized access to the salt and regenerating a rainbow table with it (although if one used bcrypt, that might be practically impossible).


There's not really such a thing as "gaining unauthorized access to the salt" when you already have the hash; the salt is just as secret as the hash, and the hash is useless as a means of authentication without the salt, so obtaining the hash, unauthorized or not, generally also means you obtain the salt. Libraries for algorithms like scrypt even usually give you one string which contains both a random salt and a hash.

You can regenerate a rainbow table which uses that salt, but you'd have to generate a rainbow table for every password, since each password has its own random salt. I don't know how rainbow tables work exactly, but I'd assume an old fashioned brute force attack or dictionary attack is cheaper than making a rainbow table for each password.


That's why you always want to generate a different salt for each password, which fully prevents rainbow table attacks.
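
A minimal sketch of per-password salting with the standard library. PBKDF2-HMAC-SHA256 and the iteration count are assumptions picked for illustration; bcrypt, scrypt, or Argon2 would follow the same store-salt-with-digest pattern.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000   # illustrative work factor, tune for your hardware

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) using a fresh random salt for every password."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)

salt_a, digest_a = hash_password("hunter2")
salt_b, digest_b = hash_password("hunter2")
print(digest_a != digest_b)  # same password, different salts
```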


ELI5


TL;DR Don't tell people your passwords, even if they're "obscured" by a hashing algorithm.


>whois -h whois.ripe.net DTAG-NIC

Wait, was that just a straight bash command? Is this installed on my computer?

>$ whois usage: whois [-aAbdgiIlmQrR6] [-c country-code | -h hostname] [-p port] name ...

Holy shit lol, that's neat.


> Wait, was that just a straight bash command? Is this installed on my computer?

Welcome to 1982..1985. This command predates bash.

https://tools.ietf.org/html/rfc812

http://minnie.tuhs.org/cgi-bin/utree.pl?file=2.11BSD/src/ucb...
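
For the curious, the protocol behind the command is small enough to sketch: open a TCP connection to port 43, send the query followed by CRLF, and read until the server closes the connection. The function names below are made up for this example, and the defaults simply mirror the `-h whois.ripe.net` invocation quoted above.

```python
import socket


def build_query(name: str) -> bytes:
    # A WHOIS request is just the query text terminated by CRLF.
    return name.encode("ascii") + b"\r\n"


def whois(name: str, host: str = "whois.ripe.net", port: int = 43,
          timeout: float = 10.0) -> str:
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_query(name))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Usage (needs network access):
#   print(whois("DTAG-NIC"))
```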


Because your password is part of your identity and is actually used to cross check during identity matching.


Leaving aside the fact that changing my password doesn't mean I have a new identity, having the hash $2y$10$/Aglzm2zpHO7m1dIv5vSp.GHPUd1D8uODn/jtBv3gpe8yS5e/D9PW doesn't tell you my password is "tinkerbell".


In these kinds of checks nobody cares what your password is, only whether it is the same one you are using somewhere else.

So a hash, unless properly salted, works just fine.

Many people actually use a single password everywhere. Or at least for similar things.


If your password is "tinkerbell", even an attacker with very few resources can probably crack the hash for it in seconds on their desktop PC.
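
A toy sketch of why that is: a dictionary attack only needs the salt, the hash, and a list of likely guesses. PBKDF2 from the standard library stands in for bcrypt here, and the salt, iteration count and wordlist are all invented for the example.

```python
import hashlib
import hmac


def derive(password: str, salt: bytes) -> bytes:
    # Stand-in for bcrypt (an assumption); 10,000 iterations is arbitrary.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000)


def dictionary_attack(salt: bytes, target: bytes, wordlist):
    """Return the first guess whose hash matches the target, else None."""
    for guess in wordlist:
        if hmac.compare_digest(derive(guess, salt), target):
            return guess
    return None


# Hypothetical leaked record: the salt comes along with the hash.
salt = bytes.fromhex("00112233445566778899aabbccddeeff")
leaked = derive("tinkerbell", salt)

wordlist = ["password", "123456", "qwerty", "tinkerbell", "hunter2"]
print(dictionary_attack(salt, leaked, wordlist))
```

The salt does nothing to slow this down for a single target; only a password that is not on anyone's wordlist does.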


My password is not "tinkerbell". It serves as an example.

My real passwords look a lot more like the hash.


Is your real password hunter2?


Nah, why even give an inch? Yes, if you properly deal with passwords, the artifact you store gives virtually no information to an attacker, but on the other hand, why give them even almost nothing?


Sure, in an ideal world: post the hashes, the salts, the hash algorithm, everything. If it's done "right" (e.g., the hash function has enough complexity), then brute force cracking, rainbow tables, etc. would take so long that it wouldn't be feasible to crack them with any volume.

Of course, you could still crack some (problem), so keeping multiple secrets hidden through obscurity (the hashes, the salts, etc.) is another layer of security.

This doesn't guarantee security, but it's certainly more secure. But it is additive: there's no reason to just use MD5 (or plaintext) because "my hashes are secret".


I was kind of disturbed that GitHub publishes every user's public key.

https://developer.github.com/v3/users/keys/

This is a different situation and public keys are not directly analogous to password hashes: there isn't a reliable way of cracking public keys in the same sense that there's a semi-reliable way of cracking hashes. But it was still strange and uncomfortable to me that they would reveal this "target" (and if there were specific key generation bugs, like RNG seeding errors, people might actually be able to crack a few of them and know that they had succeeded).

Relatedly, I was thinking about the magic crypto-cracking device in the movie Sneakers. Once they had it, they could immediately use it to log on to random network-connected services, defeating the authentication. So, how is that supposed to work? How do they automatically know what credentials would be accepted for a particular service? Are there common network authentication protocols based on public-key cryptography that have the property that the verifier tells the prover the public keys that it trusts?


You can have a GitHub specific ssh key.

  ssh-keygen -q -t rsa -b 4096 -N "passphrase" -C "mygithub@someaddress.org" -f ${HOME}/.ssh/.ghub

then in your ${HOME}/.ssh/config

  IdentitiesOnly yes
  Host github.com
    Hostname ssh.github.com
    Port 443
    User git
    IdentityFile /home/username/.ssh/.ghub
    ForwardAgent no

Not that it matters in this case, just sayin'.


In fact, I have a machine specific GitHub specific key.


2048 is enough.


For now. https://www.keylength.com/en/compare/

Why risk it when generating an ed25519 or rsa4096 keypair is cheap?


The same logic would apply to an 8192-bit key. One more bit doubles the key space. Someone who is able to crack 2048-bit keys probably also has the opportunity to crack 4096-bit keys. It may not be cheap for your communication partners to use your 4096-bit key. Smartphones and embedded devices want to use as little energy as possible. With a 4096-bit key, you force your communication partners to spend an unnecessary amount of energy.


> One more bit doubles the key space

Yes, except that a 4096-bit key is not just "one more bit", it's double the number of bits.
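
A quick arithmetic check of the two statements, counting raw bit patterns only. This is a sketch of the counting argument, not a measure of RSA strength: the best known factoring attacks do not search the key space pattern by pattern, so real security grows more slowly than these numbers suggest.

```python
patterns_2048 = 2 ** 2048
patterns_2049 = 2 ** 2049
patterns_4096 = 2 ** 4096

# One extra bit doubles the number of bit patterns.
print(patterns_2049 // patterns_2048)

# Going from 2048 to 4096 bits adds 2048 bits, so the count is squared.
print(patterns_4096 == patterns_2048 ** 2)
```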

> Someone who is able to crack 2048-bit keys probably also has the opportunity to crack 4096-bit keys

No, it would require an impossibly large amount of effort to crack 4096 bit keys compared to 2048 bit keys.

> Smartphones and embedded devices want to use as little energy as possible

They can use ed25519 then.

> With a 4096-bit key, you force your communication partners to spend an unnecessary amount of energy.

They spend more energy by running ad-ridden "apps" and electron monstrosities.


I agree. I used 4096 in the example just in case my great grandkids find this post. They will have Quantum implants.


I wouldn't really say this is significant in an attack sense - you need to reveal your public key to engage in many types of secure communication, SSL throws these around all day as does any private messaging app, PGP, SSH servers, etc. The entire point of a public key is that it is public, password material on the other hand is not, it's meant to be a shared key between you and the other party.

The only bad thing about the GitHub issue there is that a de-anonymization attack is possible as an SSH server will tell you if it accepts a given public key... if, say you had the same SSH key on your GitHub account and a server you wanted to keep private, this could be bad to say the least. And SSH clients offer every id_* key to every server they connect to, so if you connect to an untrustworthy server, even over an anonymity network like Tor your client may offer a key that identifies you (use your ssh config!).


If I remember correctly, an SSH server doesn't share the public keys it accepts. Your SSH agent will try all your keys one by one (you can change this behaviour with a custom config).


Yes, but similarly if you have a given public key you can ask a server if it accepts that key with no need for the private key.

Basically, don't reuse keys in places where you might not want to be identified and use ssh configs to prevent announcing all keys to the world.


> "The only bad thing about the GitHub issue there is that a de-anonymization attack is possible as an SSH server will tell you if it accepts a given public key"

Could you elaborate more on this specific attack?


A quick overview of how SSH key authentication works:

> SSH client: I support key auth

> SSH server: Let's use key auth

> SSH client: Do you take this public key hash: XXXXXX?

> SSH server: Yes I do

or

> SSH server: No I don't

Repeat for as many keys as you like.

You can therefore grab a list of known public keys for a given person and ask a given ssh server if it knows about the given public key. Given a few days you could even scan the entire IPv4 space for servers taking a given public key. Username must match, etc of course, but it's an attack many people might not consider.
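
To show where the information leaks, here is a toy model of just the query step. It is not the SSH wire protocol and makes no network connections; the class, function, server names and key strings are all hypothetical.

```python
class ToyServer:
    """Toy model of the 'do you take this key?' step only."""

    def __init__(self, authorized):
        # Maps username -> set of accepted public keys (opaque strings here).
        self.authorized = authorized

    def accepts(self, username: str, public_key: str) -> bool:
        # Answers yes or no without asking for proof of the private key.
        return public_key in self.authorized.get(username, set())


def probe(servers, username, known_keys):
    """Return (server name, key) pairs that a server admits to accepting."""
    return [
        (name, key)
        for name, server in servers.items()
        for key in known_keys
        if server.accepts(username, key)
    ]


# Hypothetical setup: one key is reused on a server meant to stay private.
servers = {
    "public-git-host": ToyServer({"git": {"key-A", "key-B"}}),
    "private-box": ToyServer({"alice": {"key-A"}}),
}
print(probe(servers, "alice", ["key-A", "key-B"]))
```

Only the public half of the key is needed to get the "yes", which is why reusing one key across contexts links them.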


Expanding this to the tor case that the GGP outlined, even if the server isn't compromised the use of the same key in two very different contexts (for git and for your silkroad command console) reveals that the client is one and the same. I believe just packet sniffing can ascertain this.


Sure, you would still need a Linux userid on that host (I'm assuming that PermitRootLogin is set to No). Although that's probably easy to guess, considering a person's name is often available from GitHub or even the default comment field that ssh-keygen adds.


I'm confused, why would you be disturbed that GitHub publishes every user's public key? This is quite literally the design intention of public keys.


I disagree. The design intention of public keys is not that they should be published along with a mapping to the user's identity, without the user's consent. It's that they may be published, or eavesdropped, without breaking the cryptography itself. See here[0] for the privacy-violating consequences of publishing public keys and identities wholesale.

[0] https://news.ycombinator.com/item?id=10004678


That's not a failure of public key cryptography, that's a failure of the SSH protocol.

As you say, public keys were designed to solve a key distribution problem. Inherent to that problem is the idea that a public key could become, well, public. They solve that problem very well, and there is no intrinsic reason why you shouldn't just publish them because they were intended to be defensible against that very eventuality.

Practically speaking I disagree that GitHub has done anything wrong here - changing habits to diminish the publish-ability of public keys because the SSH protocol exhibits suboptimal behavior encourages further lazy security for the SSH protocol.

We shouldn't tap dance around an SSH-specific problem by claiming that public keys need to be kept secret. That's absurd, we already have private keys. Moreover, it is detrimental to other protocols that rely on publicly verifiable signatures and nonrepudiation to adopt this sort of perspective.


> That's not a failure of public key cryptography, that's a failure of the SSH protocol.

But Github is using public key cryptography as implemented in SSH - if that has a failure, Github should take some blame for not working around it, especially when they are going out of their way to expose data that has little benefit IMO.

Anyway, SSH is orthogonal to one of my points, which, phrased another way, is that publishing the link between two identities (the key itself, and the key-owner's Github profile) without consent or need is unethical because it violates the privacy of the owner. I believe there is precedent in the PGP world (e.g., "I believe it's poor etiquette to upload someone else's key to a keyserver as you deny them that choice."[0])

I sort of get the "detrimental to other protocols" and "lazy security for the SSH protocol" points, but when you talk about publishing public keys, do you acknowledge a difference between "key XYZ is in use on Github" and "key XYZ identifies user ABC on Github"? I'm saying the latter is unwise and unkind, and it would be even if the SSH protocol didn't have this particular failure.

[0] https://stackoverflow.com/a/27254303


If you prefer plain-text format, this is available at:

  https://github.com/<user>.keys
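
A small sketch of what one could do with a line from that endpoint: compute the same SHA256-style fingerprint that OpenSSH prints, which is the unpadded base64 of the SHA-256 of the decoded key blob. The helper names, the user name and the key blob below are placeholders, and each line is assumed to be in the usual "type base64-blob" format.

```python
import base64
import hashlib


def keys_url(user: str) -> str:
    return f"https://github.com/{user}.keys"


def fingerprint(line: str) -> str:
    """Fingerprint one 'type base64-blob [comment]' public key line."""
    _key_type, blob_b64 = line.split()[:2]
    blob = base64.b64decode(blob_b64)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH shows the digest as base64 with the padding stripped.
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")


# Placeholder blob, not a real key.
fake_blob = base64.b64encode(b"not a real key blob").decode()
print(keys_url("someuser"))
print(fingerprint(f"ssh-rsa {fake_blob}"))
```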


Well, but these are the public keys. All you can do is install one on your system and wait for someone to log in using the private part... A lot of people would like to see a working way to get the private key out of a public key :)



