Salted Password Hashing - Doing it Right (crackstation.net)
85 points by junto on July 5, 2012 | 77 comments



Not this again! How can you give advice to use bcrypt/etc. and also suggest the following function?

    function HashPassword($password)
    {
        $salt = bin2hex(mcrypt_create_iv(32, MCRYPT_DEV_URANDOM)); //get 256 random bits in hex
        $hash = hash("sha256", $salt . $password); //prepend the salt, then hash
        //store the salt and hash in the same string, so only 1 DB column is needed
        $final = $salt . $hash;
        return $final;
    }
It's becoming ridiculous.


What is wrong with the above function? It's not obvious to me.


It's using sha256, which is far too fast. Key stretching is essential, especially with something as fast as a digest function from the SHA family. PBKDF2, as the article points out, can be used to increase the cost of brute-forcing, but bcrypt should be considered before opting for PBKDF2 + SHA.

An article [1] that appeared on HN last month (comments:[2]) also explains why just using hashing with a quick digest function is a bad idea, although this original article does a decent job of it (despite the author ignoring his/her own advice).

This article also uses built-in equality tests to compare the supplied hash with the stored hash. This is bad practice, as it is vulnerable to timing attacks. [1] covers this in the Extra section.

[1] : http://throwingfire.com/storing-passwords-securely/?utm_sour...

[2] : http://news.ycombinator.com/item?id=4075873
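To make the two points above concrete (key stretching and a timing-safe comparison), here is a minimal sketch using only Python's standard library. It is not the article's code: the function names and the iteration count are assumptions chosen for illustration, and PBKDF2 stands in for whichever slow scheme you actually pick.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256."""
    salt = secrets.token_bytes(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the key and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # hmac.compare_digest does not stop at the first differing byte the way == may.
    return hmac.compare_digest(candidate, expected)
```

The salt and derived key would be stored together; only the iteration count makes each guess expensive.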


"Key stretching is essential."

Agreed. The article at http://codahale.com/how-to-safely-store-a-password/ is also convincing in this respect.


SHA256 is very fast, which makes it relatively easy to brute-force if someone gets their hands on your hashed password file.


Even with long, unique salts?


Yes. A GPU can calculate more than two billion SHA-256 hashes per second, and adding 32 random bytes to each hashing is not going to slow it down enough to matter. Salting protects against rainbow tables; it's not an effective protection against someone just trying to brute-force guess passwords.
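As a rough back-of-the-envelope check of the figure above, the sketch below computes how long it would take to exhaust a password space at two billion hashes per second. The alphabets and lengths are assumptions picked for illustration; a known per-record salt does not change the arithmetic.

```python
HASHES_PER_SECOND = 2_000_000_000  # GPU rate quoted in the comment above


def seconds_to_exhaust(alphabet_size: int, length: int, rate: int = HASHES_PER_SECOND) -> float:
    """Worst-case time to try every password of exactly `length` characters."""
    return alphabet_size ** length / rate


# Lowercase letters and digits (36 symbols), 8 characters long.
print(seconds_to_exhaust(36, 8) / 60)      # result in minutes
# Printable ASCII (95 symbols), 8 characters long.
print(seconds_to_exhaust(95, 8) / 86400)   # result in days
```

Under these assumptions the first space falls in roughly 24 minutes and the second in about 38 days, for a single GPU attacking a single record.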


This is true. Brute-force attempts simply include the salt, which only adds to the length of the data being hashed. Once they find the string that matches the hash, it is easy to tell which part is the salt and which is the password. People try to get around this with things such as:

md5(sha1(md5(md5(password) + sha1(password)) + md5(password)))

This is not an appropriate way to prevent brute-forcing.

SHA512 is obviously a bit more costly, and therefore harder to brute-force, but if you truly care about security you would be best to use PBKDF2 at minimum (built into Django as standard).


I'm confused. That combination of SHA1 and MD5 is, itself, a hash function, and should be brute-forceable using the exact same methods you would use to brute force any other hash function. It's also easy to GPU-accelerate.


I think the meaning was that combining SHA1 and MD5 is not an appropriate method to circumvent brute-forcing.


Salts do absolutely nothing to make cracking take longer.

The length of the salt might incur 1-2 additional calls to the SHA core permutation function, which is nothing.
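The "1-2 additional calls" estimate can be checked with a little arithmetic. SHA-256 processes 64-byte blocks, and its padding adds one marker byte plus an 8-byte length field. The sketch below counts blocks for a hypothetical 10-character password with and without the 64-character hex salt from the function quoted at the top of the thread.

```python
def sha256_blocks(message_len: int) -> int:
    """Number of 64-byte compression-function calls SHA-256 needs for a message.

    Padding adds one 0x80 byte plus an 8-byte length field, then the total
    is rounded up to a whole number of 64-byte blocks.
    """
    return (message_len + 1 + 8 + 63) // 64


password_len = 10  # hypothetical 10-character password
salt_len = 64      # 32 random bytes, hex-encoded as in the quoted function

print(sha256_blocks(password_len))             # unsalted: 1 block
print(sha256_blocks(salt_len + password_len))  # salted: 2 blocks
```

So the salt costs the attacker one extra compression call per guess in this case.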


I believe that the original comment thought it was bad because the hashed password is stored alongside the salt. But, in practice this is never really a problem, because if they can get your hash, then the salt is usually always recoverable as well...

This is where some people think they are being clever. Because they think to themselves,

"hey, if I keep the salt secret and don't store the salt in the same table, or in the same field, then I've got awesome security by secrecy".

So all they do is hard-code a salt that they reuse for every hash in their application, which offers a lot less security overall for their users.

I have zero problem with storing the salt alongside the hashed password, because in practice, it doesn't make anything less secure.


> I have zero problem with storing the salt alongside the hashed password, because in practice, it doesn't make anything less secure.

Great, now please go and fix your software to use a slow password hashing function!


You quoted me totally out of context (I don't use SHA and you are implying that I do. I didn't spot that in your original paste snippet, and clearly wrongly assumed that you were just referring to the concatenation of hash to salt.)

Nothing that I said was wrong (in fact it's sound advice), so I'm a little shocked at the massive downvotes, your blunt response, followed up with the patronising advice to rewrite all my software.

(And no, I don't use SHA. I didn't spot that in your snippet.)


Any article that says you need to carefully generate cryptographically random long salts shouldn't be trusted. No modern password cracking technique relies on predicting salts. Think about this for just a second and you'll see why.


Another thing to remember is if the user changes the password be sure to invalidate any logged in sessions. You don't want the logged in sessions to act as a backdoor into the system.

If you store the session key then delete them.

If you calculate a session key on the fly (as a hash) then mix the hashed password into your session key, so if the password changes (even if it changes back to what it was) the session key will be different.
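One way to read the "calculate on the fly" variant is an HMAC over the user ID and the stored password hash, keyed with a server-side secret. This is only a sketch: the secret, the function names and the message layout are assumptions, not anything the comment specifies.

```python
import hashlib
import hmac

SERVER_SECRET = b"replace-with-a-real-secret"  # hypothetical application-wide key


def session_token(user_id: str, stored_password_hash: str) -> str:
    """Derive a session token that changes whenever the stored hash changes."""
    message = f"{user_id}:{stored_password_hash}".encode("utf-8")
    return hmac.new(SERVER_SECRET, message, hashlib.sha256).hexdigest()


def session_is_valid(token: str, user_id: str, stored_password_hash: str) -> bool:
    """Check a presented token against the one derived from current data."""
    expected = session_token(user_id, stored_password_hash)
    return hmac.compare_digest(token.encode("utf-8"), expected.encode("utf-8"))
```

Note that the token only changes when the password is set back to its old value if a fresh salt is generated on every password change, so that the stored hash differs.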


"In step 4, never tell the user if it was the username or password they got wrong. Always display a generic message like "Invalid username or password." This prevents attackers from enumerating valid usernames without knowing their passwords."

What prevents the same enumerating attack against the sign up form. Are you going to give them a generic message that the username is invalid when it in fact has been taken?

Also, the article implies that you need to use a salt but then recommends bcrypt, which already includes a salt.

Good read on how passwords are attacked.


This is a point where you'll have to judge the security considerations against usability considerations. As a user I might end up hammering a web service with correct passwords but incorrect usernames. As a user knowing that I'm using the wrong username would be useful and keep me from getting frustrated.

A better solution would be to rate limit incorrect username guesses. It's highly unlikely that a user is going to try more than a dozen usernames/emails - so that's a strong signal that someone is trying to leak username information from your database.


It seems to me that a web application should not assist in its own hacking by allowing automated high-speed form submissions.

Why should a form that accepts human input accept input much faster than a human can generate it? Limiting form submissions to about one every second per IP should greatly reduce the value of brute force attacks without being perceptible at all to actual users.

If you're worried that will be too slow for your users, make it a tenth of a second. That should still be far too slow for enumeration or other brute force techniques to be worthwhile.
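A minimal in-memory sketch of the per-IP limit described above might look like this. The class name and the injectable clock are assumptions for illustration; a real deployment would likely need shared storage and eviction of old entries.

```python
import time


class PerClientThrottle:
    """Allow at most one request per `min_interval` seconds for each client key."""

    def __init__(self, min_interval: float = 1.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock  # injectable so the behaviour can be tested
        self.last_seen: dict[str, float] = {}

    def allow(self, client_key: str) -> bool:
        now = self.clock()
        last = self.last_seen.get(client_key)
        if last is not None and now - last < self.min_interval:
            return False  # too soon; reject without resetting the window
        self.last_seen[client_key] = now
        return True
```

A form handler would call `allow(ip_address)` before processing a submission and return an error when it yields `False`.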


Surely after say, three invalid attempts, your app should lock the account and/or send an email and/or present a CAPTCHA?


Sure, that would be another way to stop automated enumeration attempts.


"What prevents the same enumerating attack against the sign up form. Are you going to give them a generic message that the username is invalid when it in fact has been taken?"

This is a good point, but I suppose that, all other things being equal, it's better to put the key under the mat than it is to put it on top of it. IOW, the more sophistication your attacker needs, the better.


"What prevents the same enumerating attack against the sign up form. Are you going to give them a generic message that the username is invalid when it in fact has been taken?"

There are, more frequently, CAPTCHAs on registration forms.


Immediate username availability checks (via ajax) are fairly common, and they bypass captchas.


> What prevents the same enumerating attack against the sign up form.

Good point. Nothing prevents this, but it is easier to detect this kind of abuse on the sign up form and alert on it than all of the noise on the sign-in side.




Salted password hashing plain and simple:

Rule 1: Use a Modern Hash Algorithm (bcrypt, PBKDF2, scrypt)

Rule 2a: Use a Long Cryptographically Random Per-User Salt

Rule 2b: Have an additional 'system' salt that is a fixed value for the entire system and it's not stored in the database (better hide it in the source code)

Rule 3: Iterate the hash

Source: https://www.owasp.org/index.php/Password_Storage_Cheat_Sheet

------

For ASP.NET the only proven option for rule 1 is PBKDF2 which is builtin (http://msdn.microsoft.com/en-us/library/system.security.cryp...)


That cheat sheet seems to have been written on the assumption that you're rolling your own, which is not a great idea. If you use Bcrypt, it handles salting for you, and you don't have to iterate it, since it has a tuneable work factor that you can tweak to make it faster or slower as your needs and Moore's Law dictate.


Regarding Rule 2b, why did the author keep repeating "store the salt in the database along with the key"? Why would you want to do this? A cracker would have access to your hash and your salt. Wouldn't it be trivial for him to append the salt and then do a regular old lookup table on common passwords?

At least if the salt is within the source code, it's hidden from plain view. Or did I miss something?


The salt does not need to be hidden. First of all, it needs to be unique per hashed item (password), so you cannot store it in code. Second, its purpose is to force any attacker trying to use lookup tables to calculate one lookup table per password. http://en.wikipedia.org/wiki/Salt_%28cryptography%29

Edit: also, instead of using a "system salt", why not use an HMAC to replace the hash function?
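The "one lookup table per password" point can be seen in a few lines. The sketch below deliberately uses fast SHA-256, purely to illustrate what a salt does; it is not a recommendation for storage, and the helper name is an assumption.

```python
import hashlib
import secrets


def salted_hash(password: str, salt: bytes) -> str:
    # Fast SHA-256 is used only to illustrate salting, not as a storage scheme.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


salt_alice = secrets.token_bytes(16)
salt_bob = secrets.token_bytes(16)

# Same password, different salts: the stored values differ, so a table
# precomputed for one salt is useless against the other record.
print(salted_hash("password1", salt_alice) == salted_hash("password1", salt_bob))
```

The salt is not secret here; its only job is to make each record need its own table.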


Doing it right, in most cases, means not doing it at all. You shouldn't be thinking about hash functions and salts. Instead you should use a cryptosystem designed and implemented by experts.

DO NOT DO CRYPTO YOURSELF


Valid point, but what if you don't know who the experts are?

You're preaching to the choir on HN when it comes to PBKDF and Bcrypt and Scrypt and all that; but outside of our circle, people will consider anything documented on the net to be potentially expert advice.

Check this for a bunch of custom password hashing functions: http://php.net/manual/en/function.sha1.php


The problem is you still have to go out and successfully identify which cryptosystem was designed and implemented by experts and is appropriate for your use case.


Very right. A lot of people do something naive like just hashing passwords with a standard cryptographic hash function, figuring that they're designed and implemented by experts, and they end up with a careful implementation of the wrong algorithm.

Of course, in the case of password hashing, the answer is pretty easy. (Spoilers: scrypt if there's an easy library for your language of choice, bcrypt otherwise, and PBKDF2 if you need to justify your decision to someone who habitually wears a tie.)


"Never send the user a new password over email."

This is what HN does right now. I know, because I forgot my password just yesterday.


> Every time a user creates an account or changes their password, the password should be hashed using a new random salt. Never reuse a salt.

Why not? The salt is still unique to that user. What's the benefit to changing it?


I use the hashed password as input in order to create short lived hashes for other purposes.

By changing the salt, you change the hashed password. So when the user changes their password it actually changes, and those other hashes change too.

If you kept the salt the same then if the user changes the password back to what it was nothing actually changes.


There's zero cost to changing it, and makes any previous work on computing hashes for that user useless.


"Prepend the salt to the password and hash it with a standard cryptographic hash function such as SHA256."

No! At the very least, you ought to be using an HMAC.


That's potentially dangerous advice, because it may lead someone to use an HMAC and think themselves safe, instead of using something like Bcrypt that's slow enough to foil brute force attacks.


Does anyone else use either of these methods?:

1. Use memcached or asp.net cache to detect if a high number of login attempts are happening, if so, implement a 1 to 2 second sleep on each login attempt for 15 minutes (with some additional per IP slowing).

2. Put a sleep for 500ms on all login attempts.

I've been doing both for a while, with the thought that they are effective methods in conjunction with proper hashing.


This protects against people who are attacking your app through its public login interface, but when a file containing your hashed passwords gets stolen the thieves will attack the passwords directly.

http://codahale.com/how-to-safely-store-a-password/


This sounds like a reasonable addition to your security arsenal, and it should help to ensure that your own app is not being abused to brute-force your users' passwords. However, it doesn't provide you any protection if your database is compromised, and that is really when your hash security is most important.

You do mention this in addition to "proper hashing", so I'm sure you recognize this as well, but I think it is important to emphasize secure hashing practices before any talk of securing your app's login endpoint itself.


They aren't bad ideas, but unless your hashing method is intrinsically slow your delays will be bypassed in the event of a database leak.


This article left out one important item, in my opinion. You should also have a way of changing what algorithm is used in doing the hashing on the fly. Perhaps a "HashVersion" column so that you can easily migrate users from an older (possibly broken) algorithm, to a newer one as soon as they log in again.


That's easily handled by, say, a column in a database table. Or just use something that's very unlikely to need replacing, like bcrypt. As far as I know, the only people who really needed to switch password hashing schemes in a hurry were people who chose laughably broken ones to begin with. (Hint for people who don't know this yet: salting and SHA256ing is an example of such a laughably broken password hashing scheme, so you can save yourself the hassle of migrating away from it by not using it in the first place.)


Doing a column in the database is exactly what I suggested... Also, saying that anything is "very unlikely to need replacing" is extremely short sighted of you. And the point is not to necessarily do it "in a hurry" but rather to identify a mechanism for doing it at all, in a way that doesn't require resetting every user's password.


I misspoke -- I meant to say "very unlikely to need replacing soon." Look at SHA-1, for an example of this sort of thinking -- there have been some security problems found with it, and SHA-2 is recommended for anything new, and there's a contest to determine what algorithm will be SHA-3, but people who are using SHA-1 are not in dire peril. It's the "walk, don't run, to the nearest exit" approach to security. You pick something strong and hope that, if attacks are found against it, this process will be gradual enough that you'll have plenty of time to build the infrastructure needed to support migration, and migrate.

I just think that starting out planning for password hash function migration, when it's easy to retrofit later, is a bad case of premature optimization.


I do not see this as premature optimization. Do you store timezone information with your dates even if you are not operating in a different time zone? Do you store measurement information with a unit even if you love inches? I would say that it is certainly planning ahead, but it is not a premature optimization. It is planning for the all-too-likely event that you will need to start hashing your passwords in a different way.

"when it's easy to retrofit later" - I think this is the key part of your statement. When is it easy to retrofit later? You then have to pick your poison:

1. Switch the entire system over to the new hashing function - reset all users' passwords

2. Add in interoperability of hashing functions - what I'm suggesting you do from the beginning, making it much easier to do.

Number 1 is a horrible user experience, number 2 is much easier to do from the onset.


> When is it easy to retrofit later?

There is a third way that is not poison:

No need to reset the passwords at one fell swoop.

When you decide to do a new password storage function:

New users get the new hash right away.

Calculate the new hash and the old hash when the user next logs in.

If the old hash matches, the user has logged in, and now calculate the new hash and store it over the old hash in the table. Perhaps make a note in a separate column that the hash has been converted.

Eventually, when all users have refreshed their passwords, quietly remove the old way.

Zero user involvement.
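The steps above can be sketched roughly as follows. The record layout (a dict with `scheme`, `salt` and `hash`), the scheme labels and the choice of PBKDF2 as the "new" function are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

PBKDF2_ITERATIONS = 200_000  # assumed work factor


def legacy_hash(password: str, salt: bytes) -> bytes:
    """The old scheme being retired: a single salted SHA-256."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


def new_hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS)


def login(record: dict, password: str) -> bool:
    """Verify against whichever scheme the record uses; upgrade legacy records."""
    hasher = legacy_hash if record["scheme"] == "legacy" else new_hash
    if not hmac.compare_digest(hasher(password, record["salt"]), record["hash"]):
        return False
    if record["scheme"] == "legacy":
        # The plaintext is only available now, so rehash it under the new scheme.
        record["salt"] = secrets.token_bytes(16)
        record["hash"] = new_hash(password, record["salt"])
        record["scheme"] = "pbkdf2"
    return True
```

The `scheme` field plays the role of the "note in a separate column" mentioned above.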


When I say "Add in interoperability of hashing functions", that is pretty much exactly what I mean. I realize I left out the specific method for converting the hashes over. And your separate column for when the hash has been converted is the equivalent of my suggestion of a HashVersion column. Also, you will likely never be able to "quietly remove the old way" because not all of your users are guaranteed to log in again. But yes, the idea would be to verify against the old hash and save the new hash when the user logs in.


I'm not sure how number 2 actually helps with security. If the hashing method is deemed insecure enough to stop using, would you not want to update all your users passwords stored in db to using a newer method? One method of doing so without having to reset passwords was posted: http://news.ycombinator.com/item?id=4083883 .


I would only say that you would want to immediately reset all user passwords if the passwords were leaked, not necessarily if the algorithm that you are using is bad for whatever reason. And the idea would be to give users a couple of weeks (or days) to log in and then force the reset on all the remaining ones (maybe once you get to a certain percentage of your active user base).

I like the method that link provided, but there are some drawbacks: it needs to update every user record with a new hash (an offline process), which is almost guaranteed to require taking the site down, and most people do not like to do that. This is because you can't have some users on the old hashing process and some on the new.


The nice thing about bcrypt is that it stores the work factor and salt and hash all in one value, so you can very easily migrate to harder work factors. I have seen similar schemes used to store hashes from a variety of algorithms, you can simply have something like bcrypt$workfactor$salt$hash stored in the password field.
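A self-describing value of the kind described above can be sketched with the standard library. PBKDF2 stands in for bcrypt here, and the exact field layout (`pbkdf2_sha256$iterations$salt$hash` with base64 fields) is an assumption for illustration, not the format of any particular library.

```python
import base64
import hashlib
import hmac
import secrets


def encode_hash(password: str, iterations: int = 200_000) -> str:
    """Produce 'pbkdf2_sha256$iterations$salt$hash' with base64-encoded fields."""
    salt = secrets.token_bytes(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    b64 = lambda raw: base64.b64encode(raw).decode("ascii")
    return f"pbkdf2_sha256${iterations}${b64(salt)}${b64(key)}"


def verify(password: str, encoded: str) -> bool:
    """Read the scheme, cost and salt back out of the stored value and check."""
    scheme, iterations, salt_b64, key_b64 = encoded.split("$")
    if scheme != "pbkdf2_sha256":
        raise ValueError(f"unsupported scheme: {scheme}")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(key_b64)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, int(iterations))
    return hmac.compare_digest(key, expected)
```

Because the cost is read from the stored value, records created with different work factors can coexist in the same column.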


Yes, storing the algorithm used in the same column would be fine as well. But if you drop the "bcrypt$" from your example, you would still run into issues if bcrypt itself is broken.


Isn't that simply giving the attacker all the info they need to start cracking these passwords with a dictionary attack?


Sure - but the idea of salting is to make the dictionary too time consuming to create. The security is NOT in how secret your methods are (it rarely is).

Additionally, each salt is still unique per password, so the attacker would need to generate a full dictionary per record that they want to crack - generally not worth it.


I was thinking more along the lines of just running the top 100 passwords through each user


Salting is no replacement for strong passwords, this would work against most any salting scheme.
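To illustrate why the salt does not help here, the sketch below runs a short list of common passwords against each salted record. The four-entry list is a stand-in for a real top-100 list, and fast salted SHA-256 is used as the scheme under attack; every name is hypothetical.

```python
import hashlib

COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein"]  # stand-in for a top-100 list


def fast_salted_hash(password: str, salt: bytes) -> str:
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def crack(records: dict[str, tuple[bytes, str]]) -> dict[str, str]:
    """Try each common password against each (salt, hash) record."""
    found = {}
    for user, (salt, stored) in records.items():
        for guess in COMMON_PASSWORDS:
            if fast_salted_hash(guess, salt) == stored:
                found[user] = guess
                break
    return found
```

The attacker pays for one hash per guess per user, which is trivial with a fast digest; a slow function raises that cost, but a password on the list still falls eventually.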


Not if part of the information is kept in code only, like iteration count on bcrypt


There is a chance that if your DB is compromised, your code is as well. Additionally, what if you want to change your work factor? How would you handle doing that? If you upgrade your server environment and then all of a sudden realize that your hashing algorithm only takes .1 seconds, when it used to take .5, you might want to change it.
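If the work factor is stored with each hash, as in the `bcrypt$workfactor$salt$hash` style mentioned earlier in the thread, raising it becomes a per-record check at login. A minimal sketch, assuming a `scheme$iterations$salt$hash` layout and a hypothetical target value:

```python
CURRENT_ITERATIONS = 400_000  # hypothetical target after a hardware upgrade


def needs_rehash(encoded: str, target: int = CURRENT_ITERATIONS) -> bool:
    """True if a 'scheme$iterations$salt$hash' value was made with a lower cost."""
    _scheme, iterations, _salt, _hash = encoded.split("$")
    return int(iterations) < target
```

After a successful login, a record for which this returns `True` could be rehashed with the new cost while the plaintext is still in hand.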


While that may slow down an attacker in some circumstances, a sufficiently secure password scheme will still be secure with total knowledge of the system available to the attacker. See also: security through obscurity.


The phrase "A brute-force attack tries every possible combination of characters up to a given length. These attacks are very computationally expensive" suggests that the author has not spent any quality time with JTR (John the Ripper). "Computationally expensive" is hardly the right description when you can test billions of candidate passwords per second, which is what you can get with sha* and salts.


A question for the people here who do know: Coda Hale says "Use bcrypt. bcrypt. bcryptbcryptbcryptbcrypt", and appears to get a lot of support from those in the know. Right, got the note, makes sense. Now, does PBKDF2 count, too? I do a lot of .NET and it's built-in and "verified" (whatever that means) there.


PBKDF2 is fine. It's not as good as bcrypt, and since bcrypt libraries are trivial to use and pretty ubiquitous most people should probably just use one, but PBKDF2 is okay.


So would this be a decent way to store the resulting hash?

[method]$[salt]$[hash]

sha256$fi93heyf789s2hfk$j2398fdperoc983m4n58djs20

Also, I like how Django's authentication can cycle through a list of schemes making it easy to switch or accommodate legacy accounts.


The best way to store a password hash -- and I apologize for belaboring this point, but it bears repeating -- is to take whatever string you get from bcrypt or scrypt, and store it somewhere reliable with quick lookups, like a database with regular backups and replication. There are libraries that handle all the details of hashing a password, adding salt, making the computation too slow to easily brute force, and putting all this behind a trivial-to-use API. Use them!


Well, the article mostly discusses non-bcrypt solutions so the question seems reasonable. I suspect bcrypt is in limited use because it's not built-in to any of the development environments.


When I posted this, my intention wasn't to compare this against bcrypt as an appropriate method of storing passwords. I was more struck by the fact that the author had rather eloquently described both good and bad ways to salt and hash a password within the confines of the standard .NET framework. Notably, the bog standard, out-of-the-box Membership provider in ASP.NET uses the same algorithm as far as I am aware (while that doesn't make it the best, for many it may suffice their needs).

Although there is an open source BCrypt port to .NET from Java, it hasn't been verified in terms of its implementation as a third party library, and to do so costs bucks.

Therefore, the recommendation for .NET for increasing the compute factor is to use PBKDF2 instead of bcrypt since it is baked into the framework. It doesn't mean that is better, but if you are doing government work, then they will prefer you use a verified implementation, thus PBKDF2.


Do you realize that the phpass framework you recommend uses $2a$, even though you say not to use that?


For the record, it is trivial to change PHPass to use $2y$ instead of $2a$. That said, this does require extending the PHPass class, so it is not ideal. Still, it would be better to extend or modify the class than it would be to simply try to roll your own (if those were your only two options).


Just to be clear, I didn't write the article, and I didn't recommend anything. If anything, it was posted as a discussion piece.


Well then the article is saying don't use $2a$ and then saying "oh but use this framework which also uses $2a$", which is terribly inconsistent. It should discuss when and where $2a$ could potentially be a problem.


Full steam ahead on the critique. I wanted to point out that you had no need to start bashing me about it!


Why would anyone recommend using bcrypt and then say to not use bcrypt ($2a$)?


Because you are technically supposed to use $2y$ which ensures you are not using the broken implementation that was fixed in 2011. Of course if you are using a newer php installation $2a$ will be identical to $2y$ so it's just a disambiguation, just as $2x$ is used to identify hashes generated using the broken implementation.



