Surely just hashing the username|password would massively reduce the effectiveness of leaks like this? Sure, a hacker would know what the "salt" is, but since it now varies between users, you would expend the same amount of effort breaking one person's login as you previously would have spent breaking everyone's (on average).
(Not recommending it, just wondering if my reasoning is correct.)
I hear this commonly, so it is a good idea to clear it up.
Usernames have lower entropy than a random salt and are predictable in many cases. People re-use usernames and some usernames are common. If your password system became common on the web, or if I knew the workings of your password system (i.e. open source / leaked codebase / Kerckhoffs's principle[1]), I could generate a rainbow table for either common or targeted users. This means I could generate a rainbow table for "Jabbles", gain access to your password and compromise your account before the website is likely even aware of a breach or has time to warn you. Salts only act to slow down, not prevent, compromising leaked password hashes (as you can always brute force which is quite practical with MD5/SHA1). Thus, using a username defeats one of the stated purposes of salting.
It's also said ad nauseam (with good reason) but rolling your own in security is a bad idea, especially when libraries exist that do exactly what you'd intend to do just as easily. Algorithms such as bcrypt and scrypt exist and are well vetted. bcrypt is easy to integrate with many languages and provides a trivial interface and sane defaults for iterations/rounds [brute force] and salts [rainbow table]. bcrypt can also handle increasing the security of your system over time as the metadata is stored as part of the hash.
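To make the "metadata is stored as part of the hash" point concrete, here is a minimal sketch of the idea using only Python's standard library. It is not bcrypt and not bcrypt's storage format: the `algorithm$iterations$salt$digest` layout, the function names and the iteration count are all assumptions made up for illustration, with PBKDF2 standing in as the slow hash. In a real system you would let a vetted library do this for you.

```python
import hashlib
import hmac
import secrets

ALGORITHM = "pbkdf2_sha256"   # label stored with every hash (hypothetical format)
CURRENT_ITERATIONS = 200_000  # assumed work factor; tune for your own hardware


def hash_password(password: str, iterations: int = CURRENT_ITERATIONS) -> str:
    """Return 'algorithm$iterations$salt_hex$digest_hex' with a fresh random salt."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"{ALGORITHM}${iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the digest using the parameters embedded in the stored record."""
    algorithm, iterations, salt_hex, digest_hex = stored.split("$")
    if algorithm != ALGORITHM:
        return False
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


def needs_rehash(stored: str) -> bool:
    """True when the record was created with a weaker work factor than today's."""
    return int(stored.split("$")[1]) < CURRENT_ITERATIONS
```

Because the work factor travels with each record, old hashes keep verifying after you raise `CURRENT_ITERATIONS`, and `needs_rehash` tells you which users to re-hash the next time they log in successfully.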
tl;dr Using a username for salting means a targeted attack against a single or small number of users would be damn near impossible to stop as the second they have the password hashes they also have the passwords.
Often people say "Don't roll your own security" but the reality is that developers aren't trying to roll their own. They are trying to solve a problem, and if a quick google doesn't turn up a good library then they'll try and figure it out. Googling for password security implementations is likely to be fraught with horrible horrible advice.
I guess what I'm saying is that it's not enough to say don't do it, instead the defaults need to be there (and very visible).
I think we've reached a point with bcrypt that a good secure password system is within reach and comes with sane defaults and ease of use as features for most programming languages.
If it's just an issue of getting the word out there, then I'm hopeful things can improve.
You need more than just bcrypt. You've hinted at other things, but a few random things popping into my mind:
* Preventing password logging (many web frameworks log parameters)
* Secure password recovery
* New alternative attack vectors (eg. Facebook, Twitter auth)
* XSS and CSRF
There are so, so many simple-to-make security errors, and worse, many of them are inter-related so that forgetting one will make another vulnerable. This is why you need safe defaults and more security education.
A strong password hash doesn't gate on any of those things, so, while you do indeed need to pay attention to them, you don't need to pay attention to them before you deploy a strong password hash.
You should deploy a strong password hash immediately.
True point and this is probably off topic, but out of curiosity, what is the recommended approach for his point about logging messages/requests?
On previous projects, we've gone through all sorts of machinations to detect a password in our SOAP logging. This usually involves XML parsing (slow, ineffective on malformed messages) and Regexes (ineffective on malformed or "unusual" messages).
I can't think of anything better, short of "you can't leak what you don't log" which is nice in theory but not always practical.
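One common approach, sketched below, is to filter by parameter name after the request has been parsed and before it reaches the logger. This is only a rough illustration: the key names in `SENSITIVE_KEYS`, the `[FILTERED]` marker and the `redact` helper are assumptions, and it does nothing for malformed messages that never parse into a structure in the first place.

```python
import logging

SENSITIVE_KEYS = {"password", "passwd", "pwd", "secret", "token"}  # assumed names


def redact(params):
    """Return a copy of params with values of sensitive-looking keys masked.

    Recurses into nested dicts and lists; the input is left unmodified.
    """
    if isinstance(params, dict):
        return {
            key: "[FILTERED]" if str(key).lower() in SENSITIVE_KEYS else redact(value)
            for key, value in params.items()
        }
    if isinstance(params, list):
        return [redact(item) for item in params]
    return params


logging.basicConfig(level=logging.INFO)
request_params = {"user": "Jabbles", "password": "hunter2", "profile": {"pwd": "x"}}
# Only the redacted copy is ever handed to the logger.
logging.info("login attempt: %s", redact(request_params))
```

Matching on key names rather than scanning values is the design choice here: it is cheap and predictable, at the cost of missing passwords that arrive under an unexpected name.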
The defaults exist: bcrypt and PBKDF2. There is no excuse for anyone to do anything less than salted hashes, even if they decide not to use bcrypt or PBKDF2.
Having a password salted with the username fairly easily balloons out the complexity of building and searching a rainbow table by a factor of the number of usernames you want it to be useful for. This factor is larger than you'd expect, given the sheer quantity and variety of usernames in various systems.
For a targeted attack it really doesn't matter, as the time complexity of producing the rainbow table is equivalent to that of simply brute forcing the hash, i.e., you can't say "well, assume the rainbow table contains only some small number of usernames"...
It also is entirely unlike the WPA2 rainbow tables in that you don't have millions of users all sharing the same username (ie. factory default SSIDs).
Overall it's more secure than it seems at first glance, but you still have to ask yourself why you'd use that over a random salt.
The targeted attack does matter though, for the reason I pointed out above.
I can produce a rainbow table offline before I compromise the targeted system as I know the username of my target. This is not possible if the salt is random. This means I can crack a targeted user's password hash _instantly_ upon gaining access to the system.
With a random salt, you can only perform the brute force attack on that targeted user _after_ you've gained access to the system and likely alerted them to a compromise.
If the response time of the compromised system and team is a factor, this means using a username as a salt compromises your security greatly.
tl;dr Using a username for salting means a targeted attack against a single or small number of users would be damn near impossible to stop as the second they have the password hashes they also have the passwords.
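The timing argument above can be shown with a toy example. Everything here is illustrative: the five-entry guess list is hypothetical, SHA-1 is used only because it keeps the demo fast, and the `salt|password` layout is an assumption about how the target site combines the two.

```python
import hashlib
import secrets


def salted_sha1(salt: bytes, password: str) -> str:
    # Fast hash (SHA-1) used only to keep the demo quick; not a recommendation.
    return hashlib.sha1(salt + b"|" + password.encode("utf-8")).hexdigest()


# Hypothetical guess list; a real attacker would use a far larger dictionary.
candidates = ["123456", "password", "letmein", "hunter2", "qwerty"]

# Offline, before any breach: the salt is the known username, so the table can be built now.
username = "Jabbles"
precomputed = {salted_sha1(username.encode("utf-8"), pw): pw for pw in candidates}

# After the breach, username-salted hash: recovery is a single dictionary lookup.
leaked_username_salted = salted_sha1(username.encode("utf-8"), "hunter2")
recovered = precomputed.get(leaked_username_salted)

# After the breach, random-salted hash: the precomputed table does not match,
# so the guessing work can only start once the salt has been stolen too.
random_salt = secrets.token_bytes(8)
leaked_random_salted = salted_sha1(random_salt, "hunter2")
missed = precomputed.get(leaked_random_salted)
```

In both cases the password is equally guessable in the end; the only difference is whether the expensive part can be done before the breach or has to wait until after it.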
1) You know the hash function beforehand
2) You know that they are salting in exactly this way
3) You know how they are doing their salting (HMAC vs., vs.)
4) You have enough time to create this new rainbow table
5) You have only just enough access to the system to dump the hashes (ie. the easier routes are blocked off from you)
That would in fact, with some probability (based upon the complexity of your rainbow table and the complexity of the users password), give you the passwords for a particular set of users.
I did say that it was more secure than it seems, not that it was perfectly secure :)
While not entirely random, would a "date based" salt work as well? Say, the date that the entry was added? This would still negate rainbow tables as a specific user entry needs to be targeted.
It would probably work well enough, but... why not just add a proper random salt field that isn't tied to anything an attacker could guess? Is something like 8 bytes per user too expensive?
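For a sense of how little is involved, here is a rough sketch of a per-user random salt field using Python's standard library. The record layout, the `new_user_record` name and the iteration count are assumptions for illustration; PBKDF2 is simply the slow hash that ships with `hashlib`.

```python
import hashlib
import secrets

ITERATIONS = 100_000  # assumed work factor for the sketch


def new_user_record(password: str) -> dict:
    """Create what would be stored for one user: an 8-byte random salt plus the hash."""
    salt = secrets.token_bytes(8)  # 8 bytes per user, drawn from the OS CSPRNG
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return {"salt": salt, "hash": digest}


# Two users who picked the same password still end up with different stored hashes.
alice = new_user_record("correct horse")
bob = new_user_record("correct horse")
```

Unlike a signup date, the salt here cannot be narrowed down by anyone who knows roughly when the account was created.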
Remember salts don't need to be secret to do their job. The goal is to change the algorithm slightly (by adding additional input) for each user. That means you can't mass-precompute (rainbow tables), and just look up what matches, you have to break each user individually.
Your reasoning about how salts work is correct.
There's also something called a pepper which is another additional bit of input data, that is only stored in the app code (fixed for entire app). So an attacker who only manages to get a database dump would need to guess yet another chunk of data (making it near impossible). So a well-seasoned hash would be SLOW_HASH(pepper+salt+password).
Security is all about layers. Each layer protects a bit more, or prevents things from being easy for the attacker.
Edit: Don't do this yourself. Know it for the theory part - but then just use a well-vetted library to do it.
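Purely for the theory part, a sketch of what SLOW_HASH(pepper+salt+password) could look like. Assumptions to flag: the pepper value is a placeholder, PBKDF2 stands in for SLOW_HASH, and the pepper is mixed in with HMAC rather than plain concatenation, which is my substitution and not something stated above.

```python
import hashlib
import hmac
import secrets

# Hypothetical pepper: in practice loaded from app config or a secrets store, never the database.
PEPPER = b"example-pepper-value-not-for-real-use"
ITERATIONS = 100_000  # assumed work factor for the sketch


def seasoned_hash(password: str, salt: bytes, pepper: bytes = PEPPER) -> bytes:
    """Slow hash over pepper, salt and password, with HMAC doing the pepper mixing."""
    # Step 1: key the password with the app-wide pepper.
    peppered = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    # Step 2: run the slow, per-user-salted hash over the result.
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, ITERATIONS)


salt = secrets.token_bytes(16)       # stored in the database next to the hash
stored = seasoned_hash("hunter2", salt)
```

An attacker holding only the database dump has `salt` and `stored` but not `PEPPER`, so even correct password guesses will not verify until the app-side secret is obtained as well.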
Please refer to my comment above. You can precompute a rainbow table if you know the username (trivial) and the method of hashing[1]. Whilst usernames as salts would increase security over no salt, it results in a potential exploit / vulnerability that would not exist if the salt was truly random. Hence, suggesting the use of usernames as salts is not wise.
I read cschneid's comment twice, and nowhere do I see where he or she specifically recommends using the username as a salt; he or she simply recapitulates the logic behind using a unique salt value for each stored hash, and describes using an additional non-unique value which is not stored with the passwords ("pepper"), which is a new and interesting idea, at least to me.
Re: pepper - The devise plugin for Rails uses it. The idea is that the attacker must now steal both the app code AND database, which are often on separate servers.
It would make it a lot easier for LinkedIn to identify whose hashes were leaked because with a salt, all passwords would be unique. It would also make rainbow tables useless.
But in this day and age, the bigger problem is how fast you can compute the hashes, salt or no. With GPUs you can calculate a few hundred million (depending on the hashing algorithm) per second, making the algorithm used the real vulnerability.
Best practice involves increasing the calculation time of your algorithm. Theoretically, you could just rehash a few thousand times in a loop, throwing in a salt here and there, but practically, you should just use bcrypt or scrypt.
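To put rough numbers on that, here is a back-of-the-envelope calculation. The 3e8 figure is just one reading of "a few hundred million", and the 1e4 rate against a slow hash is an assumed number for illustration, not a benchmark of bcrypt or scrypt.

```python
def seconds_to_exhaust(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Worst-case time to try every password of exactly `length` characters."""
    return alphabet_size ** length / guesses_per_second


FAST_RATE = 3e8   # "a few hundred million" guesses/second against a fast hash
SLOW_RATE = 1e4   # assumed rate against a deliberately slow hash; purely illustrative

fast = seconds_to_exhaust(26, 8, FAST_RATE)   # every 8-letter lowercase password
slow = seconds_to_exhaust(26, 8, SLOW_RATE)

print(f"fast hash: {fast / 60:.1f} minutes")
print(f"slow hash: {slow / 86400:.0f} days")
```

Under these assumptions the whole 8-letter lowercase space falls in about 12 minutes with the fast hash and takes roughly 8 months with the slow one. The salt changes neither number; only the cost per guess does.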
In a password hashing scheme with a salt, you're supposed to consider everything except the cleartext to be public, for the purposes of analysis. The password should be unrecoverable even if the attacker knows the algorithm and any salts.
It's true that that would be an improvement, however we try to avoid discussing things like that seriously because of the risk that someone new to the game will actually try to do it. The easy answer is to use an out-of-the-box secure password strategy, anything else is adolescent.