It's a good system, especially compared with the current best practice of simply hashing passwords with bcrypt and calling it a day.
I can't recall the details off the top of my head, but Facebook has a similarly impressive system with more secret sauce involved for performance at scale. I believe what they do is the following:
1. Hash the password with MD5(password).
2. Generate a 20-byte (160-bit) random salt (this is well over the 64 bits you'd need to defend against birthday attack collisions).
3. Hash with hmac_sha1(hash, salt).
4. Send this value to a separate server for further operations (mitigates offline brute-forcing).
5. Hash in a secret key with hmac_sha256(hash, secret). Note this operation happens on the separate server. The secret key might be colloquially termed a "pepper".
6. Hash with scrypt(hash, salt) to make local computation slower.
7. Shrink the final value with hmac_sha256(hash, salt) for efficient database storage.
If any Facebook engineers are around, please correct me if I've missed or misinterpreted any part of that.
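To make it concrete, here's a rough Python sketch of those seven steps as I understand them. Which argument is the HMAC key, the scrypt cost parameters, and exactly where the hand-off to the separate server happens are my guesses, not Facebook's actual code:

```python
# Rough sketch of the seven steps above. The HMAC key/message ordering,
# the scrypt cost parameters, and the remote-call boundary are assumptions.
import hashlib
import hmac
import os

def onion_hash(password: bytes, pepper: bytes) -> tuple:
    h = hashlib.md5(password).digest()            # 1. legacy MD5 layer
    salt = os.urandom(20)                         # 2. 20-byte random salt
    h = hmac.new(salt, h, hashlib.sha1).digest()  # 3. HMAC-SHA1, salt as key
    # 4./5. In production this would be a call to a separate service that
    # holds the secret "pepper"; it is inlined here for illustration only.
    h = hmac.new(pepper, h, hashlib.sha256).digest()
    h = hashlib.scrypt(h, salt=salt, n=2**14, r=8, p=1, dklen=64)  # 6. slow step
    h = hmac.new(salt, h, hashlib.sha256).digest()  # 7. shrink to 32 bytes
    return salt, h
```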
The current best practice of simply hashing passwords with bcrypt is fine, and anything past that which doesn't involve (probably per-password) use of an HSM adds only marginal value.
I wouldn't want anyone to read this and get the impression that "secret peppers" and multiple hashing rounds and HMAC were important components of a password storage system. They are not: they're things that message board nerds come up with.
If you want to be a step better than just storing passwords with bcrypt, your next step is to create an authentication service that runs on separate hardware with an "is this password valid" and "enroll this password" API and nothing else. The stuff people do instead of this is basically cosmetic.
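A minimal sketch of what I mean, assuming the third-party Python bcrypt package (the names and in-memory storage are purely illustrative):

```python
# Minimal two-call authentication service: application servers never see
# hashes, only yes/no answers. Uses the third-party "bcrypt" package.
import bcrypt

class AuthService:
    def __init__(self):
        self._hashes = {}  # stand-in for the service's private datastore

    def enroll(self, user: str, password: bytes) -> None:
        self._hashes[user] = bcrypt.hashpw(password, bcrypt.gensalt())

    def validate(self, user: str, password: bytes) -> bool:
        stored = self._hashes.get(user)
        return stored is not None and bcrypt.checkpw(password, stored)
```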
>>The current best practice of simply hashing passwords with bcrypt is fine, and anything past that which doesn't involve (probably per-password) use of an HSM adds only marginal value.
Yep, I mostly agree. As I said elsewhere in the thread, I wouldn't really recommend this sequence to any company unless it were very large and had a mature ops team to handle it. There's a lot of diminishing returns here.
Yep, Tom, that's the FB solution - or rather, halfway up the stack of hashing is a callout to a service where HMACs are involved. These bring their own challenges: https://video.adm.ntnu.no/pres/54b660049af94
No, according to your talk, Facebook's solution is that password validation is pushed out to the front-end servers, who use back-end KMS services to do some (but not all) of the crypto.
I think this is a suboptimal approach. Can you tell me what the benefit of your layered approach is over simply adding to your KMS servers the APIs for validate(user, password) and change(user, oldpassword, newpassword)?
If your KMS service did password validation directly, you wouldn't need any of the layers in this architecture. HMAC would add nothing (it would be tautological, since anyone who could directly attack the hashes must have also owned up the KMS service). I still don't totally understand the MD5 step. You could just use scrypt and nothing else, and you could probably ratchet the work factor up because you wouldn't be billing the front end servers for those cycles.
I generally like, have recommended, and have built a few times the "software HSM" KMS approach you're describing here --- but only for "seal/unseal" and "sign/verify" APIs.
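Concretely, something like this, with the whole hash living on the KMS boxes (a hypothetical sketch; the scrypt parameters are placeholders you'd size to the KMS fleet, not to front-end capacity):

```python
# Hypothetical sketch: the KMS does the entire password hash itself, so the
# work factor is a knob on this one service (raise SCRYPT_N, and maxmem with
# it, to ratchet the cost without touching front-end servers).
import hashlib
import hmac
import os

SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1  # placeholder cost parameters

def _scrypt(password: bytes, salt: bytes) -> bytes:
    return hashlib.scrypt(password, salt=salt,
                          n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=64)

class KmsPasswordApi:
    def __init__(self):
        self._records = {}  # user -> (salt, hash); never leaves the KMS

    def enroll(self, user: str, password: bytes) -> None:
        salt = os.urandom(16)
        self._records[user] = (salt, _scrypt(password, salt))

    def validate(self, user: str, password: bytes) -> bool:
        rec = self._records.get(user)
        if rec is None:
            return False
        salt, stored = rec
        return hmac.compare_digest(stored, _scrypt(password, salt))

    def change(self, user: str, oldpassword: bytes, newpassword: bytes) -> bool:
        if not self.validate(user, oldpassword):
            return False
        self.enroll(user, newpassword)
        return True
```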
>No, according to your talk, Facebook's solution is that password validation is pushed out to the front-end servers,
Yes, although I don't work for Facebook any more, that was and probably still is the case.
>who use back-end KMS services to do some (but not all) of the crypto.
In fact, the backend service does a tiny amount of the crypto.
>I think this is a suboptimal approach.
Of course you do, it's not like I've not argued with you before, Thomas. :-)
>Can you tell me what the benefit of your layered approach
...Facebook's layered approach...
>is over simply adding to your KMS servers the APIs for validate(user, password) and change(user, oldpassword, newpassword)?
If you want your hash to be ... hashed, rather than lodged in some questionable silicon, not putting a fat crypto load onto the backend avoids the "thundering herd" problem when some fraction of 1.7 billion people want to log in.
>If your KMS service did password validation directly, you wouldn't need any of the layers in this architecture.
Quite, and if everyone flew instead of drove, we wouldn't need cars with layers of bumpers, crumple zones, seatbelts and airbags; but it's merely shifting the problem for 1.7 billion people.
Have I mentioned "scale"? I should mention scale. Scale is a thing.
>HMAC would add nothing (it would be tautological, since anyone who could directly attack the hashes must have also owned up the KMS service).
Yes. Fucking huge KMS service. HUGE. Trumpiness levels of -HUGE- and eating lots of power and redundancy.
1.7 billion people. That's a lot. Like 0.01% of it is 170,000 people. All logging in together. All over the world.
>I still don't totally understand the MD5 step.
Yeah, but I'm not betting that Mark's/whoever's coding at the time was focused on the future of password authentication.
>You could just use scrypt and nothing else
Yes, but where's the fun in that?
>and you could probably ratchet the work factor up because you wouldn't be billing the front end servers for those cycles.
The frontend servers are approximately precisely where you want the cost/chokepoint. There's a metric fucktonne of them and they are closest to the request, so by definition they are scaled to the load.
>I generally like, have recommended, and have built a few times the "software HSM" KMS approach you're describing here --- but only for "seal/unseal" and "sign/verify" APIs.
Last question first: large, but not Facebook large (I spent 10 years consulting on this kind of thing).†
You comment early in the talk that the MD5 step is somehow helpful for password dumps. That was the bit I didn't follow. If it's there because that's how password hashing worked before your team got to it, that makes a lot more sense. But then: it's not a "layer" of the onion so much as a sheen of dirt that needs to be washed off the onion. :)
I get that your auth problem is huge. Yuge. So big you wouldn't believe it. I totally believe you. No, wait, I don't believe you, that's how big I know your authN problem to be: unbelievably huge.
But here's the thing: you're already scrypting passwords. We're not debating whether you can use expensive password hashes. You already use expensive password hashes. I'm saying: the model where the KMS does a small bit of the password hash step and defers the heavy lifting to front-end servers seems like a suboptimal way to structure this:
* You have to bill cycles from the front end to do it
* You can't change password hashing without updating all the front-end servers
* It's harder to track usage because it's spread across a zillion machines
* You're more constrained in how you scale it (for instance, if you wanted to double or triple the work factor) because whatever your new scheme is, it has to fit with the existing front-end resources.
I'm not saying "wow, it's dumb that you built it this way". I'm saying, if other people are reading this thread thinking about how to do it:
* DO split authentication out into its own service
* DON'T have that authentication service be "HMAC as a service" and then do scrypt on your front-end service
YOUR MOVE, ALEC MUFFETT. I keep going until you unfriend me on Facebook so I can't see you wincing about these posts.
† I've assessed Facebook-large variants of this, though.
>Last question first: large, but not Facebook large (I spent 10 years consulting on this kind of thing).
I just spent 3 years living it for 50h/week. Hence the vacation.
>If it's there because that's how password hashing worked before your team got to it, that makes a lot more sense.
That. It wasn't even me; it was done before I arrived, by a team of geeks with a tremendous nose for making the best of the database they had available to them, without pulling the old password-migration trick of "log in with one password, parallel-encrypt with a new algorithm, and save the new hashes" - because some of those billion people might never log in again for years. You would never stop migrating people.
I remember internal password algorithm migrations at Sun, at least there you could force the matter for 10,000..40,000 people.
But you can't force everyone to migrate at FB scale.
>But then: it's not a "layer" of the onion so much as a sheen of dirt that needs to be washed off the onion. :)
You can take that approach, but - again - when will you finish the task? Whereas wrapping one algorithm in the next is a finite task which is completable in a reasonable amount of time.
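As a rough illustration of why wrapping is finite (hypothetical code, not what was actually run): you fold whatever is already in the database into the stronger algorithm in a single offline pass, and you never need the plaintext or the user:

```python
# Hypothetical illustration of wrapping an existing hash corpus in a
# stronger algorithm offline, instead of waiting for each user to log in.
import hashlib
import os

def wrap_legacy_hash(legacy_hash: bytes) -> tuple:
    # Input is whatever already sits in the database (e.g. an old unsalted
    # MD5 of the password); the plaintext is never required.
    salt = os.urandom(16)
    wrapped = hashlib.scrypt(legacy_hash, salt=salt, n=2**14, r=8, p=1, dklen=64)
    return salt, wrapped

# One finite pass over the whole table completes the migration:
#   for user, old_hash in all_rows():
#       store(user, *wrap_legacy_hash(old_hash))
# Verification afterwards recomputes the legacy hash from the submitted
# password first, then applies the same wrap before comparing.
```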
>I get that your
...Facebook's...
>auth problem is huge. Yuge. So big you wouldn't believe it. I totally believe you. No, wait, I don't believe you, that's how big I know your authN problem to be: unbelievably huge.
Well channeled. :-)
>But here's the thing: you're already scrypting passwords. We're not debating whether you can use expensive password hashes. You already use expensive password hashes. I'm saying: the model where the KMS does a small bit of the password hash step and defers the heavy lifting to front-end servers seems like a suboptimal way to structure this:
>* You have to bill cycles from the front end to do it
Yes. 0.1% of frontend cycles. <blank expression> And?
>* You can't change password hashing without updating all the front-end servers
...which happens three times a day on weekdays, and is moving towards happening even more often.
>* It's harder to track usage because it's spread across a zillion machines
>* You're more constrained in how you scale it (for instance, if you wanted to double or triple the work factor) because whatever your new scheme is, it has to fit with the existing front-end resources.
Yes. For a site with wildly heterogeneous architectures in front-end deployments, I can see how that might be a concern; but even AWS leads people to standardise on having approximately-the-same-kinds-of-hardware-doing-approximately-the-same-things.
>I'm not saying "wow, it's dumb that you
...Facebook...
>built it this way". I'm saying, if other people are reading this thread thinking about how to do it:
>* DO split authentication out into its own service
...or some component of it...
>* DON'T have that authentication service be "HMAC as a service" and then do scrypt on your front-end service
Why not?
>YOUR MOVE, ALEC MUFFETT. I keep going until you unfriend me on Facebook so I can't see you wincing about these posts.
Wince?
> I've assessed Facebook-large variants of this, though.
Sorry for the delay. A Jazzercise class suddenly appeared in the coworking space I work out of, so I fled, and then I had to give a talk about Starfighter.
Responses: (when I say "your" let's just stipulate I mean Facebook)
* Your password validation overhead is 0.1% of current front-end resources, but could be ratcheted up, and would be easier to ratchet up if they weren't shared by other things.
* I totally understand why you keep the old MD5 cruft around --- but would add that it's cruft that would be even less obvious if it lived behind an authentication server.
* I think it's safer, simpler, cleaner, probably easier to scale, and definitely easier to change authentication if it lives in its own service rather than being implemented (in part) on a generic application server. As usage shifts from HTML front-end to all API, you might even be able to keep app servers from even seeing passwords.
* By "assessed", I mean, worked on other people's systems at this scale.
So I guess I'd wrap up with a question: if you had this to do over again, from scratch, the way you wanted to, would you have app servers do a password hash and then entangle it somehow with an HMAC operation from a crypto service, or would you have the whole password hash done on the crypto service directly?
Putting the authentication service into a nice tidy centralised box does not actually achieve much, and may have architectural downsides.
Not the least of which is: if it's wholly in a service, then you have to authenticate the service; that's not such a big step from "if the hashed passwords are stored in a directory, then you have to authenticate the directory" of course - but if we were to equate the two systems because of the need to authenticate the {directory, service} then the service-based solution still has the downside of being a CPU hotspot and a potential single point of failure.
We're much better at distributing directories of data which is self-protected / needs no special treatment, than we are at building humongous scalable "secure" services with an enormous TCB and a physically enormous attack surface / footprint.
Yes. If I was doing this, sure, I would do this again. Curiously I am a big fan of password hashing rather than all-singing, all-dancing authentication services.
> you also have to authenticate the HMAC-providing service
Yes.
But, to look at the Facebook approach, what is the risk surface presented by the HMAC service?
Done properly, in the FB approach the password is irreversibly hashed before it arrives at the HMAC component, where it is cheaply HMAC'ed and returned so the onion of hashing can be completed.
It's good to bidirectionally authenticate access to the HMAC service, but in terms of protocol it strikes me as less critical than in your scenario.
Either the HMAC is done properly (in which case the eventual hashes will verify for legitimate users) or - if someone inserts a "fake" hashing service - the HMAC'ed results will not validate, and a bunch of legitimate users will experience login failure.
( edit: there's a risk of exfiltrating the input to the service, but it's meant to be a shitload of work to achieve any evil with that input anyway, which also can be shorn of user-metadata and other clues thereby making it a bit less valuable )
Maybe I have missed something, but to my mind this threat scenario fails (by dint of fake services, exfiltration, etc.) in a "safe" manner.
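To be concrete about how small that surface is, the service amounts to roughly this (a hypothetical sketch of the step, not Facebook's code; the key handling is simplified):

```python
# Hypothetical sketch of the HMAC-service step: it only ever sees a value
# that has already been hashed on the front end, applies one cheap HMAC
# with its secret, and hands the result straight back.
import hashlib
import hmac

SECRET_PEPPER = b"placeholder; in reality held by the service/HSM only"

def hmac_service(already_hashed: bytes) -> bytes:
    # Input: e.g. hmac_sha1(md5(password), salt) computed by the caller.
    # The expensive scrypt layer and the final comparison stay caller-side,
    # so an impostor returning garbage here only causes logins to fail.
    return hmac.new(SECRET_PEPPER, already_hashed, hashlib.sha256).digest()
```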
=== Now === consider your "authentication service" approach.
Plaintext goes into... what?
The real service?
A fake service that returns "true" in all circumstances?
A MITM that exfiltrates the plaintext?
Where do you put the root of the trust chain to this service? In an SSL Certificate? Pinned? From which CA?
Simply: I feel that in centralised password authentication services there are a lot more potential shenanigans to defend against.
>> It's a chain of hash because that's how passwords were migrated from being stored unsalted, to salted, to scrypted.
I kind of figured that, thanks for confirming :).
Personally I wouldn't consider Argon2 yet for production, but only because I'd like to see it run at scale for a few years. PHC or not, I'm hoping it becomes more battle-tested in production use.
That said, I'd fully respect any team for using Argon2 and have no personal qualms with it.
If this jury-rigged, duct-taped "password hashing" scheme impresses you, I've got some land in Florida you might be interested in.
Seriously, though, it's a complete fallacy to think that more complicated password hashing schemes with lots of fancy steps are better. I was at the talk where this scheme was presented, and Alec Muffett himself said the only reason it was so complicated is that they had to layer stronger hashes on top of existing ones instead of revoking all outstanding session cookies (forcing every Facebook user in the world to re-authenticate).
I'm aware of all that, and I'm not impressed by the numerous hashing steps. I'm impressed by Facebook's commitment to migrating and future-proofing their security practices as they become obsolete.
Specifically, this means "wrapping" insecure hashes in more secure hashes and the addition of an encryption key stored in an HSM on a separate server.
Thank you! This is exactly what I was referring to. There's a slideshare of this talk too. I couldn't find it when I first wrote that comment, but I bet I'll find it with "onion" as a keyword.