You don't have to store every compromised password ever to ensure users have a reasonable one. Just follow the NIST recommendations, and check against the top million, or top few million, and you're all set. If you use a good password storage library, and you slow down guessing attacks, or stop them altogether, your users will be safe. Those 300 million compromised passwords have a very long tail, and if you let a hacker attempt a few million password attempts running down the list, you've already failed.
IMHO, you only need to block the top few (thousand) if you have a policy to block rapid/unlimited guessing. The issue is targeted attacks, when a strong password was reused by someone between say linkedin and their corporate account.
If I can get a list of compromised email addresses and passwords, I will be trying those email/password combinations across services, not passwords from the top 100.
Attacking accounts with reused credentials seems to be pretty common.
> We have some ability at ArenaNet to watch hacking attempts live, and it tells a fascinating story. We watch as hackers use tens of thousands of different IP addresses to scan through millions of attempted account names and passwords, almost all of which are for accounts that don’t even exist in our database, looking for matches. They’re not guessing or brute-forcing passwords; they’re trying a very specific account name and password for each attempt. For example, account name “joe.user@example.com”, password “alligator101”. If they don’t get a match immediately, they may try a variant like “alligator100” or “alligator102”, then they quickly move on to the next entry on their list. And it’s interesting to see that the passwords on these lists are mostly quite good passwords. For every one account on the hackers’ lists with a password like “twilight” (real example, ಠ_ಠ), there are dozens of accounts with good strong passwords. So the world at large clearly knows how to pick good passwords; the reason people are still getting hacked is because they use the same passwords on multiple sites.
I suspect it would be better if possible to hit a bigger list.
It's not to protect against an external brute force, but a leaked database table being able to identify valid passwords, and then using that to access the service or check other services for a shared password.
It's unfortunately far too common that a SQL injection, bad employee, lost backup, whatever leaks a copy of the account table in the database, which even when using scrypt/bcrypt etc is plausible to slowly break accounts beyond the first few thousand common passwords. Depending on work factors, it's about 250ms to make a guess, which is about 345,000 guesses per cpu/core per day.
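A quick sanity check of that arithmetic, assuming the ~250 ms-per-guess work factor mentioned above:

```python
# Back-of-the-envelope: how many offline guesses per core per day
# at an assumed work factor of ~250 ms per hash.
seconds_per_guess = 0.25
guesses_per_day = 24 * 60 * 60 / seconds_per_guess
print(int(guesses_per_day))  # 345600
```

So "about 345,000 guesses per cpu/core per day" checks out (345,600 exactly).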
Adding a salt to the password before it's hashed is the way to handle this type of attack.
In general I'd say that most organizations need to get less fancy about password security not more. Most mechanisms that companies employ to make things more secure don't. They make things less secure because they add attack surface, or they fail to account for the behavior of actual users when faced with any additional effort.
Adding a salt and hashing a password is not something someone should be doing in 2018. Use scrypt, or another one of the recommended algorithms, as it is a complete implementation that is well researched and can be simply dropped into most systems.
And my response was in reference to how many passwords you should blacklist; even with a good algorithm, it's still feasible to brute-force easy-to-guess passwords in an offline attack.
Fair enough. It's hard to protect against users being silly. No matter how you salt, hash, and encrypt, if someone knows the username/password, they're in. I delegate auth for the systems I build to a SAML provider, and require 2FA. An expert can do a much better job managing passwords than me, and as long as I trust their identity document, we're good to go.
The limit on guessing can't be enforced if your databases are ever leaked, but even so they would almost certainly be per user, and if you just want a bunch of accounts to post spam, you could just select the 3 most common passwords and brute-force accounts.
Slightly more sophisticated would be to use a botnet to do this, so that you can't block the source.
What if the database of hashes gets leaked? Then there will be no possibility of throttling. Furthermore, as somebody else said, given a false positive the proper API may be used.
You can thwart or mitigate most brute force attacks with key stretching. Even if you only require 50 msec of work per password on current hardware, you have made the prospect of a brute force exponentially more difficult.
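A minimal sketch of key stretching with `hashlib.scrypt` (the parameters here are illustrative; tune `n` upward until one hash costs the ~50 msec you're targeting on your hardware):

```python
import hashlib
import hmac
import os

# Illustrative cost parameters; raise n until a single hash takes
# the desired wall-clock time on your production hardware.
SCRYPT_PARAMS = dict(n=2 ** 14, r=8, p=1)

def hash_password(password, salt=None):
    # A fresh random salt per password defeats precomputed tables.
    salt = salt or os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return salt, digest

def verify_password(password, salt, digest):
    candidate = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(candidate, digest)
```

Every guess an attacker makes against a leaked table now costs them the same stretched work.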
If the database gets breached with write access, a user's password can be bypassed entirely, even if it can never be discovered.
If the database contains all of the information the hacker wants to know, they don't need to hack your account, they can just read the tables directly.
By using a bloom filter, you'll deny a lot of perfectly valid passwords. Bloom filters are designed to be quick and give a lot of false positives, in exchange for never having a false negative.
If you're ok with not allowing a lot of valid passwords, then this approach has some merit, but user beware.
You can tune how much "a lot" is. In this case, the filter appears to be tuned for a 1 in 1000 false positive rate. How annoying that may be to your customers is up to you to decide. (I'm not sure about the utility in banning every exposed password ever anyway.)
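The tuning comes straight from the standard bloom filter sizing formulas; a quick sketch of the math for 500 million entries at a 1-in-1000 false positive rate:

```python
import math

def bloom_params(n_items, fp_rate):
    # Classic optimal sizing: m bits total and k hash functions.
    m = math.ceil(-n_items * math.log(fp_rate) / math.log(2) ** 2)
    k = round(m / n_items * math.log(2))
    return m, k

m_bits, k_hashes = bloom_params(500_000_000, 0.001)
# ~14.4 bits per entry, ~860 MiB total, 10 hash functions
```

Lowering `fp_rate` by another factor of 10 only adds about 4.8 bits per entry, which is why the annoyance level is a tunable knob.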
You could also use this as a pre-check before querying the real haveibeenpwned API.
To me a cache is actually the inverse data structure of a bloom filter.
A bloom filter will return whether something is in a set with potential false positives but never false negatives.
A cache will return whether something is in a set but with potential false negatives but never false positives (i.e. item is not currently in the cache but exists in the set).
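To make the "false positives possible, false negatives impossible" property concrete, here's a toy bloom filter (the seeded-SHA-256 hashing scheme is just for illustration, not a recommendation):

```python
import hashlib

class ToyBloomFilter:
    # Minimal sketch: k bit positions per item, derived from seeded SHA-256.
    def __init__(self, m_bits, k_hashes):
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray(m_bits // 8 + 1)

    def _positions(self, item):
        for seed in range(self.k):
            digest = hashlib.sha256(bytes([seed]) + item.encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # True may be a false positive; False is always definitive.
        return all(self.bits[pos // 8] >> (pos % 8) & 1
                   for pos in self._positions(item))
```

Anything ever `add`ed will always report present; an item never added can only collide into a false positive, never the reverse.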
Calling a bloom filter a filter is a tautology and tautologies don't have much pedagogical value. Explaining what a bloom filter is (and/or is useful for) by imperfect analogy to caching is less precisely correct than a tautology but will help the uninformed a lot more.
Reminds me of (clever?) demo site that disallowed passwords for users if they were valid logins to other common services. That is, it would present an error on registration "Please do not use your LinkedIn password."
It would actually be pretty cool if this was actually a service offered by those major services. Obviously very rate limited.
Say example.com/.well-known/password-check
You provide a username or email and a password; if you get it right, it lets you know, locks the account, and requires a reset the next time that user logs on.
Then someone like haveibeenpwned can run a meta service to check every service that registers with them.
I haven't thought through the security implications of this but it seems pretty mild on the surface. I mean there's denial of service potential but only when their password has been compromised so... good?
Could that not be used to reveal the password? Sure it won't work on this site anymore, but now the random attacker has a username:password combo.
Also, doesn't that just hand over the user's password to the site hosting major.service//password-check? Maybe the user is security conscious already, now their strong password is sitting in plaintext on major.service's misconfigured logs.
This just doesn't seem like a good idea to me.
Edit: Or a hacker gets into linkedin, now he's literally being fed the actual username/password combinations of every single person that signs up to any website anywhere. This just gets worse the more I think about it.
The service would likely accept a hash, not the raw password.
If it's rate limited more aggressively than the normal login endpoint, I don't see how it could be any more dangerous.
Anyone on the internet can already query any site with a guess for your username/password - if they're right, the request will succeed. How is that different?
> The service would likely accept a hash, not the raw password.
Either everyone would use the same unsalted hashes (bad practice) or everyone would use the same salts (bad practice). In either case, this requires widespread use of known bad practices.
If you're not familiar, these are considered bad practices because they allow for the construction of rainbow tables - https://en.wikipedia.org/wiki/Rainbow_table
These are bad because they allow hashes to be reversed.
> If it's rate limited more aggressively than the normal login endpoint, I don't see how it could be any more dangerous.
Given a little bit of time, rate limiting isn't actually that big of an obstacle. Especially if you don't need to compromise a given account and can use a botnet to trawl for accounts in general.
> Anyone on the internet can already query any site with a guess for your username/password - if they're right, the request will succeed. How is that different?
This imagines a central service querying N others with credentials to see if each service knows them.
Imagine compromising one service. Then people are sending you usernames and passwords for hundreds of other services. You won't need to query any other service, the credentials will come to you and the user will have no indication!
You're right! That is a super cool idea. It would let you spot a user re-using passwords and warn them not to.
However, there might be some complexities. This is basically poor password handling as a well-intentioned service. It means anyone breaching any of the services the hub knows about would see a great many username/password pairings being checked. It would make passwords only as safe as the weakest link across all the services being checked. The odds of N services all being absolutely safe all at once aren't as good as those of a single service. This is why people shouldn't re-use passwords, right?
I can see your heart is in the right place. You're looking to protect users from themselves! That's a wonderful thing, and I'm so glad that there are people like you who care. But the above outcome might not be maximally desirable, you know?
Yeah but I bet the implementation brushes up against things that have been used to come to a conviction under CFAA. I'm no lawyer but it seems like a big liability to me.
Yeah. When setting a password on a service, most people are not knowingly authorizing that service to use their credentials to try to log in to other services...
Given how tiny 500 million passwords is relative to the entire set of possible passwords (even at single-digit lengths), a 1 in 1000 false positive rate is actually extremely high.
Sorry to say this, but this is actually worse than just truncating the hash.
Why? Bloom filters are designed to maintain a reasonable false positive rate while still allowing you to add items extremely quickly (high insertion rate).
Since the bad password list is presumably fixed (or almost so) you can do checks faster and with a lower FP rate by just truncating the hashes and prefix sorting the list ahead of time.
I get your point but I don't think it's accurate. If the list contains 500 million hashes (approximately 2^29), then in order to achieve a false positive rate of 0.001, you must store at least a 39-bit prefix of each hash. A Bloom filter can do better -- approximately 14 bits per entry -- because it uses multiple hash functions.
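The two numbers in that comparison are easy to reproduce:

```python
import math

n, p = 500_000_000, 0.001

# Truncated-hash list: a random query collides with one of the n stored
# b-bit prefixes with probability ~ n / 2**b, so solve n / 2**b = p:
prefix_bits = math.ceil(math.log2(n / p))            # 39 bits per entry

# Optimal bloom filter needs only log2(1/p) / ln(2) bits per entry:
bloom_bits_per_entry = math.log2(1 / p) / math.log(2)  # ~14.4
```

So for a fixed false positive rate, the bloom filter stores each entry in well under half the space of truncated sorted hashes.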
but don't you still have to store/manage the whole 30gb list of hashes? the point of a bloom filter is it can store any arbitrary string in 7-8 BITS depending on how many hash functions you use to test for existence
If you use a secure hash on the banned passwords they will be random (as good as arbitrary), then just truncate them to 7-8 bits and sort by prefix, searching them using binary search. It will be more efficient than a Bloom filter because there can be zero empty space between entries. It creates a structure similar to a Bloom filter but with no wasted space (no empty entries)
A Bloom filter is great when you need a probabilistic hash table that is fairly efficient and can have records added efficiently.
With a fixed list as I described above, the insertion cost is enormous (avg cost of moving 1/2 of the array elements), but the space/speed efficiency reaches the theoretical limit
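For the curious, the sorted-truncated-hash scheme being described looks something like this sketch (the 40-bit truncation width is an arbitrary choice for illustration):

```python
import bisect
import hashlib

def truncate_hash(password, bits=40):
    # Keep only the top `bits` bits of the SHA-1 digest as an integer.
    digest = hashlib.sha1(password.encode()).digest()
    return int.from_bytes(digest[:8], "big") >> (64 - bits)

# Build the sorted array once, offline; a real list would have millions
# of entries, this tiny one just demonstrates the lookup.
banned = sorted(truncate_hash(p) for p in ["password", "123456", "letmein"])

def probably_banned(password):
    # O(log n) membership test; a hit may be a truncation collision.
    t = truncate_hash(password)
    i = bisect.bisect_left(banned, t)
    return i < len(banned) and banned[i] == t
```

Like the bloom filter, a negative answer is definitive while a positive may be a collision.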
Your math is bogus. 7-8 bits? If there's one entry, that would give false positives of 1/256, or 0.3%. Even given 1k entries would likely hit the vast majority of entries.
Or, flipping this: every time you add a password, you check that it hasn't been used before. You can only set 256 passwords, total, before the space of truncated hashes is filled.
Bloom filters don't do anything special to change FP rates, a compact list of random hashes is always more efficient than using a Bloom filter until the filter is 100% full
But that's still 12G of memory, since you can't compress it. Is there a data structure that's better optimized for space and probabilistic membership tests?
Cuckoo filters and bloom filters are different in how they handle increased load. As the cuckoo filter increases load, insertions are more likely to fail, and so its time complexity increases exponentially. However, the false positive rate remains the same. Bloom filters can keep inserting items into the filter at the cost of an ever rising false positive rate.
To store 500 million hashes in 860MB, wouldn't you need to truncate them to 13 bits each? That'd give a false positive rate of 1000 in 1000, slightly worse than the Bloom filter's false positive rate of 1 in 1000.
Depends on the data structure and encoding. For instance, another simple/fast/non-optimal encoding would be a 32 bit table of bit flags, which would weigh in at 0.5 GiB. That would give you a ~11.5% false positive rate.
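Checking the bit-flag-table numbers (under the assumption of 500M entries hashed uniformly into a 2^32-bit table):

```python
import math

n_hashes = 500_000_000
m_bits = 2 ** 32                 # one flag per 32-bit truncated hash

size_gib = m_bits / 8 / 2 ** 30  # 0.5 GiB of flags
load = n_hashes / m_bits         # ~0.116
fp_rate = 1 - math.exp(-load)    # fraction of bits set ~= FP rate, ~11%
```

So half a gigabyte buys roughly an 11% false positive rate with this simple encoding.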
No, it's not the same structure as a Bloom filter without empty slots. It's the same structure as a Bloom filter with only one hash function. Bloom filters use multiple.
PSA for anyone playing around with that haveibeenpwned file: the text list is 30 GB after un7zipping but every line is padded with many trailing spaces (up to 61 chars I think, making all lines equal in length) and the line endings are the Windows ones (\r\n, instead of just a \n). With trailing spaces and \rs stripped the file is 'just' 21 GB and fits onto a RAM disk on a 32 GB machine with no swapping.
This bit me because I was playing with that file and, to save space and fit it onto a RAM disk (to save time), I converted it to binary. When converting it back, it had a different hash despite being 'exactly the same'; only later did I realize it uses Windows line endings and trailing-space padding (and that I didn't even need to bother with binary, since 21 GB comfortably fits in a RAM disk on a machine with 32 GB RAM, while 30 GB didn't with browser, Windows, etc. running).
Troy said it's a quirk of the export, might be fixed in next version of the file but doesn't matter for the compressed 7z size (which is true, I've checked personally with 7z going at it for hours and it really did compress those line endings and spaces down to nothing).
Edit: Also see my other comment about how very overly eager at rejecting passwords this filter might be (unless I've made an error in my reasoning that someone has since pointed out): https://news.ycombinator.com/item?id=17323068
A little confused. I thought a hash algorithm was designed to make every bit of the output significant, meaning that there is a massive difference between the outputs for two inputs that differ by one character. In that case it is conceivable that pwned input could collide with non-pwned input, especially if checking only a subset of the characters in the output.
What I'm saying is I had always thought the point of hashes was to always compare them in their entirety, not as subsets of the output characters. This tool seems to suggest (to a crypto novice) that is not a concern.
How would users feel if their exceptional password collided, via the bloom filter, with a pwned password?
The usual way to do this is to treat the bloom filter as a cache that can have false positives. So instead of querying haveibeenpwned for each attempted password from the user, you only query if the bloom filter says that the password was found. If the bloom filter says that the password was not found, then that means it is definitively not found in the upstream.
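That flow is simple enough to sketch directly; `bloom_check` and `upstream_check` here are hypothetical callables standing in for the local filter and the haveibeenpwned query:

```python
def password_is_pwned(password, bloom_check, upstream_check):
    # bloom_check: fast local test; may return false positives.
    # upstream_check: authoritative but expensive (e.g. the HIBP API).
    if not bloom_check(password):
        return False                 # definitive: no false negatives
    return upstream_check(password)  # resolve the "maybe" upstream
```

Only the (rare) bloom filter hits ever generate a network call, and no password is ever wrongly reported as pwned.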
The bloom filter will tell you whether _the hash of your password_ is in the set. It doesn't matter whether your _password_ matches a pwned one, it only matters whether your password has the same hash as one that is pwned. This tool expands on that by simplifying the test such that it gets more false positives, but still has no false negatives.
Imagine that password hashes are six characters, and `abcdef` is the only pwned password we know about. A smaller version of this tool would reject your password if it hashes to anything that starts with `abc` (such as `abc000`, `abcdee`, `abcdef`, etc). With this toy example, we can see that this is deliberately pessimistic -- it only is a "your password might be pwned" indicator.
It's like throwing out anything in your fridge that is past its expiration date, even if you don't know whether it's actually spoiled.
>Imagine that password hashes are six characters, and `abcdef` is the only pwned password we know about. A smaller version of this tool would reject your password if it hashes to anything that starts with `abc` (such as `abc000`, `abcdee`, `abcdef`, etc).
Doesn't that not work very well at all though? If these "abcXXX" strings are hashes, then just because they share a prefix doesn't mean the unhashed passwords are even remotely similar.
"good-password-123-dog-xxx..." could hash to "abcdeg" and "password" could be the thing that hashed to "abcdef".
You're completely right, but I think you're misinterpreting the usefulness of this bloom filter. It would most likely serve only as the first layer in a password strength check.
Because there is no possibility of a false negative, you know that if the bloom filter does not include the hash you gave it, the password definitely does not exist in the original corpus. In this case, the password is probably at least semi-unique. If you're using this bloom filter in the context of a web application registration process, this would likely be the case for anyone trying to sign up with a strong/unique password, e.g. a random one generated by a password manager.
However, if the bloom filter maybe contains the hash, you would likely fall back to some more expensive strength check, like querying the full HIBP database to see if this exact password definitely exists in there.
Similarity between passwords doesn't really matter for the purposes of this checker - all it's looking for is a _possible_ exact match. If your password hashes to "abcXXX", then there are two possibilities:
- "abc" is in the bloom filter, and so your password _might_ be in the list (depending on what the remaining characters past "abc" are)
- "abc" is not in the bloom filter, so your password _is definitely not_ in the list.
This results in a high rate of false positives (times when a password is incorrectly thought to be part of the list, but it's not) but that's deemed to be an acceptable tradeoff for this algorithm.
Collisions for the entire hash are rare, but this is just a prefix of the hash. I imagine it is significantly more common for a collision in the prefix.
I maintain a library that talks to the Have I Been Pwned password database, which uses a 5-hexadecimal-digit prefix of the SHA1 hash.
Someone filed a PR asking to add a cache for HIBP's responses, and I did some quick math and came up with ~1200 as a birthday-paradox number for the first five digits of a SHA1 hex digest (assuming uniform/random distribution of hex digits).
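That ~1200 figure is just the standard birthday bound over the 16^5 possible 5-hex-digit prefixes:

```python
import math

buckets = 16 ** 5  # 1,048,576 distinct 5-hex-digit SHA-1 prefixes
# Draws needed for a ~50% chance of a repeated prefix:
n_for_50pct = math.sqrt(2 * buckets * math.log(2))  # ~1206
```

So after caching responses for roughly 1200 distinct passwords, you'd expect your first prefix collision (i.e., a cache hit that still requires suffix comparison).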
500,000,000 forbidden passwords/hashes? What exactly is the real benefit of excluding those? It only matters if someone steals your list of hashed passwords AND you have hashed them in some deficient way (poor salting etc) so that the hashes may be "reversed" and used.
That a password has been used by someone else in the past, for some other reason, and that password is now on some database, does not make that password dangerous. What matters is whether your users are using the same passwords across multiple services, services out of your control. But ATM no tech out there would allow you to test for this.
If your system cannot detect someone running 500,000,000 passwords against a single account, you have far bigger problems.
As far as stopping password reuse, I recall seeing a while ago on HN a "joke" tool to do just that, by attempting to log into a bunch of services with the email / pass provided. I believe it was called Evilpass.
I personally prefer the "zxcvbn" method, which is why I wrote a Java library for my company inspired by it called nbvcxz.
Because the hackers have that same list of 500,000,000 passwords, so when they steal your password hashes and want to brute force your passwords, they can start with the 500 million most common passwords instead of having to try random strings (there are 208 billion random 8-letter lowercase strings).
> What matters is whether your users are using the same passwords across multiple services, services out of your control. But ATM no tech out there would allow you to test for this.
Checking against a list of known passwords from breaches does accomplish this, albeit with a high false positive rate.
Does it? If I have used the same password at a dozen sites, and none of those sites have ever suffered a public breach, then my password isn't in any public database.[1] So you will also have a high false negative rate, something far more dangerous than false positives.
I don't think I'm alone in using passwords across multiple sites. Everyone here lives in that glass house.
> and none of those sites have ever suffered a public breach
Yet. That you know of.
> Every here lives in that glass house.
No, some of us use password manager software for exactly this reason - so we don't use the same password across multiple accounts and have a smaller blast radius when/if a password is compromised.
Lol, until your password manager suffers a breach. Even without a full breach of data, if your password manager's password-creation algorithm is made public then your passwords are just as open as anyone else's, perhaps more so. Managers are better, but they aren't the home run people think they are.
You realize there are password managers that aren't cloud based? Also, even if cloud based, they offer TFA. Some offer automatic password rotation/update for your sites. I don't see how the creation algorithm being made public would make my data "just as open as anyone else's, perhaps more so"?
If you're looking to build a server-side check for password re-use, I'd suggest using the Have I Been Pwned API. When searching by range[0], you only send the first 5 characters of the SHA-1 password hash, thus preserving the privacy of your users' passwords. Cloudflare has a great write up on how it works[1].
Just making the possible passwords 10 chars drawn from a set of 62 (lowercase + uppercase + digits) gives me 0.05 sensitivity at a 0.00000001 (0.000001%) FP rate.
Edit:
(This is a reply to w8rbt below, placed here since HN has decided I cannot post anymore for the next who-knows-how-many hours.)
I'm aware of this and wouldn't treat a bloom filter as the final decision maker if there is a hit but many people seem to act like it is intended to be one and bring up its FP rate as low enough, i.e.:
3. Both of the linked bloom filters don't mention that if there is a hit you should check with https://haveibeenpwned.com/ API.
4. (Of course, this is HN after all!) my comments pointing this paradox/pitfall/whatever out being at negative point scores. I've literally lowered my score for saying "hey, 1 in 1000 FP isn't actually super accurate and almost sure hit is a good hit". And there are one or two green named users who seem to be criticizing this bloom filter with freshly made accounts, exactly because of this BS.
Read the original Bloom Filter paper from 1970. It covers all of this and it's very short. If the test for membership returns 'probably' then you do the more expensive test if you need to know for certain or you don't use one to begin with:
"Space/Time Trade-offs in Hash Coding with Allowable Errors", Burton H. Bloom
Another option if you're able to do a server side check, is to use a library like nbvcxz (Full disclosure, I wrote it): https://github.com/GoSimpleLLC/nbvcxz
If you have a list of passwords you want to exclude, you can load it into the library, or it comes with its own list of common passwords from leaked datasets.
Does nbvcxz do anything different than zxcvbn? Just curious on if your implementation does anything different/better. (Might be mentioned in the README, which I just skimmed, so apologies if so.)
Hey there. As said, the readme points out the major differences with zxcvbn.
I think the main difference you'd see, is the quality of matches and the scores generated are superior to zxcvbn. I had put a lot of work into fixing some of the algorithms, and adding more than the original to cover all the corner cases zxcvbn didn't support well.
For example, I support finding separators within a password with nbvcxz, so passphrase detection is much much better. I also added support for levenshtein distance calculations for dictionary matches. So common misspelling, or slight alterations to a word will be caught, even if they are not included in the dictionary.
Also, internationalization support is nice for those who need it.
That would make it possible to create a program that checks all the passwords in the keychain without sending data over the web. False positives wouldn't be an issue, as users can decide for themselves whether they want to update. Currently it's still a bit of a big filter to have on a mobile device - I wonder how far it could be slimmed down while keeping just 99% accuracy.
I just wonder how one could create such a program that the users will have some guarantees that it will not leak stuff.
> I just wonder how one could create such a program that the users will have some guarantees that it will not leak stuff.
You could do what Okta does with its PassProtect [1][2][3] browser extension and use the Have I Been Pwned API along with k-Anonymity [4][5][6].
PassProtect is a browser plugin that makes it easy for people to see in the moment whether or not their password was exposed in a breach. With a real time, as-you-type notification, PassProtect quickly alerts users of possible "riskier" passwords so they can take action immediately and without compromising privacy. By using k-anonymity, PassProtect ensures that your passwords are never seen, stored, or sent over the network during this checking process.
We’ve also made it easy for developers to add this functionality directly into their app or website. By also surfacing related information and breach details, PassProtect promotes security awareness for users while relieving developers of the burden of tracking breaches and maintaining a homegrown tool.
What is a bad password? It's a password that's easy to guess. Nobody is going to guess 500 million passwords. If they get the password hash, just assume they're going to crack it. To protect against this eventuality, I think forcing users to create a new password every 6 months might be a really powerful protection for consumers.
Rather than try to verify if a password is easy to guess, you could offer up strong password suggestions if the user can't create a good one in two tries using basic password heuristics.
Most online accounts are not critical to a user's life if they get hacked; the real danger is in reused passwords. If you generate random passwords for the users, there's no way they're going to reuse it. So that might be better for user security than just fighting to try to get a password that's impossible to guess/brute force.
So my suggestion is to just generate a super difficult password for the user and tell them to save it in their browser's password manager (make sure your site behaves well with browsers' password save feature), and if they forget it, hit that "e-mail me a password reset link" button. After that, implement non-SMS 2FA.
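Generating such a password is a one-liner with the standard library's `secrets` module (the length here is an arbitrary illustrative choice):

```python
import secrets

def generate_password(nbytes=18):
    # 18 random bytes -> 24 URL-safe characters, ~144 bits of entropy;
    # far beyond any offline brute-force, and impossible to reuse by habit.
    return secrets.token_urlsafe(nbytes)
```

The user never types it; the browser's password manager stores and fills it.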
To be fair, most of the issues resulting from forcing password recycling were that people made password systems, incrementing numbers, using the months of the year, etc.
I don't know of any, and I suspect there may not have been any, widely used consumer facing systems running around those lines.
I suspect most of the problems can be mitigated in a scenario where the user gets a new password, rather than getting to create a new password.
But yeah u2f 2fa is an absolute basic limit for serious services these days (pretty much anything else is subject to phishing).
Two factor only prevents someone logging into your current account; it doesn't prevent someone from reusing your password on a different system.
I'm talking about two completely different attacks. One is password reuse, one is not. 2FA does not protect against password reuse, because not every system in the world offers 2FA.
If you use the same password with a service with 2FA and one without, and the service with 2FA gets hacked, the attacker can access the service without 2FA with the stolen password.
2FA is a protection against password reuse only for the sites who use it.
> I think forcing users to create a new password every 6 months might be a really powerful protection for consumers
I _almost_ (not quite, but almost) would rather get my investment/bank accounts hacked than this. I use a password manager, and just counting the ones I used in the last couple of months add up to like 150+ passwords. Total, I have hundreds for my active accounts across various sites, services, devices, etc.
If I had to change them every 6 months, I'd be doing almost nothing else. Could be mitigated with some API integrations where every 6 months my password manager could just update the password on its own, but that will never be standard across everything.
Intercepting TLS on a device running rebranded CentOS 5 (11 year old distro that stopped getting security updates over a year ago) does not seem like a safe idea.
I’d almost suggest a form of passive threat analysis, on the basis that someone occupying a compromised hash is regarded as a weak link (don’t ban the hash, but notice it, and keep an eye on things surrounding it), but even that idea is bound to provoke abuse.
E.G. Right now, Google keeps looking at my browser string, and starts whining and making unwarranted demands when it sees something unexpected.
Oh noes! UR browser string seems goofy!
Time for mandatory 2 factor auth, buddy!
Fuck you, Google. Fuck you, right in your stupid fucking face.
I decide my security stance. I decide what’s dangerous. I decide which risks are acceptable.
That's effectively the same thing. 16 hex characters is 64 bits. Most users have a password with much less than 64 bits of entropy. So for most users truncating the hash to 64 bits provides no benefit.
What is an attacker going to do with the hash? The attacker will try to crack it, to get the original password. As long as the user's password had at most 64 bits of entropy, it's just as easy to crack a truncated sha1 hash as a full sha1 hash. Truncating it has not harmed the attacker at all. The attacker doesn't care if it's a full sha1 hash or a truncated sha1 hash. The attacker just cares about cracking it.
>Why do you think that most passwords are just in the first 64 bits of a SHA1 hash?
I don't understand what you're trying to say. Are you asking why I think most passwords have less than 64 bits of entropy?
>According to one study involving half a million users, the average password entropy was estimated at 40.54 bits.
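For scale, the entropy of a truly random password is just length × log2(alphabet size); a quick sketch:

```python
import math

def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy of a uniformly random password of the given length."""
    return length * math.log2(alphabet_size)

print(entropy_bits(8, 95))   # 8 random printable-ASCII chars: ~52.6 bits
print(entropy_bits(12, 95))  # ~78.8 bits -- already past 64
```

An average of ~40 bits, as in the study quoted above, sits well below the 64-bit line, which is why the truncation costs most users nothing.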
Great idea. Another tool to enable dickhead sysadmins to enforce even more capricious password rules.
Hint: the more rules enforced, the more predictable the passwords become. Lazy people will find ways to be just as lazy, in a smaller password space. More passwords start to look the same.
These aren't arbitrary rules, though. The haveibeenpwned list is made of passwords that had already leaked in real breaches, and its author criticizes the 'now add a number', 'now add a symbol', 'now add a capital' rules that result in 'mypassword1_A'.
There's only one rule that should be needed - a long (24+ char) random string. With a long enough password, mixed case, punctuation, etc becomes less relevant.
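Generating such a string is trivial with any CSPRNG; a sketch using Python's `secrets` module:

```python
import secrets
import string

def random_password(length: int = 24) -> str:
    """A uniformly random password over letters, digits, and punctuation."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(random_password())  # 24 chars over a 94-symbol alphabet: ~157 bits
```

At that entropy, case and symbol composition rules genuinely stop mattering — any breach list or brute force is hopeless against it.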
Nobody can remember a random string that long -- which might be okay, if you assume (which I don't think is unreasonable) that it is written down or remembered by a password app.
Everybody should be using a password manager. The times of it being reasonable to use garbage passwords because you want to remember them is long past.
What's the password for your local machine before you can launch the password manager?
What's the password for the password manager?
What do you do on a shared machine that doesn't have your password manager software installed?
What's the password for the place where you could download your database on a new machine?
Password requirements so complex that humans can't remember them are just slang for DIY 2FA, where the factors are a weak password and possession of your user's database. You might as well require actual 2FA at that point and relax the requirements.
I hate password restrictions. They are dumb and make people use more easily guessable passwords, because otherwise the user themselves is unable to remember. It's like, okay, I need uppercase, a symbol, a number, oh well, I'll just use my password, camelCased, with an underscore and my favorite number appended.
Especially for some bullshit service that I have no real interest in having an account for in the first place. Love it or hate it, at least OAuth with Google or Facebook or Microsoft cuts down on that nonsense a little bit.
Does a 1 in 1000 false positive rate mean that 1 in every 1000 passwords not on the list returns true? If so, that's not good, because good passwords vastly outnumber bad ones, so even a small false positive rate makes a positive very, very unlikely to be a true one.
For example: let's assume all passwords (including all of the ones on haveibeenpwned list) are exactly 8 chars long and composed of 50 characters.
This actually gives the filter a big head start, since in reality possible passwords outnumber listed ones far more crushingly: lengths vary and are unbounded, and there are 95 printable ASCII characters to use (alphanumerics alone are already 62).
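A minimal sketch of that calculation (assuming the ~501M-entry haveibeenpwned v2 list, and approximating false positives as fp_rate × all passwords, since the list is a vanishingly small fraction of them):

```python
# What fraction of filter "positives" are actual list hits?
bad = 501_636_842        # entries in the haveibeenpwned list (v2 count, assumed)
total = 50 ** 8          # toy universe: 8-char passwords over a 50-char alphabet
fp_rate = 1 / 1000       # the filter's false-positive rate

true_positives = bad                 # every listed password always matches
false_positives = fp_rate * total    # approximation, valid since bad << total
print(true_positives / (true_positives + false_positives))
# 0.012679079642336052
```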
The above script prints 0.012679079642336052, which means that (in this extremely forgiving scenario, where the space of possible passwords is tiny compared to real life) only just over one in a hundred positives is a true positive.
This apparently comes up with FP rates in cancer screening and AV scanning too (most people don't have cancer, most executables aren't viruses, etc.).
I might be wrong, feel free to point that out. I'm not a statistician or a doctor, I just had a statistics class as part of my CS degree. :)
Edit: I've found the English terms and explanations:
The right way to think about the base rate is the chance that a random user is going to try a bad password. You made the base assumption that the user tries a random 8-char password, which means the base rate of a bad password is 1 in 100,000. With a FP rate of 1 in 1,000 this means that 99/100 positives are false ones. In other words, if you were right, every 100,000 users would only trip the filter 100 times, and of those, 99 would be false positives.
But actually, probably more like 1 in 10 or 1 in 50 users use bad passwords. If we use 1 in 50, we get that 19/20 positives are true positives and only 1/20 are false positives. With 100,000 users we'd expect to trip the filter 2,000 times, only 100 of which are false positives.
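Both estimates fall out of the same one-line Bayes calculation; a sketch:

```python
def p_true_positive(base_rate: float, fp_rate: float) -> float:
    """P(password is actually on the list | the filter says positive).

    Assumes listed passwords always match (the filter has no false negatives).
    """
    return base_rate / (base_rate + (1 - base_rate) * fp_rate)

print(p_true_positive(1 / 100_000, 1 / 1000))  # ~0.01: 99/100 positives false
print(p_true_positive(1 / 50, 1 / 1000))       # ~0.95: 19/20 positives true
```

The false-positive rate hasn't changed between the two lines — only the prior, which is the whole point of the base-rate argument.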
Yep, in other words the prior probability of a user trying to use a 'bad' password will be a lot higher than badpasses/allpasses. The whole point of checking against a list of compromised passwords is because people reuse their passwords.
What you are missing is that the cost of a false negative is over 1000x the cost of a false positive.
The number of false positives doesn't matter, and getting a false positive matters almost not at all. 1 in 1000 good passwords gets rejected. Good passwords are nearly random, and cheap to generate. That means the average person needs to re-pick a password once every, what, 100 years?
One in a thousand means the user is very unlikely to run into this event.
"But it's much more likely than running into a true positive" you might say.
Actually not, since the user runs into a true positive exactly when they choose a very common password. It's not random. If they only choose bad passwords, they'll run into true positives every single time.
On the other hand, that there exist more false positives than true ones is actually great, since this means you can't recreate the true set of passwords given the Bloom filter.
That's true if you're sampling your passwords uniformly from all legal passwords, and that's not an accurate assumption.
That being said, false positives will hopefully outnumber true positives, since not many people should be using a listed password.
Instead of a Bloom filter, I would use a cuckoo filter. Since its error rate plateaus as it approaches full, you could get the same error rate with a significantly smaller filter.
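For scale, the standard Bloom sizing formula m = -n·ln(p)/(ln 2)² gives a sense of what either filter has to beat (assuming the ~501M-entry haveibeenpwned list):

```python
import math

def bloom_bits(n: int, p: float) -> int:
    """Optimal Bloom filter size in bits for n items at false-positive rate p."""
    return math.ceil(-n * math.log(p) / math.log(2) ** 2)

n, p = 501_636_842, 1 / 1000
print(bloom_bits(n, p) / 8 / 2**20)  # ~860 MiB for the full list at 1/1000 FP
```

A cuckoo filter's per-entry overhead at the same error rate is somewhat lower when kept near full occupancy, which is where the size saving comes from.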
Is it really a problem, though? Having a high false-positive-to-real-positive ratio isn't as bad for this application, since a user can just pick a slightly different password, with no significant negative outcome.