
An annoyance, yes it is. But I think it's a good feature. Not making mistakes is the most important thing for improving your touch-typing speed. If this annoying behavior makes you slow down and stop making mistakes, instead of hitting backspace without thinking about it, then it's well worth it. That backspace reflex comes very quickly anyway; it's not really something you need to train.


Base jumping!


You can of course do this trick in any base. If we choose e.g. base 2^n for the n-th row, we can use the following code to get the n-th row of Pascal's triangle:

    def pascal(n):
        # Each base-(2**n) digit of (base + 1)**n is a binomial coefficient,
        # since every entry of the row is smaller than base.
        base = max(2, 2**n)
        row = (base + 1)**n
        return [row // base**i % base for i in range(n + 1)]
Nice, but hopelessly inefficient. :) You can also calculate a binomial coefficient the same way without any looping construct (the exponential operator does the looping for you).
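For instance, a single coefficient can be read off the same way (a quick sketch, reusing the base = 2**n choice from above):

    def binom(n, k):
        # C(n, k) is the k-th base-(2**n) digit of (base + 1)**n, so the
        # exponentiation does all the looping for us.
        base = max(2, 2**n)
        return (base + 1)**n // base**k % base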


That's not a good article to make people read.

I find it strange that it doesn't mention lack of salting among the most common mistakes. Also, I didn't think SHA1 was broken in any way that makes breaking password hashes easier than e.g. the SHA2 family? I might be wrong, though.

PS: I'm not advocating using anything other than a good PBKDF for hashing passwords.

Edit: Re-reading the article, it seems like there's a lot of BS in there:

Example 1, regarding hashing something several times: "In order to retrieve the original password a hacker has to crack multiple hashes instead of only one." Nah, guessing is only more time-consuming.

Example 2, regarding the same thing: "The first reason is pretty easy to bust: simply add more hardware (or better hardware) and you're good to go." This applies for bcrypt as well.

And for his "attack" on "Hashing a password N times in the form of hash( hash(password) ) * N" you would need a working preimage attack for the hashing function used.

EditN: Rewrite


I think you're looking for something more complex than that post brings to the table. As a community, we're trying to circle our wagons around a simple piece of advice about code that stores passwords: do not write code that stores passwords. Even if your algorithm is secure, your code is likely not. Include your language's best-supported secure password library (meaning one of bcrypt, scrypt and PBKDF2) and ship it.

So that post may be incomplete regarding the technical details, but the critical information is there: Just use bcrypt. (...and use the recommended work factor.) I know hackers hate that sort of thing, but this is really one of those things we just have to drill.


His advice is good, but that's still no excuse for making invalid arguments for what he is advocating.

Edit: In fact, if I hadn't heard of bcrypt before and saw that article, I would probably not trust his advice either.


Bcrypt has a tuneable "cost function", so you get to decide how hard to make it. It's effectively designed to be slow and hard to do in parallel.

The SHA family, on the other hand, is designed to be fast (for checksums etc.), so it's possible that later SHA algorithms are actually worse than earlier ones for password hashing.

Modern computers can compute a lot of MD5/SHA1 hashes every second, so even with a salt, one round of SHA1 is likely not very good at all.

You can probably find a sufficiently large X and do SHA1 X times to make it slow enough today, but for future-proofing you are better off just using an algorithm that is actually designed for this purpose.
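For example, a minimal sketch of the purpose-built route using PBKDF2 from Python's standard library (the iteration count below is a placeholder, not a recommendation):

    import hashlib, os

    # PBKDF2-HMAC-SHA256: the iteration count is the tunable slowdown,
    # and the salt is generated per password.
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", b"correct horse", salt, 200000)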


I'm not saying that bcrypt isn't a better choice, it is, but some of the "flaws" he is pointing out in that article are just ridiculous. If he's going to argue for something, he should be using correct arguments.


    I find it strange that it doesn't mention lack of salting among the most common mistakes.
True. I believe bcrypt requires a salt, so you can't forget it. Bringing that up would have strengthened the case.

    In order to retrieve the original password a hacker has to crack multiple hashes instead of only one." Nah, guessing is only more time-consuming.
The article is using that as an incorrect argument for repeated hashing, and goes on to detail why it isn't necessarily more time-consuming because it increases the probability of finding a collision.

    This applies for bcrypt as well.
You can tune the work factor, so in a few years when computers are getting fast enough to brute-force your hashes, you only need to increase the work factor, not rewrite all your code. You can't do that with something like plain SHA. The article could probably be clearer on that.
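A rough sketch of what that looks like, assuming the Python bcrypt package: the cost factor is stored inside each hash, so old hashes keep verifying after you raise it for new ones.

    import bcrypt

    # The cost factor is encoded in the resulting hash string, so
    # checkpw() still verifies hashes created with an older, lower cost.
    old_hash = bcrypt.hashpw(b"hunter2", bcrypt.gensalt(rounds=10))
    new_hash = bcrypt.hashpw(b"hunter2", bcrypt.gensalt(rounds=12))
    assert bcrypt.checkpw(b"hunter2", old_hash)
    assert bcrypt.checkpw(b"hunter2", new_hash)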

    And for his "attack" on "Hashing a password N times in the form of hash( hash(password) ) * N" you would need a working preimage attack for the hashing function used.
I don't see why. If there is a probability of a collision existing for a hash, repeated hashing will increase that probability, turning something that has a low number of collisions into a high number of collisions. The more collisions, the easier it will be to find one.


You can increase the number of rounds of hashing as well, without rewriting your code.

I can't argue with your last point, simply because I don't understand it. How exactly does this "turn something that has a low number of collisions into a high number of collisions?"

In my mind, what we're doing is hashing "mypasswordmysalt" n times, and storing the resulting hash, the salt and n in a user table. If the user table is leaked, and n is 3 (for simplicity's sake, it would normally be _much_ higher), can you explain how this could be worse than doing one round of hashing?
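To make the setup concrete, here's a minimal sketch of that scheme (my own illustration; the function name and the choice of SHA-256 are assumptions, and a purpose-built KDF is still the better tool):

    import hashlib

    def stretch(password, salt, n):
        # Hash password + salt once, then re-hash the digest n - 1 more
        # times; store (digest, salt, n) in the user table.
        digest = hashlib.sha256((password + salt).encode()).digest()
        for _ in range(n - 1):
            digest = hashlib.sha256(digest).digest()
        return digest.hex()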


Every hash function has a probability of collisions. For illustration, let's imagine a really bad one where every output has another input that results in the same value.

With a single round of hashing, there are two possible inputs A1 and A2 that can produce the final output O. With a sufficiently large number of potential inputs, it will take you a while to brute force and enumerate all possible inputs before you hit on either A1 or A2.

With two rounds of hashing, there are two possible inputs A1 and A2 that can produce the final output O, and two possible inputs B1 and B2 that can produce the intermediate hash A1, and two possible inputs B3 and B4 that can produce the intermediate hash A2.

With three rounds of hashing you end up with something like this:

    C1  C2  C3  C4 C5  C6  C7  C8 
     \  /    \  /   \  /   \  /
      B1      B2     B3      B4
       \      /       \      /
        \    /         \    /
          A1             A2
           \             /
            ------O------
So with each round of hashing you are increasing the number of collisions, meaning you're likely to brute force an input that will hash to O much quicker.

[Edit] Of course, with each round you're also increasing the amount of time to compute O, but given most hashing algorithms are designed to be fast I'd say it's probably not enough to counter it. Not sure though, I've not actually looked at the maths.
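A toy way to see the effect (my own illustration): truncate a real hash to 16 bits so collisions are plentiful, then count how many inputs from a fixed domain reach the same final value O, recomputing O for each round count so we're always looking for a preimage under the same number of rounds.

    import hashlib

    def toy_hash(data):
        # Deliberately terrible 16-bit "hash" so collisions are common.
        return hashlib.sha256(data).digest()[:2]

    def iterate(data, rounds):
        for _ in range(rounds):
            data = toy_hash(data)
        return data

    domain = [str(i).encode() for i in range(200000)]
    for rounds in (1, 5, 25):
        target = iterate(b"mypasswordmysalt", rounds)
        hits = sum(iterate(d, rounds) == target for d in domain)
        print(rounds, hits)

The hit count tends to grow with the number of rounds, which is the funnelling the tree above describes; a full-width SHA function obviously doesn't collapse anywhere near this fast.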


[Replying to tryeng]

    No, there is an infinite number of inputs that produce the final output O.
Only with an infinite number of inputs. If we restrict our domain to things that are likely to be passwords, say strings under 1000 characters in length, then each extra round of hashing increases the number of inputs in our domain that can produce O.

    you have to find something that generates O after exactly n rounds of hashing
I was using an increasing number of iterations to demonstrate how each iteration potentially increases the number of collisions. Taking n=3, you don't need to know any of the intermediate states A1, B1, B2, B3 or B4 to take advantage of the fact that in our domain of strings under 1000 characters we only have to find one of 8 possible inputs rather than one of 2.

    I said will be true for all relevant hashing functions.
All relevant hashing functions have a probability of collisions in any useful input domain. Okay the tree won't be as dense as the one illustrated, but you're still increasing that probability by repeated hashing.

You need to find some way to mitigate the collisions, at which point you've basically got bcrypt.


[Replying to ajanuary]

Thank you, now I actually do see your point. I would still not think of it as a considerable weakness. Finding such a collision would take more time than brute-forcing any likely password.

There doesn't yet exist a single example of any SHA1 or SHA2 collision, and if we use SHA256 as an example, we could probably not find one in the next few years by brute force, even if we used all the world's current computing power and storage.

Edit: Actually, that whole argument falls to pieces, because if we can search through enough possibilities to find any collision, the output size of the hashing function is too small for the hashing function to be secure.


Indeed. This is where I agree that the article is a bit weak. It overstates the problem of repeated hashing and doesn't explain how bcrypt solves that problem at all. It makes it sound like a completely different and magical solution rather than repeated hashing with collision mitigation.

It's more a case of "hey, here's a potential problem you might not have thought of, here's an algorithm that addresses it."


Then I guess we might agree.

The only advantage I know of with bcrypt over multiple rounds of SHA2 is that GPUs are very bad at it compared to most hashing functions, so the CPU cost (on my server) and the GPU cost (the crackers' cost) are not too different. (Anyone, please correct me if I'm wrong.)

Off-topic: This exponential reply delay is really annoying.


you guys can click on the [link] right above the post and you'll be able to reply without a delay.


"With a single round of hashing, there are two possible inputs A1 and A2 that can produce the final output O."

No, there is an infinite number of inputs that produce the final output O. And you have to find something that produces O after exactly n rounds of hashing; it doesn't help to find something that produces O after one or two rounds.

Edit: Sorry, didn't see your assumption when I first posted, but I guess what I said will be true for all relevant hashing functions.


Hi, said author here. The particular article was written around 2 years ago (April 2011 to be exact) when crypto was still a fairly new concept to me. As a result there indeed are some flaws with the article. Having said that, some extra explanation can't hurt.

> Example 1, regarding hashing something several times: "In order to retrieve the original password a hacker has to crack multiple hashes instead of only one." Nah, guessing is only more time-consuming.

I'm not entirely sure what you're trying to say with this example. The particular list item was meant as one of the examples of why I think people do it that way. It's not too uncommon that I read some article about a developer doing that because it is supposedly more secure.

> Example 2, regarding the same thing: "The first reason is pretty easy to bust: simply add more hardware (or better hardware) and you're good to go." This applies for bcrypt as well.

Bcrypt introduces a weight/cost (whatever you'd call it) factor whose sole purpose is to prevent this. The higher the factor, the longer the process takes. The nice bit about it is that with a weight of N the hashing process always takes the same amount of time (due to some bcrypt voodoo that is beyond my knowledge). You also can't change the weight, since that would result in a different hash being produced (it would be fairly useless otherwise).

Having said that, I agree that the article could've been written in a better way but it will remain as is, simply because I try not to edit articles after I've published them.


Hi, I see that I misread "In order to retrieve the original password a hacker has to crack multiple hashes instead of only one." as your argument rather than as an example of a false argument. I stand corrected.

Regarding the cost, bcrypt only increases the number of iterations (exponentially) with increased cost. The operation will take the same amount of time on one specific CPU, but will go faster on another. However, because of its higher memory usage than the SHA variants, GPU implementations of bcrypt don't gain as much over CPU implementations.
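A quick way to see the exponential growth, again assuming the Python bcrypt package (the absolute numbers will of course differ from CPU to CPU, which is the point being made):

    import time
    import bcrypt

    # Each +1 to the cost factor doubles the number of internal rounds,
    # so the time per hash should roughly double as well.
    for cost in (10, 11, 12):
        start = time.perf_counter()
        bcrypt.hashpw(b"hunter2", bcrypt.gensalt(rounds=cost))
        print(cost, round(time.perf_counter() - start, 3), "seconds")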

We still agree on your advice to use bcrypt, though.


Compression itself in a cryptographic protocol is not the issue here. The problem starts when you let an attacker add chosen plain-text before or after the secret in the same compressed and encrypted stream.

Compression before encryption is not a problem if the sender is the only one who decides what is in the message to be sent. Compression doesn't make it vulnerable to chosen plain-text attacks either. Mixing the victim's and the attacker's data before compression and encryption will leak data, yes.
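A toy illustration of that last point (my own example, using zlib directly rather than TLS): when attacker-chosen data is compressed together with the secret, the compressed length reveals how much of the secret the attacker guessed correctly, and encryption preserves that length.

    import zlib

    secret = b"sessionid=kHq7mZp2vX9sTf4bQn8rLw"

    def observed_len(attacker_data):
        # Attacker data and the secret end up in the same compressed
        # stream; the ciphertext length tracks the compressed length.
        return len(zlib.compress(attacker_data + b"\n" + secret))

    print(observed_len(b"sessionid=kHq7mZp2vX9sTf4bQn8rLw"))  # correct guess: shorter
    print(observed_len(b"sessionid=Zr5cYw3jDk9mPv6tHq2sNb"))  # wrong guess: longer

In a real CRIME-style attack the guess is refined one character at a time, but the leak is the same length difference.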


This isn't true in theory or in practice.

In theory, protocols which fall to attacks when attackers have control of some of the message are said to be vulnerable to "chosen plaintext attacks" (if the attacker only gets 1 shot per message) or "adaptive chosen plaintext attacks" (if the attacker gets many bites at the same apple). Sound protocols don't have feasible adaptive chosen plaintext attacks.

In practice, most protocols can be coerced into carrying some data controlled by attackers. Sneaking some attacker-controlled data into a message is a very low bar for an attacker to clear.

It's true that attacker-controlled Javascript code makes it distinctly easy for an attacker to spirit their data into the plaintext, but don't let that confuse you. For the HTTPS/TLS cryptosystem to be sound, attackers can't use this property to decrypt the content they didn't add to the message.


I don't believe it is a poor justification. It is a name chosen for this particular set of numbers. In fact, every name we've put on things in mathematics is "by definition". Real numbers, pi, odd numbers, perfect numbers. Someone found that particular definition useful, and others continued to use it.

In the case of prime numbers I believe you cited one of the best reasons for leaving 1 out of the set. If someone needs a name for "the number one and all prime numbers", they can define it.



Nope. Security through niceness and ethics. If everyone was in on it we would have a really great society.


I don't think anyone would contest that good behavior would be good for society. But, it's not a practical expectation, because the probability of everyone exhibiting good behavior is vanishingly small.


That is exactly security through obscurity. If you're relying on people being nice enough to not exploit you (no matter how difficult it is), you have no security at all.

Let's say everyone on HN was nice enough to not use exploits. Might be possible. But then one person does a drive-by exploit, and BAM. Everyone but one person is nice enough to not exploit people.

Just because you wish people were nice doesn't make them nice.


No, it's just lack of security. There's no obscurity involved at all.

(And, if there was any doubt, we should of course not count on people being nice on the internet.)


I don't think your PCI guy is overly strict. It's pretty clear that the intentions of the requirements are what you described. What might have worked, though, is to have virtual machines inside the EC2 instances in your VPC, and use this to filter traffic through a separate virtual machine.

Still, it's unnecessarily complicated and, as you say, a resolved issue now. :) The new features announced fit PCI needs quite well. I haven't looked into the IDS issue you mentioned in your first post yet, but I hope it's possible to resolve somehow or get around with compensating controls.

(Disclaimer: I'm no PCI DSS expert, just an unlucky engineer trying to make a compliant system.)


JPEG artifacts on the logo, the arrow images to the left of "Cloud", "Open", "Social" and "Secure" are not anti-aliased, the arrow in the search box is below the center of the box, and lots of other minor issues make it look really awful.


Kudos for having an eye for detail, and I agree on the anti-aliasing part. Much appreciated. But aesthetic details are just a subset of professionalism, it's not professionalism all by itself. The site has different sections for intro, pricing, FAQ and blog, and moreover they have a well-done short video explaining what Database.com is about. These details are professional enough for a user to stick around.


Are they? Like the OP of this thread, my immediate impression was that I was at the wrong site. It just feels like a domain squatting site, really (especially the "database.com" in the upper left. Here in the 20xxs it just isn't that impressive that you got a dictionary word domain, so stop being so impressed with it).


Seriously? Which domain squatting site has so much information and a video explaining its service? Just because it's 2010 doesn't mean that no one can have a dictionary word domain. They probably had a lot of money to spend, so they got it.

I am not hugely impressed by the artwork of the site, but it is definitely not the worst site that I have seen.


Many spammers know how to embed a YouTube video (it uses the YouTube player...), and I'm not going to click through to the other information if I think I'm on a cybersquatter page. They presumably spent at least a mid-six-figure sum* on the domain back in July, which makes the amateurish landing page even more embarrassing.

It's not the worst site I've seen, but I've seen better landing pages designed as weekend projects by people who then post on here with all the appropriate "I'm not a designer...can you give me some help please" questions and get advised to commission a freelancer who knows a bit about colour schemes as well as having the time and skill to add polish. For a major launch from a massive corporation whose products are (theoretically) driven by good UI and a consistent corporate image (http://www.salesforce.com/assets/pdf/misc/SFDC_StyleGuide120...), it's quite shameful.

*reserve price if you wanted to buy at auction earlier in the year was $800,000

