The point is that "In the jungle" etc. can actually be reliably remembered by a large portion of the population, whereas 16 independently chosen letters/numbers/symbols usually cannot.
Humans are great at remembering phrases, quotes, etc. Think about how widespread referential humor is, where the joke is just a reference to/quote from another work. That's something the brain is great at. Random or semi-random jumbles of letters? Not so much.
> Our calculations confirm that a relatively short series of truly randomly chosen English dictionary words is secure; many people find these somewhat more memorable. Above we used "In the jungle! The mighty Jungle, the lion sleeps tonight!" The important thing is to choose enough words and to choose them in a random un-guessable way, such as by changing the spacing, punctuation, spelling, or capitalization.
The problem with this example is that the 10 words are not chosen independently. Type "in the j" into a Google search box and the whole phrase will appear in the drop-down box. So the entropy in the choice of that phrase is only that of the 8 typed characters: about lg2(37^8), or roughly 42 bits, taking a 37-symbol alphabet (presumably 26 letters, 10 digits and the space).
So an approximation of the total entropy is:
Choice of source phrase = lg2(37^8) ~= 41.7 bits
Choose one of the 10 suggestions from the drop-down box = lg2(10) ~= 3.3 bits
Permutation of words = lg2(10! / 2! / 3!) ~= 18.2 bits
Spacing (assume each word may independently be preceded by a space with probability 0.5) = 10 bits
Punctuation (each word may be independently followed by '!') = 10 bits
Capitalization: independently choose one of {lowercase, camelcase, uppercase} for each word = lg2(3^10) ~= 15.8 bits
Total so far: about 99 bits.
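To make the sum easy to check, here is a minimal Python sketch that recomputes each term in the list above from its stated formula. The dictionary labels and variable names are my own, chosen for illustration; the formulas are taken directly from the lines above.

```python
import math

# Each entry mirrors one line of the estimate; labels are illustrative only.
terms = {
    "source phrase (8 chars, 37-symbol alphabet)": 8 * math.log2(37),
    "pick 1 of 10 drop-down suggestions": math.log2(10),
    # 10 words with "the" repeated 3 times and "jungle" repeated twice
    "word order": math.log2(
        math.factorial(10) // (math.factorial(2) * math.factorial(3))
    ),
    "spacing (1 bit per word)": 10.0,
    "punctuation (1 bit per word)": 10.0,
    "capitalization (3 styles per word)": 10 * math.log2(3),
}

for label, bits in terms.items():
    print(f"{label}: {bits:.1f} bits")

total_bits = sum(terms.values())
print(f"total: {total_bits:.1f} bits")
```

Summed without intermediate rounding, these terms come to roughly 99 bits.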
Now consider the third option: a mixture of 16 independently chosen letters, numbers and symbols. Assume most ASCII characters are available (let's eliminate single quote, backslash and $, which cause problems for some web apps) and we have
lg2(92^16) ~= 104.4 bits, which wins.
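The same figure can be reproduced with a short Python sketch. It assumes the alphabet starts from the 95 printable ASCII characters (space through '~') before dropping the three excluded ones, which is one way to arrive at 92; the comment does not spell out exactly which characters it counts. The names `EXCLUDED`, `ALPHABET` and `LENGTH` are illustrative.

```python
import math
import secrets

# Assumption: 95 printable ASCII characters (space through '~'),
# minus single quote, backslash and '$', leaves 92 symbols.
EXCLUDED = {"'", "\\", "$"}
ALPHABET = [chr(c) for c in range(0x20, 0x7F) if chr(c) not in EXCLUDED]

LENGTH = 16
random_bits = LENGTH * math.log2(len(ALPHABET))
print(f"{len(ALPHABET)} symbols, {LENGTH} chars: {random_bits:.1f} bits")

# Draw each character independently and uniformly using a CSPRNG,
# which is what the entropy figure above presumes.
password = "".join(secrets.choice(ALPHABET) for _ in range(LENGTH))
print(password)
```

The entropy figure only holds if every character really is drawn independently and uniformly, which is why the sketch uses `secrets.choice` rather than a hand-picked string.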