I'm curious to see more about the distribution of questions and answers people had, and how the HN population may differ from the NYT's. There will certainly be self selection bias here, but if you're willing to share how you did with others, please enter it here:
https://docs.google.com/forms/d/17e5BIL0lH8OHsGj89Zdtdl8GeCV...
People familiar with unit testing and test driven development will feel at home with this kind of puzzle. That doesn't mean that they will be less biased in social/political decisions, it just means that this test will fail to prove a point.
Seriously, though, it seems a bit of a leap from the existence of confirmation bias to explaining away the public outpourings of US politicians about their financial crises and foreign policy disasters. In the absence of better data as to just why the given statements were made, ascribing this to confirmation bias seems itself open to accusations of confirmation bias! :)
I mean, it also illustrates how training and systematic reasoning can improve these things. Whether explicitly or implicitly, I picked up certain skills and procedures for problem solving (from programming and math contests) that I now use by default. Trying a bunch of examples, coming up with a hypothesis, trying to disprove it, testing edge cases…
This doesn't mean I always use these—at the very least, I have to explicitly jump into "problem solving" mode—but it means they can be useful.
It's still a meaningful difference, and could very well apply to lots of things beyond this kind of puzzle.
That's exactly what I was thinking. I (sometimes) follow TDD, and I applied it to this problem. I made sure to include negatives, 0, positives, and include primes here or there to help avoid issues with multiplication/exponentiation. After a few of these, I felt pretty confident that the rule was simple.
I tried floating point numbers. Also, at 28 decimal places, the test breaks; it's not arbitrary precision. So, technically, the answer isn't simply "any ascending sequence of numbers".
I actually avoided going that far to avoid getting bad data. I was trying to answer the question, "What does the experimenter THINK his rule is?" rather than what the computer will do. Since the computer can't be infinite, it will inevitably fail with overflow, underflow, and such.
I was relieved, in fact, when it worked with negatives and floats in a "safe" range.
I also tested with 1,1,2 and 1,2,2 to make sure that the required increase applied to ALL of the values, not just a specific pair.
I too tried to test whether only one pair was significant. However, I grew impatient and didn't try to come up with more tests once I thought I had a sufficient answer to explain my most vexing observation (negative, positive, positive, out of the combinations involving negative numbers).
The advice to brainstorm ways of proving that a statement is in fact wrong, and to exhaust them, is such an eloquent way of wording the hunt for a negative.
The slightly-shorter 0.60000000000000000, 0.60000000000000001, 0.60000000000000002 will break it too, for what it's worth. If you punch those in to a Javascript console, you can see that the FP representation of all three is 0.6.
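A quick console check bears this out; nothing NYT-specific here, just how doubles round:

```javascript
// Adjacent doubles near 0.6 are about 1.1e-16 apart, so all three
// literals round to the very same IEEE-754 value.
const a = Number("0.60000000000000000");
const b = Number("0.60000000000000001");
const c = Number("0.60000000000000002");
console.log(a === b && b === c); // true
console.log(a);                  // 0.6
```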
Yes I did, though not to 28 places as tptacek had done. With 2 to 4 place floats, including negatives, the test worked as expected.
Like others here have said, it wasn't a particularly hard "rule" to figure out. Easy to immediately rule out a geometric relationship such as 2^y, which didn't leave a whole lot of possibilities to test. For the commenters here, I'd attribute the ease of finding the solution to familiarity with the kinds of problems that programming presents.
Which leads to the idea that there's value in learning even the rudiments of programming. Logically, it should encourage better problem-solving skills in general. We might think this has important implications for our educational systems. But I know, that's probably not realistic at all.
I know the question isn't directed at me, but I thought you might want to know. I couldn't try negatives or floating point on the iPhone. The keypad didn't have the option.
Based on the context of the question and the UI of the testing interface, fractions seem unlikely to be an intended part of the question. I likewise wouldn't bother testing unicode U+216x roman numerals.
It refused to accept both fractions and imaginary numbers. I did test negatives and zeros since that was really the only remaining set I could think of.
Most importantly, I used about 6 tests (3 right, 3 wrong) to come up with the answer and then did another 17 looking for the trick. After all, it couldn't just be that simple, right?
The presentation of the problem -- and I know that trusting the problem state is unwise sometimes in cognitive-bias tests, since many such tests are actually designed to be "we said we were asking X but actually meant Y" -- indicated a simple rule, rather than one which would behave differently on different classes of numbers.
So after the tests listed above I felt confident enough to guess.
To talk about a certain fraction of real numbers you have to have a distribution over them. In general we take the uniform distribution if no distribution is explicitly given. That doesn't work for real numbers (it doesn't even work for natural numbers). (See https://math.stackexchange.com/questions/14777/why-isnt-ther...)
If there's no implicit default distribution, we have to pick one. I can pick one where they cover an arbitrarily high percentage of the real numbers.
Down the rabbit hole of pedantry: we don't need a distribution, just a measure, if we want to talk about how many reals it accepts. The Lebesgue measure is implied on the Reals if none is given, and the computable reals have measure zero.
We can't reasonably talk about a percent coverage, since the Lebesgue measure of the reals is infinite, but as a non-technical description, 'zero percent' is morally equivalent to saying it only covers a measure-zero set.
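To spell out the standard argument (nothing specific to this thread): the computable reals are countable, since each one is picked out by some finite program, and any countable set has Lebesgue measure zero:

```latex
% Enumerate the computable reals as x_1, x_2, \dots and cover x_n
% by an open interval of width \varepsilon\, 2^{-n}:
\mu\Big(\bigcup_{n \ge 1} \{x_n\}\Big)
  \;\le\; \sum_{n \ge 1} \varepsilon\, 2^{-n}
  \;=\; \varepsilon
\quad \text{for every } \varepsilon > 0,
\quad \text{hence } \mu\Big(\bigcup_{n \ge 1} \{x_n\}\Big) = 0 .
```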
Exactly. I'm just as 'No'-averse as the next person, but the 'No' I'm averse to is the one where you make wrong assumptions and it comes back to haunt you afterwards.
I was just writing this out! I believe practice in unit testing is what let me get this question right. I almost submitted n^1, n^2, n^3 before I realized I didn't get any wrong answers and something didn't feel right. I credit this 'instinct' to having written thousands of tests, and trying to make them fail to make sure my tests weren't lying to me.
It's funny. Even though I got the right answer, looking back, I still see many holes in my tests. I only tested positive and negative integers. I didn't even think to try fractions, decimals, hex values, text, etc. and I was already expecting a confirmation bias test.
It's probably not "fair" to say I got it in zero... but I did. :)
Now that the NYT has done it, this puzzle has probably attained enough popularity that you really ought to change it up a bit if you're going to run it yourself. Granted, the space of hypotheses as simple as "increasing/decreasing" is pretty small, but your ability to fool people with the first sample run is almost unbounded, so that helps.
Similar story here. I got it in zero because this problem shows up at the early part of HPMOR.
I suspect that the basic idea behind it is about right (people who insist on failures before committing to a theory will probably do "better"). But it seems to me that this test will be best at selecting people who've seen it before and can pretend they didn't (or even remember to ask negative questions when someone asks you to guess three numbers to get the job).
Yeah, I'd like to see other sorts of questions that you can only get right by looking for disconfirming evidence, but that don't have anything to do with choosing a sequence of numbers.
There's a selection bias - those of us who got it right are more likely to fill that out (:
As a result we can't really rely on overall accuracy, but we can break it out by yes/no to account for the selection bias and get a profile of how HN's correct and incorrect respondents differ.
My answer was: "The sequence is of an increasing real variable, where each subsequent value is greater than the preceding value. It's monotonically varying."
When I clicked "I think I know it", nothing happened. I don't want to click their "I don't want to play; just tell me the answer". But it seems like the right answer. I can't answer your form question about whether it is the right answer, since I haven't clicked on their link and don't know for a fact whether it is or not.
Although I used the wrong term, it's strictly increasing.
Your last question is "have you seen a test about confirmation bias before?" But the text of this test says that it's about "why no-one likes to be wrong", which means it's pretty obviously about confirmation bias (and therefore that the test-taker should be wary about just confirming their first intuition).
Interesting that the split of correct/wrong answers from the HN crowd is 78%/22%, the exact opposite of the general population's 22%/78%! The HN community does think different :)
In the sense that they guessed the rule correctly. The correct answer was the dominant hypothesis in my mind before I tried any sequences, and others here report the same.
It just means that this test was described at least several times in different "computer" media over the last few years, and almost everyone has read about it.
Same thing with all "logic" puzzles.
3 9 27 yes (is it exponential series?)
4 16 64 yes (is it only odd numbers?)
5 7 9 yes (is it any numbers of the same parity?)
6 7 8 yes (is it any set of increasing numbers?)
6 7 6 no (just to confirm that it's x<y<z, and not something like x<=y<=z)
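For what it's worth, the probes above boil down to a handful of assertions against the hidden rule; `isAscending` here is just my own name for the x<y<z check, not anything from the NYT's code:

```javascript
// A minimal sketch of the hidden rule (a strictly ascending triple)
// and the probe sequences listed above; the comments mirror the
// game's yes/no responses.
const isAscending = (x, y, z) => x < y && y < z;

console.log(isAscending(3, 9, 27));  // true  ("yes")
console.log(isAscending(4, 16, 64)); // true  ("yes")
console.log(isAscending(5, 7, 9));   // true  ("yes")
console.log(isAscending(6, 7, 8));   // true  ("yes")
console.log(isAscending(6, 7, 6));   // false ("no": rules out x <= y <= z)
```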
It would be relevant to include a histogram of #yes_answers - #no_answers in the summary, to test whether people are biased towards positive rather than negative tests. I think the raw data suggests that they are, although I totally failed to create a histogram in Google Spreadsheets within 5 minutes.
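If the spreadsheet route fails, the histogram is a few lines of code. Everything here is a hypothetical sketch: the `yes`/`no` field names and the sample data are my own assumptions, not the form's actual schema.

```javascript
// Hypothetical sketch: given each respondent's counts of yes-tests and
// no-tests, bucket the difference (#yes - #no) into a histogram.
function yesNoHistogram(responses) {
  const hist = {};
  for (const { yes, no } of responses) {
    const diff = yes - no;
    hist[diff] = (hist[diff] || 0) + 1;
  }
  return hist;
}

// Made-up sample data just to show the shape of the output.
const sample = [
  { yes: 3, no: 3 },
  { yes: 6, no: 1 },
  { yes: 5, no: 0 },
  { yes: 5, no: 0 },
];
console.log(yesNoHistogram(sample)); // { '0': 1, '5': 3 }
```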
Edit: I think this is the code that actually reads the numbers the user enters, see [0]
function l(){
var a=h.exec(m[1]),f=null,g=null,n=null;
return a&&(null!==a[1]&&a[1]&&(f=parseInt(a[1],10)),
null!==a[2]&&a[2]&&(g=parseInt(a[2],10)),
null!==a[3]&&a[3]&&(n=parseInt(a[3],10))),
new e(f,g,n)
}
Edit(2): Actually, I'm not so sure that's the correct code at all. The NYT game is capable of parsing floats correctly (e.g. it accepts 1.1, 1.2, 1.3 as a "Yes"), so it's not just using parseInt.
var rightWrong = (inputData[0] < inputData[1]) & (inputData[1] < inputData[2]) ? right : wrong;
With a variable declaration on line 545 being
var inputData = [NaN, NaN, NaN],
revealed = false,
right = "<p class = 'g-answer g-yes'>Yes!</p>",
wrong = "<p class = 'g-answer g-no'>No.</p>";
And `inputData` is changed on text input on line 662
$("#g-input input").each(function(i) {
var val = $(this).val();
inputData[i] = $.isNumeric(val) ? Number(val) : NaN;
});
It uses the `Number()` function to convert the input text to an actual number, so it can accept any number format defined by ES5[1] or ES6[2]. So in ES6 you can use binary (0b, 0B) and octal (0o, 0O) notation along with exponential (1e-2) and hex (0x, 0X). Binary and octal work for me currently on Chrome 43 on OS X.
* The number may have an optional sign and digits after a decimal point, and may use exponential notation. Example: (-1.2e1, .0E+0, 1.e-3) => "Yes". As seen in the second and third numbers here, there may be no digits before or after the decimal point, but not both missing at the same time (i.e., ".0" and "0." parse but not ".").
* If the number begins with "0x" or "0X" it is read in hexadecimal, where the digits a-f may be in either case. Hexadecimal notation must not be accompanied by decimal point, sign, or exponential notation.
* No whitespace is permitted within the numeral, even between the sign and the digits as in "+ 11", but both tabs and spaces may be used before and after the numeral without changing its value. In particular, by using an input of the form "1 " it is possible to make the rectangular display empty while still parsing it as a number. Note that pressing "Check" leads to the numbers being displayed in the rectangle exactly as they were displayed in the text box, which may depend on the position of the cursor in the text box.
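The bullet points above are easy to reproduce in a console with `Number()` itself:

```javascript
// Reproducing the parsing behaviors described above with Number().
console.log(Number("-1.2e1")); // -12
console.log(Number(".0E+0"));  // 0
console.log(Number("1.e-3"));  // 0.001
console.log(Number("."));      // NaN: digits must appear on at least one side
console.log(Number("0x1F"));   // 31: hex, either letter case
console.log(Number("0b101"));  // 5: ES6 binary notation
console.log(Number(" 1 "));    // 1: surrounding whitespace is ignored
console.log(Number("+ 11"));   // NaN: no whitespace inside the numeral
```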
ETA: Also, you mentioned rounding, but there is also exponent overflow and underflow. The application refuses to parse numbers greater or equal to 1.7976932e308. It parses arbitrary negative exponents fine, but it does not recognize that 1e-324 is greater than 0.
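The overflow/underflow edges can likewise be checked directly. This is plain double-precision behavior, not anything specific to the NYT's parsing (the app's refusal to accept huge values presumably comes from its own checks on top of this):

```javascript
// Double precision saturates at both ends: values past Number.MAX_VALUE
// become Infinity, and values below half the smallest subnormal become 0.
console.log(Number("1.8e308"));    // Infinity (MAX_VALUE ~ 1.7976931e308)
console.log(Number("1e-324"));     // 0: underflows, so it is NOT > 0
console.log(Number("5e-324") > 0); // true: the smallest subnormal survives
```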
A test engineer walks into a bar. He orders a beer. He orders two beers. He orders 999999999 beers. He orders 1.00001 beers. He orders -42 beers. He orders 1048576 beers...
With your example it's easy to lose the distinction between "would eventually terminate if you had a fast computer and a lot of time" and "never terminates even in theory." Here, it definitely looks like it should always run two iterations no matter what the numbers are (as long as they're finite), but it doesn't.
Zendo is fantastic. Watching people unaccustomed to its sort of problem-solving struggle with and improve their methodologies for being good at the game is really enlightening.
I've given the test to various people since then and never once came across someone who'd guess it right away, or within a short period of time. The breaking time came usually after a few minutes when they gave up and started throwing out random numbers that coincidentally did not meet the rule. Once you hit the first "No", it took a very short time to figure out the rule for almost everyone.
More or less related: when I'm looking for constructive criticism from someone, I'll ask them "what do you dislike about this?" or "what's wrong with this?" instead of "what do you think?"
I tend to get much more interesting and useful feedback this way.
It's surprising how far you can go with such heuristics. I run an IRC bot that uses matches like this (I'm working on a proper solution right now though) to parse natural language queries, and I managed to trick a few people into thinking they were talking with a human. As long as it's OK for 90% of the most common cases, people often won't notice.
I've made a robot that screams[0]. That is, it just outputs a random string of "AAAAaaa" when its name is mentioned or when somebody else screams (four or more A's).
What is surprising is how basically 15 lines of Python implementing these rules evokes a very real emotional response in a lot of people :-)
It only means non-decreasing in the context of a monotonic function, where the definition, I believe, amounts to the function never decreasing (for a differentiable function: its derivative is never < 0).
The comments here suggest people are missing the full significance of this problem. It's not just a cute number puzzle - it demonstrates a profound human weakness that has a deep impact in everything we do.
1. People that think having a gun in the house makes it safer will not try to design an experiment designed to demonstrate the opposite.
2. People who think organic food is better for you than regular food will not try to look for evidence that the two types of foods are equally healthy.
3. An Israeli who believes the area where he lives was uninhabited before 1948 is not going to think about what kind of evidence would contradict that belief.
I'm not saying the views above are incorrect. It's just that we are all guilty of falling in love with our beliefs when they should be mere acquaintances. Hence the quote, "People don't change their minds. They die, and are replaced by people with different opinions." [1]
Funnily enough, I notice confirmation bias quite a bit in a D&D game I am currently DM of. I'm playing with a group of friends who are big into video games, and as a result they consistently seek resolutions to conflicts in D&D by way of what they know from shooters: kill everything in sight. Yes, it's at times a valid answer, but it's not the only one and it's certainly not the most interesting one. The best way that I've seen the confirmation bias dissipate from their thinking is to put them into situations where their bias just doesn't help at all.
Maybe some positive/negative feedback built into the campaign could help; e.g. for each act of benevolence/violence, add/subtract a 'karma' point from some running total, and alter the gameplay as needed.
Playing the confirmation bias game on "playthroughs of the confirmation bias game": it looks like you get that message if you have a 'no' answer, and the word 'increasing' in your guess. ("Increasing by the same amount" is still accepted.) I wouldn't be surprised if other words also count.
Edit: the words '<', '>', 'increas', 'big' and 'larger' seem to count. Looks like they're accepted as substrings, not just words - so 'increase' and 'increasing' are accepted. I could look for a long time, so I'm giving up now.
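A guess-classifier consistent with those observations would be simple substring matching. To be clear, the marker list here is reverse-engineered guesswork from the experiments above, not the NYT's actual source:

```javascript
// Sketch: a guess "counts" if it contains any of the observed marker
// substrings ('increase' and 'increasing' both match 'increas').
const markers = ["<", ">", "increas", "big", "larger"];
const mentionsRule = (guess) =>
  markers.some((m) => guess.toLowerCase().includes(m));

console.log(mentionsRule("Each number is increasing")); // true
console.log(mentionsRule("x < y < z"));                 // true
console.log(mentionsRule("doubles each time"));         // false
```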
Neat - as others have pointed out, I feel that being familiar with unit testing would help in this situation. Having negative test cases is just as important, if not moreso, than having "happy path" tests.
This phenomenon has had a profound effect on the history and philosophy of science. There have been entire schools of thought based on verification of hypotheses, and entire movements based on refuting those schools. The most effective strategy in this puzzle(and the one that is unintuitive for many) is to systematically generate alternative hypotheses and falsify them. Karl Popper claimed that this method is actually at the core of how we gain scientific knowledge, and his brand of philosophy of science is the most popular and arguably the most successful today.
I think that the "quick" adjective in the title is purposely misleading. You are supposed to learn the most general rule quickly, but that is not so easy because there are many possible rules that could fit such a pattern. It seems that you should be rewarded for solving the puzzle quickly, and then you fall into the trap.
I propose to change the title to "A puzzle to test your Generalization Abilities", and to state clearly that you should try to find the most general rule that satisfies all the patterns you can think of. In that case, I would expect the conclusion and results of the experiment to be completely different. So to summarize: the "quick" adjective in the title has a very strong anchoring effect.
Edit: changed for grammar and to express more clearly what I think.
Maybe it should be "a puzzle many people already know the answer to" in which case the conclusion and results are already obviously biased.
I was able to solve the puzzle without testing any numbers at all. Which really skews the relevance of "only nine percent of people saw three 'no's before answering."
I "got lucky" in the sense that my experiences have primed me to recognize that particular kind of question. If the article had actually modified the question framing at all, rather than just copying the existing one, it maybe wouldn't have worked that way.
It's the same reaction I have to the Monty Hall problem: I don't have to think or be clever to get the right answer. So naturally you couldn't effectively teach me anything by simply posing the question, and any information you collect by doing so won't accurately reflect what I learn or how well I think. You'd just be testing topical familiarity.
The question could have read the exact same way, with a different final rule as the answer. Without testing at all, you wouldn't/couldn't know that... and so you didn't solve it... you just guessed.
Cool. The funny thing is I inserted a constraint of my own invention without even realizing it: "Use the smallest number of examples possible." Of course, this meant failing miserably, and it was nowhere in the problem statement.
Perhaps that's an additional factor - not exactly confirmation bias, but not unrelated.
Wonder how many people here immediately knew what the rule was?
This is the standard example of the right way to test a hypothesis/theory, and of the power of confirmation bias: testing sequences that are invalid under the theory instead of testing what you think is correct.
While the HN crowd mostly gets this right when framed as a math puzzle, my guess is that confirmation bias is alive and well in high tech just like in any other field. One example:
Young 20s entrepreneur vs. early 50s entrepreneur. Without knowing anything about either person, which startup is more likely to succeed? Even if you have the business plans for both, and you meet both - which one are you going to be more skeptical about as you evaluate which one of them gets funding?
I've heard intelligence failures invoked to explain the Iraq war; now this author says it was confirmation bias that caused a completely erroneous justification for the invasion of Iraq. I think the author himself has confirmation bias in too readily using the term to explain government and corporate policy choices that have been based on false justifications.
Math person here. I'm curious to know if anyone used decimal numbers in their tests and if negative numbers were used. The rule is increasing real numbers and one can guess that the rule is increasing numbers without realizing this includes all real numbers and not just integers.
In addition to getting it right did you use an exhaustive set of tests?
I tried a negative series and a decimal series, just in case the rule was increasing natural numbers. I tried very large numbers to see if there was a limit to the rule, and a series that had large contrast in between each element. For fun I tried to see if the app would recognize "pi", "i", or "e", but, perhaps unsurprisingly, it did not.
I would have checked complex numbers for fun but there is no reasonable ordering of the complex numbers in the way there is for real numbers. It's too bad it didn't recognize pi or e.
I tested with negative integers, but didn't think to test with real numbers unfortunately. (Still got it right, but I should've tested that.)
That said, I did make sure to test all the edge cases I could think of. I was actually going to guess (n^1, n^2, n^3) at first, until it failed for (1, 1, 1).
I wonder if somebody tested whether the rule was deterministic by entering the same numbers over and over. But I think the rule a<b<c is too simple to invite creative tests.
But the problem statement doesn't imply the rule involves ordering.
Likely, it doesn't accept complex numbers (or vectors or matrices) as input as part of the (implicit) spec: in common parlance, "number" tends to mean real number.
The tests with real numbers indicate that the rule is one based on order. So the problem doesn't state this but it is easily deducible when working with real numbers. Hence no need to check complex numbers. In common parlance in mathematical circles "number" tends not to mean real number unless context makes this obvious.
Does laziness have anything to do with the responses? You get a rule that seems to work and so you seek the reward early. It takes effort to prove yourself wrong.
I was trapped by this and guessed it was exponential series n^1,n^2 etc for n starting at greater than 2. While technically true this was not the rule they had in mind.
As in, every sequence in that set is a subset of the larger set of x < y < z. Poor language choice; it's not true, yes, I was wrong. I am just curious as to how much laziness, and not necessarily confirmation bias, has to do with the result. If getting it wrong had some kind of penalty, or getting it right had some kind of reward (money etc.), how much better would people do then?
People try to make the fit as tight as possible to the sample data -- the explanation is that simple. I don't buy the explanation provided in the article.
Additionally, this setting is probably too close to usual situations you get in school where there is little to no interaction and negative answers from the teacher are seen as failures by students. (Speaking about education in my country only.)
I got the part about increasing eventually, but I thought the third number also had to be the sum of the other two. I came up with the sum idea after trying (3 6 9), so only 2 tests. The idea that they had to be increasing came later. I don't have it open but I'm pretty sure one of my tests was (1 2 5) which should have tipped me off... in conclusion yes, I'm probably dumb.
I know you're joking, but I think this is important:
Failing the test does NOT mean a person is dumb. The point of the article is that confirmation bias seems to be a fundamental default in the way everyone thinks. Certain people with specialized training in inductive problem solving (scientists etc.) have learned to compensate.
If folks think it's an issue of intelligence, then they might be willing to think "but not me, because I'm smart." (After all, many programmers believe that they are smarter than the average bear). But while programmers are well-trained to think carefully about sequences of numbers, they might be as susceptible as anyone else to confirmation bias in other areas.
I wonder how much games like Twenty Questions play into conditioning towards this kind of approach, since the implication there tends to be that the fewer questions you ask, the better you've done.
Like the other comment said, this is not about being dumb. It's about human psychology: you'd rather not be wrong, so you jump at the possibility of being right, and "forget" to take the logical step of actively trying to disprove your rule (1 2 5 can be disregarded because you weren't trying to disprove the rule).
The "Check" buttons weren't enough of a hint for me and I jumped into "This is a numerical reasoning problem" mode. I'd argue that this kind of situational bias is as much a factor here as confirmation bias.
I understand the power of confirmation bias. I believe it to be natural for anyone. It's perfectly normal to seek an explanation that fits the already built cognitive structures, developed through experience. It's unreasonable to simply jump into new paradigms every time we encounter a new fact. It takes time to prove that it doesn't fit, and then we start looking for new explanations.
However, the test simply required a possible solution. There are plenty of solutions, and it's absurd to think they have the simplest one. The simplicity of the rule is subjective, in that it is evaluated differently by different people. The famous 'as simple as possible, but no simpler' is relevant here. As long as we were not told to look for the simplest solution, ALL solutions are equally probable. That being the case, I started with the first solution that popped into my mind. I stuck with it because of my psychological state. Some searched for other solutions.
I don't think that getting a YES was the main driving force. Of course it feels good to get a yes; this is fundamental in human relations. But it's not the whole story. People do not disbelieve global warming because they want to get a YES. The reason is much deeper. Just as many people go along with the wave of climate change because it's fashionable, it makes them feel good, accepted, part of the mainstream. Being a climate change denier is being a dissident these days (not my flavor of dissidence), and being a dissident is not for everyone. And perhaps dissidents picking their fight have complicated reasons for doing so.
Having just re-read HPMOR a few weeks ago, I could answer it right away, but there was actually a difference from HPMOR's version, which required three positive increasing numbers, while this allowed negatives.
This looked like a puzzle I had seen before, so I assumed this was the case (testing a few sequences just to verify) and turned out right.
I guess the conclusion is that if a problem looks suspiciously like one you've encountered before, there's a good chance that they are the same or similar. The world is self-organizing, not completely random where you must obsessively second-guess your accumulated wisdom.
There's a puzzle with a doorman, there's a few distracting clues where the answer is actually very simple.
It doesn't involve confirmation bias, though, and it took some time to figure it out.
I consider it a very similar test.
So one could actually construct such a test without the confirmation bias part, and then look at how long it takes for people to realize the simple model.
I guessed correctly with only 2 nos. Since there is no penalty for guessing incorrectly here, I felt safe enough with my theory. I might have checked for more nos, if I had to announce my theory publicly (Twitter, comment, etc). However, I also knew about Confirmation Bias beforehand.
You can enter all kinds of crazy random sequences which only have The Rule in common and get a yes, which seemed to be enough assurance. If you're trying to get it to say "no" but failing, is that still confirmation bias? Doesn't sound like it.
I also guessed correctly with only 2 nos -- my real confirmation came from the fact that "-1, 0, 300000000" (or some number of 0s) was correct, meaning it couldn't be any really meaningful sequence.
>In order to prove their point effectively without falling in the same trap they are pointing, they should conduct the same experiment with a random example each time, not one especially created to mislead the experimentee.
This is the point though - we all have preexisting beliefs, and we often don't critically question those beliefs. The [x,2x,4x] pattern injected into our minds by nytimes is playing the part of the preexisting belief. The fact that most people go forward with their theory without properly testing it is the confirmation bias. Totally randomizing the numbers wouldn't be able to make this point because three random numbers usually won't have a clear looking pattern (and hence wouldn't give us an obvious preexisting belief).
> In such problems, it provides more information to test your hypothesis with data matching your hypothesis than with data not matching it.
I'd argue this is often not the case. For example, in the article's Iraq example, looking for evidence that there were no WMDs would have cost very little compared to even a short war. Instead, Bush's administration searched for positive evidence anywhere they could, even from questionable sources.
The purpose is not to see if people use randomized inputs, or even if they're capable of analyzing data. It is to see if people are willing to challenge their own notions. The "misleading" sample is then just right.
Even in real life, no discrete categorization hypothesis should be accepted if you've only seen data matching one of the categories.
Seems busted now - clicking the "I think I know" button does nothing. I thought the answer was:
Let the first number be x. If x is 0, then the second number is 1. Otherwise, the second number is two times the absolute value of x. The third number is 2 times the value of the second number.
So, am I supposed to feel bad if I assumed it was some tricky, hard to figure out function? It just reminded me of those questions on the ACT or whatever and I froze up and got frustrated.
I'm a data scientist, and it relieved me no end that I got this one right: http://i.imgur.com/V5oJ4i4.png I would have had second thoughts about my career choice if I got this wrong :)
The correct approach for any data modeling problem is to think in terms of entropy: each subsequent test should reduce your uncertainty, until you reach diminishing returns.
The sequence is not monotonically increasing. It's strictly increasing. If you test [1, 1, 2] or [1, 1, 1] or [1, 2, 2], you'll get "No" answers even though those sequences are monotonically increasing.
You think the definition of "monotonic function" is more relevant to the meaning of "monotonically increasing sequence" than the definition of "monotone increasing" is?
[1,1,2], [1,1,1], and [1,2,2] are not monotone increasing. They're also not monotonic functions.
A sequence f(n) is monotonic increasing if f(n+1) ≥ f(n) for all n ∈ N.
The sequence is *strictly* monotonic increasing if we have > in the definition instead of ≥.
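A minimal sketch of the two definitions (the function names are mine):

```python
def nondecreasing(seq):
    """Monotonic increasing in the weak sense: f(n+1) >= f(n) for all n."""
    return all(b >= a for a, b in zip(seq, seq[1:]))

def strictly_increasing(seq):
    """Strictly monotonic increasing: f(n+1) > f(n) for all n."""
    return all(b > a for a, b in zip(seq, seq[1:]))

# the sequences from the discussion above: weakly but not strictly increasing
for seq in ([1, 1, 2], [1, 1, 1], [1, 2, 2]):
    assert nondecreasing(seq) and not strictly_increasing(seq)
```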
You can draw that contrast (increasing vs strictly increasing), but what I was taught was to contrast increasing functions/sequences with nondecreasing functions/sequences.
A book or paper will make it clear what they mean by "increasing" by using the definition. There, it doesn't matter at all -- they could just as easily coin new words, since they immediately give the full definition. But the people hanging around this thread, telling people who are using a very common definition of "monotonically increasing" that (in paraphrase) "I hate to be pedantic, but you've made a mistake, in that I would have phrased that differently" have failed to contribute anything or to be pedantically correct. There's no case to be made that, if I say a "monotonically increasing sequence" must be increasing rather than nondecreasing, I've made a terminological mistake. This is a term with different definitions in different treatments.
I got so many nos. I can't believe that "Remarkably, 77 percent of people who have played this game so far have guessed the answer without first hearing a single no." That's crazy.
My explanation for the scarcity of "No"s: in these puzzles, people are used to seeing mostly sub-types of increasing sequences: exponential, linear, etc. By the time they had ruled out those sub-types and resorted to guessing "ascending", they wouldn't have encountered even a single No.
I predict that if the rule were narrower, like "exponential", many more guesses would have yielded No's.
I think the first-known matching pattern plays a huge role. The original 2,4,8 sequence, for example, locked me in immediately to doubles of the previous number (causing me to test 1,2,4 and 7,14,28 and such). Had it been a different starting sequence (like 3,9,27), I might've based my guesses differently. Same for 1,2,3.
In other words, first impressions really are important.
They don’t want to hear the answer “no.” In fact, it may not occur to them to ask a question that may yield a no.
So the author's obviously never heard of sanity checking. In fact, that's the second thing I always do: once I confirm a solution, I confirm it's not a fallacy.
Having said that, my solutions were
-10 -20 -40
-10 -8 -4
1024 1026 1030
Reminiscent of the folding-table libertarians with their questionnaire and that political Cartesian coordinate chart. "You answered that 'it's wrong to steal', [...psychobabble...], on this science graph it appears you've always secretly been a libertarian, we meet at the Cinnabon on Sundays".
My process was [1, 2, 3], and then I guessed each number is greater than the last.
It was totally a possibility that they wouldn't apply the simplest rule, but I felt it highly unlikely. This "rule" is a meme of the rationality community, especially given the example, so it seemed pretty likely that it was sequential numbers.
Veritasium, a pretty interesting YouTube channel, posted a video on this experiment a while back. I found the discussion afterwards to be more thought-provoking than this article.
The constraints allow only integers <= 9999999999999999 and >= -9999999999999999, which is interesting considering (-)10000000000000000 through (-)9223372036854775807 are also within the bounds of a 64-bit integer.
It's ironic that these facts are not mentioned considering the article is about confirmation bias.
The attached reading material is interesting, but this question is too similar to the more common kind of puzzle where you guess the next number in the sequence.
A rule like "the numbers are increasing" does not explain why 3 or 5 or 6 is missing from the sequence in that much more common version of the question.
My mathematical logic is rusty, but if I recall correctly, Gödel's incompleteness theorem basically states that it is impossible to solve this kind of question. No matter how many tests you run, there will always be an uncertainty.
An incredibly stupid example is that the rule could be "yes for strictly increasing, OR if one of the numbers is -18273192783127897981." You'll never know.
I understand this is contrived, especially when the test subject doesn't know. But if you do realize this while doing it, it makes the test a little frustrating.
EDIT: I see people are making a connection with unit testing, and the irony is poetic. This is precisely the problem Dijkstra was talking about when he said that "Testing shows the presence, not the absence of bugs."
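The contrived rule above can be sketched directly; no finite set of tests that never happens to include the magic number can tell the two rules apart:

```python
MAGIC = -18273192783127897981  # the hidden exception from the example above

def plain_rule(a, b, c):
    """Strictly increasing."""
    return a < b < c

def contrived_rule(a, b, c):
    """Strictly increasing, OR containing the one hidden magic number."""
    return (a < b < c) or MAGIC in (a, b, c)

# the two rules agree everywhere except on triples containing MAGIC
assert plain_rule(1, 2, 3) == contrived_rule(1, 2, 3)
assert contrived_rule(MAGIC, 5, 1) and not plain_rule(MAGIC, 5, 1)
```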
Thanks for the link! That was an interesting read, but I don't think I understand the premise of the argument.
From the article:
> ... It is perfectly consistent with your previous use of 'plus' that you actually meant it to mean the 'quus' function, ...
That may be true, but only if you assume I'm not referring to the plus derived from the axioms of Principia Mathematica. I am. Is it then the question that when I refer to the Principia Mathematica I'm actually referring to the Principia Quus? If I then describe axiom 1, can you not know my words are referring to axiom quus?
It seems the only power of this assertion is that language provides no absolute common ground.
This makes my head hurt. Surely Wittgenstein had permission to use Occam's razor? I mean, if it's possible that you actually meant quus_5 then it's equally possible that you meant quus_3 or quus_114, but since you didn't mention that, I will just assume you meant whichever requires the least amount of parameter-guessing from my side. After all, you have written it down for a human receiver, not for an alien from the Quus_8th dimension.
I think the core idea is much simpler than that. It's the notion that falsifiability is the most powerful tool in our arsenal when we try to conceive of theories that explain a certain state of affairs.
The idea you're getting at is that no number of confirming observations can verify a universal generalization -- the reason why we continue to call generally accepted "truths" theories. But we can increase our certainty to the greatest degree possible by trying to test our hypothesis to the greatest degree we can.
Remotely related: I've been interested for a while in how the same initial terms of a sequence could possibly be generated by multiple rules.
For example, you might have
2,3...
And the rest of the sequence might look like either
2,3,4,5,6...
or
2,3,5,8,13...
or
2,3,5,7,11...
or even
2,3,5,10,20...
Clearly, on some level those sequences are all much less complicated than one defined as "The first term is 2, the second term is 3, the third term is 919243, the fourth term is -1234..."
It's unclear to me how one might rank them in complexity, though. The maximum amount of memory necessary to get an arbitrary nth term? The number of operations necessary to get to the next term?
Another interesting question to me: if there is an ordering of ways to generate a sequence of numbers, given the first couple terms of a sequence of numbers, what are the simplest N ways to generate the full sequence?
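The four continuations above can be written as generators; a quick sketch, where each name is my own guess at the intended pattern (arithmetic, Fibonacci-style sums, primes, and "sum the first two, then keep doubling"):

```python
def arithmetic(n):
    """2, 3, 4, 5, 6, ... (add 1 each step)"""
    return [2 + i for i in range(n)]

def fib_like(n):
    """2, 3, 5, 8, 13, ... (each term is the sum of the previous two)"""
    seq = [2, 3]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]

def primes(n):
    """2, 3, 5, 7, 11, ... (the prime numbers)"""
    seq, k = [], 2
    while len(seq) < n:
        if all(k % p for p in seq):  # no smaller prime divides k
            seq.append(k)
        k += 1
    return seq

def sum_then_double(n):
    """2, 3, 5, 10, 20, ... (third term is 2+3, then double each step)"""
    seq = [2, 3, 5]
    while len(seq) < n:
        seq.append(seq[-1] * 2)
    return seq[:n]

# all four agree on the prefix (2, 3) but diverge afterwards
```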
Fun fact: Mathematica has a function, FindSequenceFunction, which does exactly what you describe. In my experience, however, it generally requires at least 4 terms.
I'm not totally sure how the math behind it works (maybe it's similar to Eureqa?) but the results speak for themselves and are rather incredible.
For example, if I run FindSequenceFunction on this input:
{0, 1, 3, 8, 19, 43, 94, 201, 423, 880}
Which is the number of 0,1 sequences of length n that contain two adjacent 1s
Which, astonishingly, is correct for all the values I've tried. So apparently Mathematica understands more about this sequence than I do, and I know its definition.
Which as far as I can tell, is a closed-form solution (!) to the integral. A solution it worked out to an integral it has never seen, but only the first 10 elements in the sequence.
So it's safe to say Mathematica knows a lot more about math than I do.
> Which is the number of 0,1 sequences of length n that contain two adjacent 1s
It sounds a lot simpler than that. There are n-1 places to put the adjacent 1s. For each of those, there are 2^(n-2) ways to complete the sequence with 0s and 1s. So the answer should be (n-1)*2^(n-2).
Edit: it isn't quite that simple, I'm counting sequences with multiple pairs of 1s multiple times.
By the way, Mathematica's formula looks a lot like the closed form for the Fibonacci sequence.
Let a_n be the number of sequences of 0s and 1s of length n, that do not contain a pair of 1s, and end in 0. Let b_n be similar, except that the sequences end in 1. These satisfy the recurrences a_{n+1} = a_n + b_n, and b_{n+1} = a_n. It follows that a_n satisfies the Fibonacci recurrence a_{n+1} = a_n + a_{n-1}. Starting from a_1 = 1, and a_2 = 2, we have a_n = F_{n+1}.
The total number of sequences of length n is 2^n, so the number we want is 2^n - a_n - b_n = 2^n - F_{n+1} - F_n = 2^n - F_{n+2}.
And this time it works :-). There might be a point about confirmation bias here, in that the trick is to count sequences that _don't_ contain a pair of 1s.
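The derivation above can be checked numerically; a sketch (assuming the convention F_1 = F_2 = 1), comparing the closed form 2^n - F_{n+2} against brute-force counting:

```python
from itertools import product

def brute(n):
    """Count 0/1 sequences of length n that contain two adjacent 1s."""
    return sum("11" in "".join(bits) for bits in product("01", repeat=n))

def fib(n):
    """F_n with F_1 = F_2 = 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

def closed_form(n):
    """2^n - F_{n+2}, the count derived above."""
    return 2 ** n - fib(n + 2)

print([closed_form(n) for n in range(1, 11)])
# [0, 1, 3, 8, 19, 43, 94, 201, 423, 880]  -- the ten terms quoted earlier
```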
One drawback of Mathematica is that it has a very poor idea of the formulae that human readers will regard as simple.
Choose a programming language. Choose a sequence prefix (in your example: 2, 3). Then consider all the programs that accept n as input and output a sequence of n numbers, such that the first numbers are always 2, 3. Now take the shortest of those programs. The sequence it produces is the "simplest".
If this sounds tedious to code, you could easily outsource via Odesk or something.
I think you might enjoy the book Fluid Concepts and Creative Analogies, by Douglas Hofstadter [1]. The first chapter is on (what Hofstadter argues is) the fundamental nature of recognizing patterns in number sequences.
What you want to forbid is not so much mentioning specific numbers; rather, you want to allow only rules that have certain symmetries. E.g. you can require translation invariance:
rule(x, y, z) = rule(x+offset, y+offset, z+offset)
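That property can be spot-checked numerically; a sketch (the rule names are mine, and the sampling check is a heuristic, not a proof):

```python
import random

def is_increasing(x, y, z):
    return x < y < z

def magic_number_rule(x, y, z):
    # a hypothetical rule that mentions one specific number
    return x < y < z or 42 in (x, y, z)

def looks_translation_invariant(rule, trials=1000):
    """Spot-check rule(x, y, z) == rule(x+o, y+o, z+o) on random inputs."""
    rng = random.Random(0)
    for _ in range(trials):
        x, y, z, o = (rng.randint(-100, 100) for _ in range(4))
        if rule(x, y, z) != rule(x + o, y + o, z + o):
            return False
    return True

# is_increasing passes the check; magic_number_rule is not translation
# invariant: shifting (42, 1, 0) by +1 flips its answer.
```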
Tests like this are frustrating even if you don't recognize the logical fallacy! Just because it's always possible you're wrong. You're either right because you're lucky, or wrong because you messed up!
This does not hold - the test uses JavaScript's numbers, which are finite. So it's theoretically possible to test every combination of inputs and convincingly answer what a possible rule is.
There is no penalty for being told "no". The same applies to "I think I know the answer": there is no penalty for being told "no" there either, so once you have a reasonable guess, why not check it? I disagree with their analysis.
There is an implied penalty because it says about the answer box "Make sure you’re right; you won’t get a second chance" in contrast to "You can test as many sequences as you want" about the sequences.
This is absolutely correct. Perhaps Hacker News culture is such that they realize such a warning can be overcome through a refresh, or possibly an incognito browser window if a cookie is preventing a second guess. Still though, within the rules of the game, there is no drawback of testing negative sequences, but there's a clearly defined drawback of entering an incorrect answer.
I don't seem to be able to submit my answer or follow the link to see the answer for "just tell me the answer". Is this supposed to be a trick, or is my browser just not capable of properly following those links? Using a Chromebook.
I entered A, B, C. The form didn't let me submit the sequence, so they are doing some verification that the entries are numbers. Rational numbers work, though: 1.1, 1.2, 1.3.
The question is somewhat ambiguous. It would be more interesting to see what happens if people really understood the question and had a real motivation to get it correct.
This doesn't seem to work right for negative numbers. The article says the rule is "each number must be larger than the one before it", but if you try -2 -4 -6 it says that pattern doesn't match the rule.
Maybe I'm just being pedantic here, but last I checked -6 was larger than -4.
It works for floating point numbers -- 0.01 0.02 0.04 for example. So it's a geometric series that has to start with a positive number and doubles. (The submit button and the show the answer button were broken for me.)
Putting your engineer hiring notices under the hood is becoming a common "Easter-egg" practice. It's also a form of targeted recruitment advertising - the only people who see it are your target audience.
"We’ve chosen a rule that some sequences of three numbers obey — and some do not. Your job is to guess what the rule is."
My mistake was to assume that choosing the first number uniquely defines the next ones in the sequence. Since, you know, like all the sequence puzzles I've seen before worked like that, and I didn't read it rigorously enough. Oh, by the way, the doubling thing is wrong if you use negative numbers (wrong as in it gives false positives, instead of just false negatives). But the problem definition doesn't even tell what set of numbers we're operating on.
Finding the rule the sequences obey is impossible since it could be that all cases follow a simple rule except for one triplet which you're unlikely to find. It's trivially easy to fool the user into finding a wrong rule.
Same here - I was also guilty of feeling a bit clever and avoiding the obvious "oh, the numbers double every time" answer. So, when I noticed that the spelling of each number was one letter longer (4 = four letters, 8 = five letters, 11 = six), I got smug and didn't bother testing further.
That was a hypothesis I considered, as was x,2x,4x. Invalidated both of those, and ultimately drew the intended conclusion (which is not quite "correct", given javascript numeric precision issues).
Most people will make assumptions about what's required based on previous experience. And hardly anyone will have previous experience where a question formatted in this way isn't asking you to find a series rule.
That's not quite the same thing as confirmation bias. With a bias you're just as likely to discount significant evidence as you are to mismodel the problem space.
It is true that, in general, this quiz is impossible to guess, for the reasons you explain.
But in this case -- which is the main point of the article -- it was actually a trivial rule. No tricks, no special cases. The real purpose wasn't for you to find the actual rule, but to learn about your biases. The only trick here lies in the human mind, and its tendency to validate patterns (and claim an early victory) instead of trying to refute them.
Yeah I thought it would be something like "Numbers must strictly increase unless the first number is 15326". Then the point of the article would be that some government rules are not well defined.
Have you considered that you missed the point entirely?
It's hard to see how you consider this an insult and not an interesting "live" experimental result.
"Test your problem solving skills" is merely the experimental set-up and the purpose isn't to actually measure problem solving skill, but rather to demonstrate confirmation bias on a population.
Your crocodile tears and perceived insult add nothing to the conversation.
The official answer to this puzzle makes a huge assumption: that there is one correct answer. There is not one correct answer. (x, 2x, 4x) gives you a "yes" every time, therefore it is a correct answer, at least as automatically checkable, and there's an infinite number of such tuples. To find a "no" you're reduced to random guessing. That's not a puzzle, that's crap. The confirmation bias material might be true, but the puzzle does not illustrate it.
It's amusing that you went to a website on confirmation bias, did the puzzle incorrectly, presumably read the material on confirmation bias, but still suffer from the effects of confirmation bias.
Amusingly human, if I may add. Reading "Thinking, Fast and Slow" by Daniel Kahneman, one key idea I took away was that even knowing about biases, you are still very, very likely to suffer from them. Disheartening results were gathered from studies on well-trained psychologists and people prepared for the experiment, to no avail. Can't remember the details right now, but just read the book; it's awesome. Another good one is "Influence" by Cialdini, which gives tips on trying to avoid those biases that, after reading Kahneman, I no longer think are very useful.
The puzzle is not "find a rule that matches everything you tested"; it's "figure out what rule they are using". It may be true that there are multiple ways to state the rule, such as "x_2 >= x_1 + 1 and x_3 >= x_2 + 1" for integers, but they are equivalent.
Try the sequence: 0,0,0. It would give "yes" for (x,2x,4x), but the actual rule gives it "no".
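To make the distinction concrete, a quick sketch (function names are mine; the widget's implementation isn't public, but the article states the rule is "each number must be larger than the one before it"):

```python
def increasing_rule(x, y, z):
    # the rule the article says the widget uses
    return x < y < z

def doubling_guess(x, y, z):
    # the (x, 2x, 4x) hypothesis
    return y == 2 * x and z == 4 * x

# (0, 0, 0): the doubling guess accepts it, the real rule rejects it.
# (1, 2, 3): the real rule accepts it, the doubling guess rejects it.
```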
Your task is to learn a class of objects by example. Without seeing any negative examples, you're unlikely to learn the right class (which is unique, it is precisely the class of increasing sequences of length 3). (x, 2x, 4x) is not "a correct answer", as it does not describe the class you were supposed to learn to distinguish.
It's not an assumption; the article tells you up front that there is one correct answer.
It is an analog for how science works. When it comes to a natural phenomenon, humans can come up with multiple explanations that fit a given set of observations, but presumably (I mean, this is a basic tenet of science) nature only works in one consistent way.
Thus, the importance of a falsifying test. You form a hypothesis based on the initial observations (in this case, the number sequence 2, 4, 8), and then you propose a test that could falsify your hypothesis.
The trick is that a hypothesis can fail in several ways. It can be outright wrong, like saying "the rule is that the numbers decrease from left to right." That's obviously just wrong.
But it can also be too specific, like saying "the rule is that the exponent increments by one with each step to the right." That matches the given evidence, and tests with other base numbers will succeed too. But it's over-fitting.
Here's a concrete example: a man wearing a red shirt drops a weight and measures gravitational acceleration as 9.8 m/s^2. So he formulates a hypothesis that gravity always produces an acceleration of 9.8 m/s^2 in the presence of a red shirt.
And if he always wears a red shirt, and always tests gravity on the surface of the Earth, he'll always find supporting evidence for that hypothesis.
But of course we know that gravitational acceleration varies depending on mass and distance, and that it's the same no matter what color your shirt is. But he would only find that out if he varied his experiment beyond what his hypothesis predicts.
Well, yes, in this situation random guessing would be very likely to provide an immediate check on the solution.
(1x, 2x, 4x), as indicated in the video below, is not sufficient. It represents a subset of the values that are valid.
Think of it this way. When asked to write a unit test, do you only test the positive outcomes? No, you test to make sure the failures are as you expect as well. Otherwise, you are likely to have what you think is a failure end up as a success.
The idea isn't to come up with tuples that satisfy the predicate. The idea is to figure out what the predicate is in the first place.
Also, you're not in any way, shape, or form reduced to random guessing. If you have an idea of what the rule might be, you build a counterexample. There's a ton of value in _trying_ to get a no but getting a yes instead.
From a Computational Learning Theory perspective, we are faced with an infinite hypothesis space, with an infinite VC dimension. So yes, there's not much strategy that can be employed here.
But, the fact that there is one correct answer is not really an assumption that the puzzle makes, it is information we have been given:
> We've chosen a rule that some sequences of three numbers obey -- and some do not.
This simply means that the solution is realisable. No matter how many ways (in English) we have to describe that solution it is still the same solution.
> (x, 2x, 4x) gives you a "yes" every time, therefore it is a correct answer
I think this logic is a bit wonky - if there are sequences that get a "yes", but don't match (x, 2x, 4x) then the correct rule cannot be (x, 2x, 4x), can it?
I've read this comment three times and I still can't figure out what you're trying to say. It sounds like you're arguing that all "puzzles" should exist on finite domains so that a brute force solution exists.
"(x, 2x, 4x) gives you a "yes" every time, therefore it is a correct answer, at least as automatically checkable."
Well, no: you can type 1, 2, 3 into the system and it will tell you "yes", but your rule says that it should tell you "no".
It is crucial to the definition of "correct answer" here that your rule should not just say "yes" only for tuples which the widget also says yes to, but also your rule should say "no" only for tuples which the widget also says no to. That is what the puzzle means when it's asking, "can you guess the rule that we've created?"
This makes it very, very different from what I think you're thinking about, which is situations where someone tells you, "what is the next number in this sequence? 4, 7, 13, 25, ...?" where technically there are an infinite number of rules which will generate those 4 numbers first and an arbitrary number afterwards. Technically one of them is "simplest" in the sense that it can be expressed in 7 symbols, but in general it's a complicated problem and there is no best solution.
"To find a 'no' you're reduced to random guessing. That's not a puzzle, that's crap."
In many ways it still is a puzzle, but the space that it lives in is richer. If you think about typical "puzzles" they're things like: "Here's a grid with some spaces filled in with numbers. Each number is a block in a block wall. We want you to turn this into a block maze so that each 'block wall' (set of blocks connected by adjacency) contains exactly one numbered block whose number says how many total blocks are in the wall. Furthermore the path (non-block space) of the block maze should be connected and should not contain any 'rooms' -- that is, any 2x2 or larger segments of open space."
This 7x7 grid has 10 spaces which are known to be blocks and exactly 12 more blocks scattered in the remaining 39 spaces, so just by those factors alone we're searching only (39 choose 12) ~= 3.91 billion possibilities; we can also use a quick heuristic to identify 6 places which must be "space" to break apart adjacent numbered walls, removing 91% of that search space.
The puzzles, "I have a set of integers where inclusion in the set is governed by a short rule, you can ask me any integer and I will tell you whether it is in my set", by contrast, have an infinite search space. This means that any solution is going to be more interesting, as will the means for checking that solution's validity. You could require, for instance, a Haskell expression of 140 characters or fewer which turns a nonnegative Int named `n` into a Bool, to be judged as "valid" or "invalid" if it properly filters `[1..10000000] :: [Int]`. You could even give the first 100 numbers in the set, e.g.:
In this case that's pretty much enough to see the general pattern; the verification covers 10 million bits while the 140-character limit probably limits your search space to 1000 bits or so, so it's going to be hard to get an "incorrect" answer which agrees on that subspace of the whole.
The result summary is visible here: https://docs.google.com/forms/d/17e5BIL0lH8OHsGj89Zdtdl8GeCV...
The raw answers are visible here: https://docs.google.com/spreadsheets/d/1ZxR2_eOUtNLXwgKfLO1J...