I read this book called "How Big Things Get Done." I've seen my fair share of projects going haywire and I wanted to understand if we could do better.
The book identifies uniqueness bias as an important reason for why most big projects overrun. (*) In short, this bias leads planners to view their projects as unique, thereby disregarding valuable lessons from previous similar projects.
(*) The book compiles 16,000 big projects across different domains. 99.5% of those projects overrun their timeline or budget. Other reasons for slipping include optimism bias, not relying on the right anchor, and strategic misrepresentation.
This goes beyond project estimations. The Metaculus tournaments have questions like
> Will YouTube be banned in Russia before October 1?
These are the kinds of questions most people I meet would claim are impossible to forecast accurately because of their supposed uniqueness, yet the Metaculus community does it again and again.
I believe the problem is a lack of statistical literacy. Back in the 1500s shipping insurance was priced under the assumption that if you just learned enough about the voyage you could tell for certain whether it would be successful or not. It took a revolution in mindset to ignore everything that makes a shipment unique, focus instead on what a crude reference class of similar voyages has in common, and infer general success propensities from that.
Most people I meet still don't understand this, and I think it will be a few more generations until statistical literacy is as widespread as the literal one.
The binary calibration diagram is what I would focus on. Of all the times Metaculus has said there's an x % chance of something happening, it has happened nearly exactly x % of the time. That is both useful and a little remarkable!
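For anyone who hasn't seen one: the check behind such a diagram is simple to run yourself. A minimal sketch with made-up forecasts (not Metaculus data):

```python
# Toy calibration check: bucket forecasts by stated probability and compare
# the bucket label with how often the event actually happened in that bucket.
import random

random.seed(0)

# Fake (forecast probability, outcome) pairs; well calibrated by construction.
forecasts = []
for _ in range(10_000):
    p = random.random()
    forecasts.append((p, random.random() < p))

buckets = {}  # bucket lower edge -> (hits, total)
for p, happened in forecasts:
    b = int(p * 10) / 10  # 0.0, 0.1, ..., 0.9
    hits, total = buckets.get(b, (0, 0))
    buckets[b] = (hits + happened, total + 1)

for b in sorted(buckets):
    hits, total = buckets[b]
    print(f"forecast {b:.0%}-{b + 0.1:.0%}: happened {hits / total:.0%} of the time (n={total})")
```

If the observed frequencies track the bucket labels, the forecaster is calibrated.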
That doesn’t necessarily imply individual predictions are particularly good. If between two outcomes I say X wins 50% and I pick the winner randomly I’m going to be correct 50% of the time. However, if I offered people bets at 50/50 odds based on those predictions I would lose a great deal of money to people making more accurate predictions.
This is true! The Metaculus community forecast also performs very well (in terms of score) in tournaments that include other aggregation mechanisms, though.
It also seems gameable: for every big question of societal importance that people care about for its own sake, have a thousand random little questions where the outcome is dead obvious and can be predicted trivially. Do you know anything about how questions are weighted to account for this?
There's calibration, but you can also just see contests where you pit the community aggregate against individual forecasters and see who wins. The Metaculus aggregate is really dominant in this contest of predicting outcomes in 2023, for example. See this: https://www.astralcodexten.com/p/who-predicted-2023
Trivial questions wouldn't result in a good histogram where a probability of 30% actually results in something happening roughly 1 in 3 times. Trivial would mean questions where the community forecast is 1% or 99%.
Those are not the vast majority of questions on the site. It would be very boring if the site was 70% questions where the answer is obviously yes or obviously no.
Additionally, many questions require that you give a distributional forecast, in effect asking for 25th/50th/75th-percentile outcomes for questions such as "how much will Bitcoin be worth at the end of 2024?"
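(Aside: those percentile forecasts can be scored too. The snippet below uses the standard pinball loss purely as an illustration; it is not Metaculus's actual scoring rule, and the numbers are made up.)

```python
def pinball_loss(quantile: float, predicted: float, actual: float) -> float:
    """Standard quantile (pinball) loss: lower is better."""
    if actual >= predicted:
        return quantile * (actual - predicted)
    return (1 - quantile) * (predicted - actual)

# Hypothetical 25/50/75th-percentile forecast for a year-end Bitcoin price,
# scored against a made-up outcome of 42,000.
forecast = {0.25: 30_000, 0.50: 38_000, 0.75: 50_000}
print(sum(pinball_loss(q, p, 42_000) for q, p in forecast.items()))
```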
Who would be gaming the system here anyway, the site? Individual users?
I'd also include under 'trivial' things like "Will this 6-sided die roll a 1?", or really any other well-understood i.i.d. process whose distribution of outcomes never changes under reasonable circumstances. Not just things that are 0.1% or 99.9%.
> Who would be gaming the system here anyway, the site? Individual users?
Cynically speaking, users would be incentivized to ask and answer more trivial questions to pad out their predictive accuracy, so they can advertise that accuracy elsewhere for social clout.
> Back in the 1500s shipping insurance was priced under the assumption that if you just learned enough about the voyage you could tell for certain whether it would be successful or not.
Do you have a source on that? That sounds very unlikely to me. People just have to look at a single storm to see that it destroys some ships and not others. It very clearly has a luck component. How could anyone (at any time, really) have believed that "if you just learned enough about the voyage you could tell for certain whether it would be successful or not"?
Okay, that was an oversimplification. I have long wanted to write more on this but it never comes out right. What they did was determine the outcome with certainty barring the will of God or similar weasels.
I.e. they did thorough investigations and then determined whether the shipment ought to be successful or not, and then the rest laid in the hands of deities or nature, in ways they assumed were incalculable.
This normative perspective of what the future ought to be like is typical of statistical illiteracy.
I think I got most of this from Willful Ignorance (Weisberg, 2014), but it has been a while since I read it (and this was before I made more detailed notes) so I might be misremembering.
There are what look like fantastic books on the early days of probability and statistics aside from Weisberg (Ian Hacking is an author that comes to mind, and maybe Stephen Stigler?) but I have not yet taken the time to read them – I'll read more and write something up on this some day.
But statistical understanding is such a recent phenomenon that we have been able to document its rise fairly well[1], which is fascinating in and of itself.
[1]: And yet we don't know basic stuff like when the arithmetic mean became an acceptable way to aggregate data. Situations that (to our eyes) obviously call for an arithmetic mean were, in the 1630s, resolved some other way (often the midrange, the mode, or just a hand-picked observation that seemed somehow representative). Then suddenly in the 1680s we have a letter to a newspaper that uses the arithmetic mean as if it were the obvious thing to do, and its usage is relatively well documented after that. From what I understand, we don't know what happened between 1630 and 1680 to cause this change in acceptance!
Somewhat related, but very detailed, is the book Against the Gods[0] by Peter Bernstein that documents the historical development of understanding probability and risk. It also discusses what people believed before these concepts were understood.
I know you're coming from a good place, but the 'most people I meet don't understand this' line about statistics is quite arrogant. Most people you meet are fully capable of understanding statistics; you should do a better job of explaining it when it comes up, or maybe you are the one who misunderstands. After all, most statisticians thought Marilyn vos Savant was wrong about the goats too...
You don't have to be on the internet for long to see:
- "Polls are useless, they only sampled a few thousand people"
- "Why do we need the crime figures adjusted for the age/income/etc groups? Just gimme the raw truth!"
Have to say, I think stats are the least well taught area in the math curriculum. Most people by far have no clue what Simpson's or Berkson's paradoxes are. Most people do not have the critical sense when presented with stats to ask questions like "how was the data collected" or "does it show what we think it shows".
I just don't see it, though ironically I don't have stats to back it up.
You don't have to be on the Internet long to see flat-earthers or any number of asinine ways of thinking. You can't stretch discrete observations from a supremely massive sample size into "most people".
Gee, if only there was some kind of rigorous and well understood process for determining how to transform discrete observations according to how representative they are, such that we could build a model for a larger population.
Something like that would be very useful for political decisions, so perhaps we could name it after the Latin word for "of the state"…
Also, etymonline makes a pretty convincing case that statisticum refers to the behavior of administrators, not to the concept of the administration, with the -ist- specifically indicating a person.
It's definitely arrogant. But after much experience trying to explain these things to people, I'm more and more convinced it's not just "if you just put it the right way they will understand". Sure, they will nod politely and pretend to understand, and may even do a passable job of reciting it, but once it's cast in a slightly different light they are just as confused.
Much as with reading and writing, I think it takes an active imagination and a long slog of unlearning the habit of trusting "ought to" logic (the kind of thinking that shields one from reality), coming to terms with the race not being to the swift, etc., and realising that these effects can be quantified.
It's not that some people are incapable of it. Much like literal literacy has reached rates of 99.9 % in parts of the world, I'm convinced statistical literacy can too. But when your teacher is not statistically literate (which I hypothesise they are not, generally speaking), they will not pass that on to you. They will not give you examples where the race is not to the swift. They will not point out when things seem to happen within regular variation and when they seem to be due to assignable causes. They will not observe the battle against entropy in seating choices in the classroom. They will not point out potential confounders in propensity associations. They will not treat student performance as a sample from a hypothetical population. They will not grade multiple-choice questions on KL divergence, although that would be far more useful. I could go on but I think you get the point.
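(To make the KL-divergence point concrete: what I have in mind is letting students answer with probabilities over the options and scoring them on how far that distribution is from the truth. With a one-hot truth, the KL divergence reduces to the negative log of the probability placed on the correct option. A toy sketch, with made-up answers:)

```python
import math

def kl_grade(student_probs: dict[str, float], correct: str) -> float:
    """KL divergence from the one-hot truth to the student's distribution.

    With a one-hot truth this is just -log(probability assigned to the
    correct option): confident and right scores near 0, confident and
    wrong scores very badly.
    """
    return -math.log(student_probs[correct])

# A hedging student vs. an overconfident one, on a question whose answer is "B".
print(kl_grade({"A": 0.1, "B": 0.6, "C": 0.2, "D": 0.1}, "B"))     # ~0.51
print(kl_grade({"A": 0.9, "B": 0.05, "C": 0.03, "D": 0.02}, "B"))  # ~3.00
```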
Yet to be clear, I'm not talking about just applying statistical techniques and tools. I'm talking about being able to follow a basic argument that rests on things like "yes I know they are a fantastic founder but startups fail 90 % of the time and so will they" or "if the ordinary variation between bus arrivals is 5–15 minutes and we have waited 20 minutes then there is something specific that put the bus we are waiting for into a different population." These are not obvious things to many people.
This is not a personal failure – it is a lack of role models and teachers. I wouldn't have considered myself statistically literate until recently, and only thanks to accidentally reading the right books. I wouldn't even have known what I was missing were it not for that!
I suspect it will take a few generations to really get it going.
If someone would donate me large amounts of money I would love to actively research this subject, come up with reliable and valid scales to measure statistical literacy, and so on. But in the meantime I can only think in my spare time and sometimes write about it online.
What resources would you recommend for someone who wants to improve their statistical literacy? You mention reading the right books, I'd appreciate it if you could give a short list, if you have time.
I am not the person you asked, but I have been on a similar path to improve my statistical literacy. For context, I am fairly good at math generally (used to be a math major in college decades ago; didn't do particularly well though I did graduate) but always managed to get myself extremely confused when it comes to statistics.
In terms of books: there are a few good ones aimed at the general public, such as The Signal and the Noise. How to Measure Anything: Finding the Value of Intangibles in Business is a good book on applying statistical thinking in a practical setting, though it wouldn't help you wrap your brain around things like the Monty Hall problem.
The one book that really made things click for me was this:
Probability Theory: The Logic of Science by E. T. Jaynes
This book is a bit more math-heavy, but I think anyone with a working background in a science or engineering field (including software engineering) should be able to get the important fundamental idea out of the book.
You don't need to completely comprehend all the details of the math (I surely didn't); it is enough to have a high-level understanding of how the formulas are structured. But you do need enough math (for example, an intuitive understanding of logarithms) for the book to be useful.
I second both The Signal and the Noise as well as How to Measure Anything. I also mentioned upthread Willful Ignorance.
I think perhaps the best bang for your buck could be Wheeler's Understanding Variation -- but that is based mainly on vague memory and skimming the table of contents. I plan on writing a proper review of that book in the coming year to make certain it is what I remember it to be.
I think the earlier works by Taleb also touch on this (Fooled by Randomness seems to have it in the title).
But then I strongly recommend branching out to places where these fundamentals are used, to cement them:
- Anything popular by Deming (e.g. The New Economics)
- Anything less popular by Deming (e.g. Some Theory of Sampling)
- Moneyball
- Theory of Probability (de Finetti)
- Causality (Pearl)
- Applied Survival Analysis
- Analysis of Extremal Events
- Regression Modeling with Actuarial and Financial Applications
The more theoretical and applied books are less casual reads, obviously. They also happen to be the directions in which I have gone -- you may have more luck picking your own direction for where to apply and practice your intuition.
Edit: Oh, and while I don't have a specific book recommendation because I don't have a high opinion of any of the ones I read, something on combinatorics helps with getting a grasp on the general smell of entropy and simpler problems like Monty Hall.
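(And if anyone wants to convince themselves about Monty Hall in particular, a brute-force simulation is only a few lines. A toy sketch:)

```python
import random

def monty_hall(switch: bool, trials: int = 100_000) -> float:
    """Fraction of wins when always switching (or always staying)."""
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)
        pick = random.randrange(3)
        # The host opens a door that is neither the contestant's pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == car
    return wins / trials

print(monty_hall(switch=True))   # ~0.667
print(monty_hall(switch=False))  # ~0.333
```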
Not my experience at all. Just one example: try talking to physicians about false discovery rates, even those who do not profit from state-of-the-art screening methods. Incorporating Bayesian methods is even a struggle for statisticians. It is a struggle for me.
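The screening case is a good one because the arithmetic is trivial and the result still surprises people. A back-of-the-envelope sketch with made-up numbers:

```python
# Hypothetical screening test: 1% prevalence, 90% sensitivity, 9% false-positive rate.
prevalence, sensitivity, false_positive_rate = 0.01, 0.90, 0.09

true_positives = prevalence * sensitivity
false_positives = (1 - prevalence) * false_positive_rate
ppv = true_positives / (true_positives + false_positives)

print(f"P(disease | positive test) = {ppv:.0%}")  # ~9%, not 90%
```

A positive result from a 90%-sensitive test here means only about a 9% chance of disease, which is exactly the kind of answer that gets met with disbelief.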
The Monty Hall problem is a great example of something I’ve been educated into believing, rationalizing, whatever you want to call it… but I would still never claim I “understand it.” I think that’s maybe the source of disagreement here: there are many truly unintuitive outcomes of statistics that are not “understood” by most people in the most respectful sense of the word, even if we’ve been educated into knowing the formula, knowing how to come to the right answer, etc.
It’s like in chess: I know that the Sicilian is a good opening, that I’m supposed to play a6 in the Najdorf, but I absolutely do not “understand” the Najdorf, and I do think it’s fundamentally past the limit of most humans’ understanding.
This is not at all true, and I think it's an example of what statisticians have to fight against in order to explain anything. Most people have an almost religious belief that inferences drawn from statistics should be intuitive, when they are often extremely counterintuitive.
> After all, most statisticians thought Marilyn vos Savant was wrong about the goats too...
This is the opposite of the argument that you're making. Here you're saying that probability is so confusing and counterintuitive that even the experts get it wrong.
My point was that the experts were blinded by arrogance.
Even back then most (almost all?) statisticians were capable of understanding the Monty Hall problem. Yet they just assumed that a woman was wrong when she explained something that didn’t match their intuition. Instead of stopping to think, they let their arrogance take over and just assumed they were right.
My experience is that most people don't understand statistics and can be pretty easily misled. That includes myself, with my only defense being that I'm at least aware statistics don't intuitively make sense, and I either ignore arguments based on them or, if absolutely necessary, invest the extra time to properly understand.
Most people either don't realize this is necessary or don't have the background to do it even if they did, in my experience.
I'm not surprised by the statement 'most people I meet don't understand this'. More than 50 percent of the people I meet are less educated than I am. It's statistically evident.
I might be overreaching, but in fact what comes across as arrogance is just an example of statistical illiteracy.
Most (>> 50 percent of) people are very good at detecting patterns. People are very bad at averaging numbers of events, because the detected patterns stand out so much and are implicitly and unconsciously exaggerated.
An example: "in my city people drive like crazy" in fact means: this week I was on the road a lot and saw 2 out of 500 cars that did not follow the rules, and there was even one honking. It 'felt' like crazy traffic but in fact it was not.
The vast majority of people are not that great at statistics. Even something as banal as "correlation is not causation, and can often be explained by a common cause" will blow a fair few minds (e.g. telling most people that old chestnut about how the murder rate is correlated with ice cream sales).
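(The ice cream chestnut is also easy to reproduce: simulate a common cause and two variables that each depend on it but not on each other, and a strong correlation shows up anyway. A toy sketch with made-up numbers:)

```python
import random

random.seed(1)

# Temperature drives both ice cream sales and, in this toy model, murders;
# neither causes the other, yet they end up strongly correlated.
temps = [random.gauss(20, 8) for _ in range(1000)]
ice_cream = [2.0 * t + random.gauss(0, 5) for t in temps]
murders = [0.5 * t + random.gauss(0, 3) for t in temps]

def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

print(corr(ice_cream, murders))  # strongly positive despite no causal link
```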
For forecasting with no agency involved this makes sense, but when you’re executing a project things get trickier. The hard part is finding appropriate reference classes to learn from while also not overlooking unique details that can sink your specific project.
I think a lot of modern US public works fall prey to politicians who think the objective of the project is the spending of the money. That is - they push for spending on transit so they can talk about how much, in dollars, they got passed in transit funding. The actual outcomes for many of them are, at best, inconsequential.
Further cynicism could be layered in if you consider some of the blocks of donors (infrastructure contractors / RE devs / etc.) and blocks of voters (union transit workers & construction workers) who are, again, recipients of the spending but not of the outcomes.
Finally, a lot of the problems come from a long gap of not doing capital projects, and the resulting hollowing-out of state capacity, which has been outsourced. If you outsource your planning, the planners are less incentivized to re-use existing cookie-cutter plans for subway stations. If you outsource your project management, the managers are less incentivized to keep costs down. Etc.
Personal suspicion is that real leadership is dull and thankless (like so many things in life).
Announcing a big new transit project is exciting. Actually running the program well requires a lot of boring study, meetings, and management of details. Why bother, if the voters don't punish them for not doing it?
You can find endless internet posts by people complaining their manager doesn't want to do the scheduling of employees, which is the most basic part of their job. It's too tedious, so they try to avoid it.
> Further cynicism could be layered in if you consider some of the blocks of donors (infrastructure contractors / RE devs / etc) and blocks of voters (union transit workers & construction workers) who are recipients again of the spending but not the outcomes.
It's probably best to think of an initial budget as a foothold, and that its ideal amount is low enough to be approved, but high enough to prevent the organization from changing direction after realizing that it's not going to be nearly enough, i.e. to think in terms of "pot-commitment."
This book is by Bent Flyvbjerg, the lead author of the paper being discussed here. He’s done a lot of great scholarship on how megaprojects go haywire, and uniqueness bias is definitely a big piece of it.
You especially see this bias with North American public transit. Most of our transit is greatly deficient compared to much of the rest of the world, and vastly more expensive to build (even controlling for wages). But most NA transit leadership is almost aggressively incurious about learning from other countries, because we are very unique and special and exceptional, so their far better engineering and project management solutions just wouldn’t work here and aren’t even worth considering.
> 99.5% of those projects overrun their timeline or budget.
This does not shock me. In the corporate world, plans are not there to be adhered to, but only to give upper management a feeling of having tightened the rope for those pesky engineers who wanted to work at a lazy pace. Such feeling usually vanishes as soon as reality kicks in.
I dunno, upper management normally move on to greener pastures long before reality comes crashing down. That does not happen until two managers later. But that's okay, the new plan will fix everything.
If a project is projected to be finished in 6 months, the current manager will still be there, and the success or failure will reflect on their record. It can only go wrong and reflect badly on them.
If a project will take 3 years, the manager can already collect their points for initiating a project with an incredible business case and innovative approach, leave after 18 months, and after a further six months the new manager can say 'wow, my predecessor left a big mess, I'll clean it up/kill it'.
> Others reasons for slipping include optimism bias, not relying on the right anchor, and strategic misrepresentation.
Optimism and misrepresentation are WAY more important.
Most engineers I know are a bit optimistic even when estimating for other engineers. This leads to slippage because subtasks have only a finite amount of time by which they can come in early, but an almost infinite amount of time by which they can come in late.
In addition, most engineers are acutely aware of what they think the project would take vs. what number management was willing to hear to launch the project.
Combine both of these and your project will never come in even remotely close to the estimates.
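You can see the mechanics of that asymmetry with a toy Monte Carlo (the lognormal durations below are purely an assumption for illustration): estimate every subtask at its median, and even though each individual estimate is "fair", the whole project blows through the summed estimates in the large majority of runs.

```python
import math
import random

random.seed(42)

TASKS = 20
MEDIAN_DAYS = 10   # each subtask is "estimated" at its median duration
SIGMA = 0.6        # lognormal spread: limited upside, long late tail
estimate = TASKS * MEDIAN_DAYS

overruns = 0
for _ in range(10_000):
    actual = sum(random.lognormvariate(math.log(MEDIAN_DAYS), SIGMA)
                 for _ in range(TASKS))
    overruns += actual > estimate

print(f"overran the summed estimates in {overruns / 10_000:.0%} of runs")
```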
And sometimes the reality is that the realistic answer to a time estimate is "If we are lucky it takes me 30 minutes, if we are unlucky 30 days".
E.g. when it turns out, to your surprise, that a part you had in your hardware design was replaced with a part that was 5 cents cheaper but uses an undocumented protocol that someone has to re-implement, so a goal that was trivial in theory suddenly involves hardcore reverse engineering in a high-pressure environment. A thing all engineers love.
The only time someone can give you reasonably accurate estimates is when they are doing something that, down to the tiniest detail, they have done before. The problem with that is that in software things change constantly. A thing that was trivial to do with library X and component Y at version 0.9 might be a total pain in the rear with library Z and component Y at version 1.0.
But yeah, inexperienced engineers are going to be optimistic that it is possible, because in theory it should be trivial.
> And sometimes the reality is that the realistic answer to a time estimate is "If we are lucky it takes me 30 minutes, if we are unlucky 30 days".
“I'll put that in Project as 60 minutes, then if you are lucky you've got double the time for contingency.” -- the external consultant acting as project manager.
Been there before…
Never give a best case estimate, or anything close to, even when quoting a range. Some will judge you, or worse make plans around you, based on that and little else, and it *ahem* isn't their fault if things overrun.
>In short, this bias leads planners to view their projects as unique, thereby disregarding valuable lessons from previous similar projects.
I once had a discussion. We had some managers that would systematically over or underestimate their projects, mostly underestimate. I suggested that we take into account the estimation accuracy of previous projects for that manager and adjust their estimate.
They said that each project is too unique to do this. But I saw the same optimism or pessimism playing out repeatedly for the same managers when looking at the numbers.
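(The adjustment I had in mind was nothing fancy; roughly this, with made-up numbers:)

```python
# Scale a manager's new estimate by the ratio of actual to estimated
# duration on their past projects (hypothetical history).
past_projects = [  # (estimated months, actual months)
    (6, 9),
    (4, 7),
    (12, 15),
]

ratio = (sum(actual for _, actual in past_projects)
         / sum(estimated for estimated, _ in past_projects))

new_estimate = 8
print(f"calibration ratio: {ratio:.2f}")                        # ~1.41
print(f"adjusted estimate: {new_estimate * ratio:.1f} months")  # ~11.3
```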
Although tbf I think accurate estimation can be bad if the person managing the project knows about the estimation. Since if they have more time, there's less pressure, and they'll have overruns again, making the estimation inaccurate again.
Hofstadter's law, to me, is less about accurate estimation and more about human psychology. If you know you have more time, you waste more time.
This is also a failure-mode in agile project management. If you don't have strict deadlines, it's easy to fall into infinite iteration, because there's always something that could be done better.
Uniqueness bias is definitely a problem (see also: This time is different!) and this paper's conclusions make a lot of intuitive sense. Less competent planners are more likely to see their situation as unique.
However, the measures to establish that perceived uniqueness != uniqueness seem a little bit limited to me. I've seen projects go belly-up because senior management refused to accept that the situation on the ground at one location was different to the usual situation in most other locations, leading to cost blow outs and project failures. I don't think the treatment in this article would pick that dynamic up.
So while it is true that exceptional engineering projects (paper's example - the Apollo project) are much more mundane than they might seem, it is also true that seemingly mundane projects can be unique and therefore it is difficult to compare with other projects of a similar nature.
>>management refused to accept ... that the situation was different
Yes! Seems that the complement of Uniqueness Bias is Normalcy Bias. Two cliffs to fall off of, on opposite sides of the path.
Makes me wonder how a similar study on projects failing due to Normalcy Bias (failing to recognize the uniqueness of a situation) would turn out, and in particular, which bias produces more and bigger cost and/or time overruns. I.e., which is the least-bad side on which to err?
Or that people who don't know many past scenarios cannot, without that experience, draw upon similarities or see the connections, and thus conclude their situation is unique.
I found the textbook example interesting, where the co-author with experience thought it would take 7+ years, and everyone else thought it would take about 2.
That makes a lot of sense to me, because I wrote a textbook in about 2 years. But...
There are textbooks and there are textbooks. My book covers exactly what I need for my class, has no publisher (students download it from the class website), and doesn't have a lot of the bells and whistles that a publisher would demand. I can easily see how a "real" textbook would take over 3x as long.
So part of the lesson of the authors' textbook is a standard software development lesson - there's a huge gap between a working prototype and a shipping product, and even if that prototype does everything you think the product needs to do, (a) you're wrong, and (b) you may be less than halfway to the point where you can ship.
> Sanbonmatsu et al. (1991) studied consumer choice between multiple brands and products. Experiments demonstrated that when deciding what to buy, consumers search for unique features instead of common ones among alternatives before making their choice. Bolton (2003) further found that even when decision makers are forced to consider more alternatives, including counterfactuals, decisions are still biased towards the initial scenario, which is typically based on availability bias and the inside view. Again, this supports our interpretation that uniqueness bias is likely the outcome of information processing by decision makers rather than an expression of hindsight bias.
This seems like a far-fetched conclusion to me. Neither of these two studies seem to suggest that people do not significantly overestimate uniqueness of projects in hindsight.
The 2003 study seems completely unrelated. How does bias towards the initial scenario imply a uniqueness bias? That would require uniqueness to be the initial scenario. If anything, I would claim the opposite.
The 1991 study may imply that people tend to focus on unique aspects. Claiming that this tendency in product choice carries over to project management is a stretch imo. And even if it did, that would still not be a strong argument for the conclusion.
It seems to be based on questionnaires sent out about projects, and they've found some correlation between "uniqueness of a project" (unclearly defined) and cost overruns. Because "uniqueness" is unmeasurable, they can provide various examples in prose where both arguments and counterarguments for uniqueness seem reasonable. So this isn't really science.
But the overall argument, that uniqueness bias (insofar as it exists) causes managers to be less critical, seems reasonable.
Isn't it socially healthy to have that bias though? Like working at a start up is statistically unlikely to pay off in a huge way, but ideally you want most founders to think they can do that so they'll make the attempt?