Hacker News
Review: The Book of Why (tachy.org)
148 points by pizzicato on March 7, 2021 | 37 comments



Brady Neal has a great video course on the subject with slides and readings including Pearl and others:

https://www.bradyneal.com/causal-inference-course

EDIT: I should add Brady also publishes his course textbook online, and it's less Pearl-centric than The Book of Why but still covers the complete subject and then some:

https://www.bradyneal.com/causal-inference-course#course-tex...


Awesome!!! Thanks so much for this.


Thank you for the link


> The catch is that, whether explicitly or implicitly, you must make assumptions in the first place about the directions of causality among the variables.

That right there is the headline to me.

Compare this with the blurb in the dust jacket of the book:

> "Correlation is not causation." This mantra, espoused by scientists for more than a century, led to a virtual prohibition on causal talk. Today, that taboo is dead. The causal revolution, led by AI researcher Judea Pearl and his colleagues, has cut through years of confusion about the nature of knowledge and established the study of causality at the center of scientific inquiry.

This all but claims that these new causal tools have found a solution to the problem of "correlation is not causation." But they have done no such thing: there is no new technique offered here for establishing causation in any better or easier way than the RCT of yore.

If you get rid of the warning "correlation is not causation" and focus everyone's attention on all the exciting inferences you can make when you assume causation, I'm worried that the end result is a lot of bad science.


> all the exciting inferences you can make when you assume causation, I'm worried that the end result is a lot of bad science.

But this is the definition of science: you build models on top of hypotheses that you assume, you see how existing data fits your model, and then you make falsifiable predictions based on your model.

Studying data without any (causal) model of what's happening is just collecting statistically significant trivia. It is research, for sure, but that's not enough to make it science. But hey, at least you published something.


Sure, but what does Pearl bring to the table here? The idea of making falsifiable predictions long predates the "Causal Revolution." The warning of "correlation is not causation" still seems as relevant as ever.


For one, it lets you avoid controlling for the wrong variables and thereby creating spurious correlations. In fact, this is one of the best examples of why a causal model is necessary: without one, you can easily end up with a correlation that doesn’t exist, as his book illustrates quite nicely.
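
To make the point about controlling for the wrong variable concrete, here is a minimal simulation sketch of collider bias. The variable names (talent, looks, fame) and all numbers are made up for illustration and do not come from the book: two independent traits both feed into a third variable, and conditioning on that third variable manufactures a correlation between the first two.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def collider_demo(n=20000, seed=0):
    rng = random.Random(seed)
    # Hypothetical model: talent and looks are independent by construction.
    talent = [rng.gauss(0, 1) for _ in range(n)]
    looks = [rng.gauss(0, 1) for _ in range(n)]
    # fame is a collider: both talent and looks cause it.
    fame = [t + l + rng.gauss(0, 0.5) for t, l in zip(talent, looks)]

    overall = pearson(talent, looks)

    # "Controlling" for the collider by looking only at the famous subset.
    famous = [(t, l) for t, l, f in zip(talent, looks, fame) if f > 1.0]
    conditioned = pearson([t for t, _ in famous], [l for _, l in famous])
    return overall, conditioned


if __name__ == "__main__":
    overall, conditioned = collider_demo()
    print(f"correlation in the whole population: {overall:+.3f}")
    print(f"correlation among the famous only:   {conditioned:+.3f}")
```

In the full population the correlation should hover near zero, while in the selected subset it should come out clearly negative, even though neither trait influences the other. Knowing that fame is a collider rather than a confounder is itself a causal assumption, which is the point being debated in this thread.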


Do you mean that the new techniques will (1) help you prove which variables should not be controlled for, or that they will (2) help you more clearly describe your causal assumptions, so that you can more easily recognize which variables should not be controlled for according to your assumptions?

If you mean (2), I can't really disagree: explicitly specifying your causal assumptions through a DAG seems like a clarifying step in specifying a model.

If you mean (1), then I must be missing something because I'm not seeing that this set of tools can do that.

My worry is that (2) is mistaken for (1), and that writing down a causal model is conflated with proving that it is true.


For a given causal model it is (1) in my understanding.


But "for a given causal model" precisely means "given a set of statements about what causes what." Those statements must be either proved or assumed.

If they are already proved, they don't need to be proved further per (1).

If they are not already proved, then they are just assumptions and we are talking about (2).


For more on this, you might enjoy this survey of Pearl's work, and its relevance to economics research, by Guido Imbens: https://arxiv.org/abs/1907.07271.

It is long, but thorough. The conclusion is essentially that Pearl is over-hyped, though my summary does significant violence to the nuances of his argument.


Using «the book of Why», which is a book of popular science, not an academic book, as a reference is a bit troubling though.


A significant number of Pearl's works are cited, not just that one. (See the references section.)


I find the false statements that come up in proximity to Pearl's work vaguely shocking, like the idea that nobody talked about causality until Pearl. Physicists talk about causality all the time, in relation to the flow of information. Causality has been a central question in econometrics since at least the 50s -- I'm sure other social sciences are the same.


You might enjoy my blog series on Causality where I work through Pearl's 'Causal Inference in Statistics: A Primer' using Python:

https://github.com/DataForScience/Causality

</ShamelessSelfPromotion>


I have skimmed the first 2 chapters and I still don't understand how causal inference is a useful tool or how you apply it to anything moderately complex.

Obviously being able to identify causality is useful, but I don't understand how you can apply causal inference and get a meaningful result.

To apply any of the rules regarding DAG structure you first have to have a DAG of events, which seems like it would be difficult to accurately build up.


Wow what an amazing work. Thank you!


Thank you! The only way for me to learn anything is to work through it so this time around I decided to make my work public.

I’m glad you found it useful.


I like the "Ladder of causation":

    Rung 1: Associations, observational data (seeing)
    Rung 2: Intervention (doing)
    Rung 3: Counterfactuals (imagining)

I often go in reverse order: let's figure out the cheapest clever ways to prove the ship will sink (imagining). Then, if it seems like it might float, let's build it and throw it on the pond (doing). Then, if it seems to float, let's hop on board and see what happens.
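
The gap between rung 1 and rung 2 can be sketched with a small simulation. Everything here is an assumption chosen for illustration: a binary confounder z raises both the chance of treatment x and the chance of outcome y, and the true effect of x on y is set to 0.2. The sketch compares what you get from seeing (conditioning on x in observational data), from doing (forcing x), and from adjusting for z under the assumption that z is the only confounder. Rung 3 is not covered.

```python
import random


def draw(rng, p):
    """Return 1 with probability p, else 0."""
    return 1 if rng.random() < p else 0


def sample(rng, do_x=None):
    """One unit from a hypothetical model z -> x, z -> y, x -> y."""
    z = draw(rng, 0.5)
    # Under an intervention, x is set by us and ignores z.
    x = draw(rng, 0.8 if z else 0.2) if do_x is None else do_x
    y = draw(rng, 0.1 + 0.2 * x + 0.5 * z)
    return z, x, y


def mean_y(rows, x, z=None):
    ys = [r[2] for r in rows if r[1] == x and (z is None or r[0] == z)]
    return sum(ys) / len(ys)


def seeing_vs_doing(n=100000, seed=1):
    rng = random.Random(seed)
    obs = [sample(rng) for _ in range(n)]

    # Rung 1 (seeing): difference in outcome rates between observed groups.
    seeing = mean_y(obs, 1) - mean_y(obs, 0)

    # Rung 2 (doing): difference in outcome rates when x is forced.
    y_do1 = sum(sample(rng, do_x=1)[2] for _ in range(n)) / n
    y_do0 = sum(sample(rng, do_x=0)[2] for _ in range(n)) / n
    doing = y_do1 - y_do0

    # Adjustment on observational data, valid only if the assumed graph is right.
    p_z = {z: sum(1 for r in obs if r[0] == z) / n for z in (0, 1)}
    adjusted = sum(
        p_z[z] * (mean_y(obs, 1, z) - mean_y(obs, 0, z)) for z in (0, 1)
    )
    return seeing, doing, adjusted


if __name__ == "__main__":
    seeing, doing, adjusted = seeing_vs_doing()
    print(f"seeing:   {seeing:.3f}")
    print(f"doing:    {doing:.3f}")
    print(f"adjusted: {adjusted:.3f}")
```

With these made-up parameters the observational difference works out to about 0.5, while both the interventional and the adjusted estimates should land near the true 0.2. The adjusted estimate recovers the right answer only because the code adjusts for exactly the confounder that was built into the model.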


> This book dwells on the history of statistics a lot, and statisticians, as the authors would have you believe, are zealots who have conspired to keep causal thinking out of their field right from the start. That is, until Pearl instigated the "Causal Revolution", as he dubs it, the latest and greatest gift to modern science. I have no dog in this fight, but Pearl (whom I assume is the source of most of these opinions put to paper by Mackenzie) often comes across as wildly biased and grandiose. For what it's worth, I doubt that statisticians as a whole are anywhere as malicious or ignorant as they're portrayed in this book.

This is correct AFAICT (I'm not a statistician even though I read a lot of the statistics literature). The strange thing is that I've never seen any obvious benefits to his comments of this nature. In the most generous possible reading, they are a distraction, with a less generous reading being that you can't trust his interpretation of anything.


The “benefit” is that Pearl’s causality is reinventing the wheel, and his approach is a smokescreen for this. Causal inference has been a thing in statistics since the 1930s. There is absolutely nothing Pearl has developed here that makes a practical difference when it comes to actually establishing causality over methods we’ve had for a century. So to prop up his contributions to the causal “revolution”, he attempts to paint a picture where statisticians are a hopelessly backward, regressive group, and hopes the uninitiated won’t know any better. It seems like he has largely succeeded.


Well, I read the book. I like his clear style; he tries to get across what is different about his approach. To be honest, I like probability theory, but never thought much of statistics, so I guess I can sympathise with where he is coming from.

I enjoyed this book much more than any text about statistics I ever read.


I've read this book twice. The first time, I enjoyed it, and as I read it I felt that I understood the gist, at least intuitively. Similar to what you describe, my view was always that statistics is a bunch of tricks, and probability is much deeper. After reading this book, I read another book, a technical book about probabilistic graphical models. As I read the book, I implemented most of the algorithms. I also had to read a bunch of papers from the 80s and 90s to do that. I then decided to read this book a second time, and now I really came to appreciate Pearl's points, and can see why statistics (and probability...) are insufficient, and the need for his do-calculus. I've also been reading much of his 1988 classic, though not done with it. While I'm not there yet (still implementing more papers, and not read his Causality book yet) I can see how his proposed calculus and the work of his students in that area can help do the things he describes in the last chapter. So, the book can be interesting to lay people, and it may entice them to learn more. I think this is the book's purpose, and therefore that it is a success, at least with me.


I think the problem with your comment (and this is why it's necessary to post these qualifications any time Pearl comes up) is that you've bought into Pearl's claims about statistics. Statisticians have been studying causality for a long time.

All you need to do to verify that his claims about statisticians are BS is look at the potential outcomes framework, which was first developed in 1923: https://en.wikipedia.org/wiki/Rubin_causal_model

He's well within his rights to argue that the PO framework has limitations and that his framework is superior. It's unethical for him to claim that statisticians are anti-causality or that they never studied it.
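
For readers unfamiliar with the potential outcomes framework mentioned above, here is a minimal sketch of its core idea. The model is hypothetical and the numbers are invented: every unit has two potential outcomes, y0 (untreated) and y1 (treated), only one of which is ever observed. When units select into treatment based on a hidden trait, the naive difference in means is biased; when assignment is randomized, it approximates the average treatment effect.

```python
import random


def potential_outcomes_demo(n=50000, seed=2):
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        health = rng.gauss(0, 1)        # hidden baseline trait (assumed)
        y0 = health + rng.gauss(0, 1)   # potential outcome without treatment
        y1 = y0 + 1.0                   # potential outcome with treatment
        units.append((health, y0, y1))

    # Only computable in a simulation: both potential outcomes are known.
    true_ate = sum(y1 - y0 for _, y0, y1 in units) / n

    def diff_in_means(assignment):
        # Each unit reveals y1 if treated and y0 otherwise, never both.
        treated = [y1 for (_, _, y1), t in zip(units, assignment) if t]
        control = [y0 for (_, y0, _), t in zip(units, assignment) if not t]
        return sum(treated) / len(treated) - sum(control) / len(control)

    # Healthier units opt into treatment, which confounds the comparison.
    self_selected = [health > 0 for health, _, _ in units]
    # A coin flip decides treatment, as in a randomized controlled trial.
    randomized = [rng.random() < 0.5 for _ in units]

    return true_ate, diff_in_means(self_selected), diff_in_means(randomized)


if __name__ == "__main__":
    true_ate, naive, rct = potential_outcomes_demo()
    print(f"true average treatment effect: {true_ate:.3f}")
    print(f"self-selected comparison:      {naive:.3f}")
    print(f"randomized comparison:         {rct:.3f}")
```

The true effect is 1.0 by construction. The self-selected comparison should overshoot it substantially, and the randomized comparison should sit close to it.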


You seem to read too much into my comment, and make it into something partisan. I'm not interested in that debate. My thoughts about statistics and probability were there before I read Pearl's book. Pearl mentions potential outcomes in his book.


The battle between frequentist statisticians and those advocating Bayesian approaches is quite old, going back at least to Wright. Pearl is not inventing a dichotomy but explaining why models are necessary to evaluate causality. Is this genuine progress? Absolutely! Causal Bayesian modeling is transformative.

Every experimentalist and clinician will come away with something good from The Book of Why, even if the tone occasionally rubs some the wrong way. I made this required reading in my human genetics course for grad students. Perfect level. Yes, I got some welcome pushback from bright students, but I know this book will have an indelible positive impact on their depth of thinking about data generation, model assumptions, confounders, interventions, and counterfactuals.


Another good book on the subject. Freely available.

Causal Inference: What If

Miguel A. Hernan and James M. Robins

https://www.hsph.harvard.edu/miguel-hernan/causal-inference-...

https://cdn1.sph.harvard.edu/wp-content/uploads/sites/1268/2...


> "This book dwells on the history of statistics a lot, and statisticians, as the authors would have you believe, are zealots who have conspired to keep causal thinking out of their field right from the start."

I've read the book and have no dog in the fight. IMO this is an uncharitable interpretation of Pearl's position. The authors of the book present the work of many past statisticians on both sides of the causal debate. A few influential people are indeed rendered as almost caricatures, but clearly that doesn't represent the entire field when the authors also dive deeply into the work of other statisticians who explored causality.


In multiple places Pearl castigated the entire field of statistics, in addition to outright character assassinations of long dead statisticians. I think the original author’s interpretation is perhaps too charitable.


Agree with this counterpoint. Sewall Wright, for one, and his father are given great credit. It is R. A. Fisher who comes in for well-deserved flak for his infamous obstinacy.


I feel like I’ve tried to read several writings on this topic, mostly by Pearl. I feel like I’m good during the intro and motivation, but once it gets to the meat I’m completely lost. I feel like this is an area that would provide rich value if I could ever understand it.


As asked by the author, here is an example of application in medicine: https://www.nature.com/articles/s41467-020-17419-7


I highly recommend starting with that one if you are new to causality. It is much more approachable than his other books.


I did a data scientist-focused review on the same book if that's useful: https://medium.com/@rishabhkabra/book-review-the-book-of-why...


Judea Pearl's home page for the book might also be worth posting here: http://bayes.cs.ucla.edu/WHY/


Do Judea Pearl's other books overlap with The Book of Why?


Yes. To me The Book of Why is sort of an approachable summary of his whole career culminating in his work on causal inference.



