Hacker News
The Slippery Math of Causation (quantamagazine.org)
124 points by dfee on May 30, 2018 | hide | past | favorite | 23 comments



Since the OP is prompted by Judea Pearl's new book, I'll ask here. There seem to be at least two schools of thought in causal statistics. The first is championed by Judea Pearl [1,2,3] and the other by Donald Rubin [4].

If I want to learn causal statistics, for use in ML, which school of thought would be more useful? I don't mean to prompt any causal flame wars, but it isn't obvious which approach is more useful.

    [1] https://www.amazon.com/Book-Why-Science-Cause-Effect/dp/046509760X
    [2] https://www.amazon.com/Causality-Reasoning-Inference-Judea-Pearl/dp/052189560X
    [3] https://www.amazon.com/Causal-Inference-Statistics-Judea-Pearl/dp/1119186846

    [4] https://www.amazon.com/Causal-Inference-Statistics-Biomedical-Sciences/dp/0521885884


I studied Pearl and Rubin in college and didn't find them to be particularly antagonistic, though indeed they use somewhat different terminology and emphasize different things.

If you'd like to learn more, just use https://www.hsph.harvard.edu/miguel-hernan/causal-inference-.... It's an awesome resource and free.


Start with Rubin. Potential outcomes are more grounded and concretely interpretable. The epistemology is simple and won't get in your way, if you're inclined to dwell on those things. It also has the huge wealth of well developed techniques. Then move on to Pearl.

(btw this is in no way a "standardized" answer, just my opinion about how to start with the basics)
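To make the potential-outcomes framing concrete, here is a minimal sketch (all data and numbers made up for illustration) of why randomization lets a simple difference in means estimate the average treatment effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Each unit carries two potential outcomes: Y0 (if untreated), Y1 (if treated).
y0 = rng.normal(0.0, 1.0, n)
y1 = y0 + 2.0                      # true treatment effect of 2 for every unit

# Randomized assignment makes treatment independent of (Y0, Y1) ...
t = rng.integers(0, 2, n)
y_obs = np.where(t == 1, y1, y0)   # ... but we only observe one outcome per unit

# Difference in group means then estimates the average treatment effect E[Y1 - Y0].
ate_hat = y_obs[t == 1].mean() - y_obs[t == 0].mean()   # close to 2.0
```

The "fundamental problem of causal inference" is visible in `y_obs`: the counterfactual outcome for each unit is never observed, and randomization is what licenses the group comparison.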


If there is more than one "school of thought," then there cannot be any real difference[0] between their results: if there were, that difference would count as an empirical way to resolve between them. Everything else is metaphysical paint and should be taken according to taste.

[0] In saying this I'm assuming neither one is outright wrong.


Are you familiar with Thomas Kuhn? I ask because he wrote a book basically arguing that's not how science works at all.

Because the range of phenomena to be explained is too large for any theory, you cannot test, or even hold in your mind, the whole of "what each theory says."

Schools of science form into communities which determine what is deemed "in scope," what the grounding frameworks and concepts of the theories that explain the foundational phenomena are, what is informally out of scope, and where the border territory of active research anomalies lies.

I only bring it up because I think it's one of the few works of genius, with real insight into how science and knowledge actually work in practice.


>Because the range of phenomena to be explained is too large for any theory, you cannot test, or even hold in your mind, the whole of "what each theory says."

At the far end of "models work very well and the epistemology is solid," physics solves this problem by assigning different people to each phenomenological class (organized by the engineering similarity of the experimental devices needed to probe them) and then using math to check that everyone's individual confirmation of the theory in their area fits in correctly to the bigger picture. As a result even though the frontier of physics is too large for any one person to know, the confirmation of the standard model has been built into an unbroken surface that reaches all the way from the highest energies achieved to chemistry and astronomy.

When you have a theory like that, you can prove mathematically that it is equivalent to other theories. Then, the enlightened can stop arguing about which one is "truer!" If you have two theories that only exist in the form of English sentences (this was true of psychology in Freud's era) I can't imagine what an equivalence proof would look like, even if one were possible. Fields where you can't formalize anything tend to have philosophies that look more and more like critical theory as you move further and further from pure logic. At the far end, the empiricism is completely phenomenological and the theory is nothing but literature with no predictive power (and as a result, the only way to choose between theories is by disguising aesthetic arguments as appeals to this-or-that). I can't think of any fields that didn't look like that in their infancy, and success has usually been associated with progress away from that.


> physics solves this problem by assigning different people to each phenomenological class (organized by the engineering similarity of the experimental devices needed to probe them)

That's an example of how multiple formulations of the same theory can be useful for different things, even though they are ultimately equivalent.


But one way can be vastly easier to get results from than another. Even easy but wrong ways can be useful, such as Newtonian mechanics. (It's computationally infeasible to predict, e.g., robot movements with quantum mechanics.)


Newtonian and quantum mechanics deliver different answers, which means that without approximation they are different theories, not different schools of thought. In a certain limit the two agree, and take the same amount of work to find answers: quantum mechanics yields Newtonian mechanics in that limit, and doing "room temperature, large scale" quantum mechanics amounts to writing down Newton's laws. In this sense it's most accurate to say that Newtonian mechanics is a particular approximation to quantum mechanics that holds under certain circumstances.
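For what it's worth, the "certain limit" can be made precise with Ehrenfest's theorem (standard quantum mechanics, added here only for illustration): expectation values of position and momentum obey

```latex
\frac{d}{dt}\langle \hat{x} \rangle = \frac{\langle \hat{p} \rangle}{m},
\qquad
\frac{d}{dt}\langle \hat{p} \rangle = -\left\langle \frac{\partial V}{\partial x} \right\rangle ,
```

and for a wavepacket that is narrow on the scale over which V varies, \(\langle \partial V / \partial x \rangle \approx V'(\langle x \rangle)\), so the expectation values satisfy Newton's second law.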

This other school-of-thought business (Bayesians vs. frequentists, pilot waves vs. many worlds vs. Copenhagen, everything like them) means choosing between different sets of words to describe the same thing. You are left with one of two cases: either they can be shown equivalent (in which case people are prone to keep on arguing over which one is better), or they are different (in which case one is wrong).


Just to throw in a metaphor: rather than Newtonian and quantum mechanics, perhaps the schools of thought are like the differing formulations of classical mechanics, i.e. the Newtonian vs. Hamiltonian vs. Lagrangian formulations.


The minimum length of a proof has no bound that is independent of the language in which it is expressed: change the language, and the shortest proof can grow or shrink without limit. This means there almost certainly are things that can only be communicated in one language but not another, even if the languages are equivalent in some vague limiting sense.

So as a practical matter, you have to work in a better language, because the difference between spending 30 years to learn something and 30,000 years is quite significant.


Should any of you care for a more philosophical treatment of the idea of "cause", then I can wholeheartedly recommend:

Bertrand Russell, On the Notion of Cause

Proceedings of the Aristotelian Society

New Series, Vol. 13 (1912 - 1913), pp. 1-26

It's also surprisingly funny.

https://www.jstor.org/stable/4543833


Here is a copy of the Russell paper, free for all:

https://users.drew.edu/jlenz/notion-of-cause/br-notion-of-ca...

Thanks for calling attention to this - it is interesting (and funny).


This falls under the area of philosophy known as causal processes.

[1] contains a good overview of this that presents Russell and his critics (Salmon et al.). (I prefer that philosophy be taken in context rather than from the mouth of a particular philosopher.)

1: https://plato.stanford.edu/entries/causation-process/


Just wanted to throw out a related Quanta article about Sugihara's method: https://www.quantamagazine.org/chaos-theory-in-ecology-predi...

Sugihara's method is a way of quantifying causal relationships in nonlinear systems with attractors that would be cumbersome to model (hence the "equation free" description).


The problem with Causation generally is that we choose arbitrary stopping points for "root causes," and fall into the Fallacy of the Single Cause.

Pearl's Bayesian networks and do-calculus give us a way to compute likelihoods around different observed events and gain more insight into these mechanisms, but from my reading so far, they never give a concrete criterion for when we should determine an action is not relevant.
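As a sketch of what the do-calculus buys you in the simplest case, here is the backdoor adjustment over a single made-up binary confounder Z (hypothetical numbers, not an example from Pearl's books):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical confounded system: Z causes both X and Y.
z = rng.integers(0, 2, n)
x = (rng.random(n) < np.where(z == 1, 0.8, 0.2)).astype(int)
y = (rng.random(n) < 0.3 + 0.2 * x + 0.3 * z).astype(int)  # true effect of X is +0.2

# Naive observational contrast, biased upward because Z raises both X and Y.
naive = y[x == 1].mean() - y[x == 0].mean()

# Backdoor adjustment: P(Y=1 | do(X=x)) = sum_z P(Y=1 | X=x, Z=z) * P(Z=z)
def p_do(x_val):
    return sum(y[(x == x_val) & (z == zv)].mean() * (z == zv).mean()
               for zv in (0, 1))

adjusted = p_do(1) - p_do(0)   # close to the true +0.2; naive is much larger
```

The adjustment only works because Z closes the one backdoor path in this toy graph; which variables to adjust for is exactly what the backdoor criterion decides.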


> never give concrete solutions to when we should determine an action is not relevant

Not relevant to what? What is the specific problem that Pearl does not resolve?


If a catastrophic event occurs as a result of multiple necessary conditions, and one of those conditions (even if not sufficient by itself) is that some human was doing something they shouldn't have been doing, then of course it's that human's fault. Cause follows blame. We basically use "cause" as a synonym for "blame" in that situation.

If there is no human there to blame, do we then say that the cause is a coincidental combination of multiple spontaneous conditions, all necessary but none sufficient by itself?


I don't think that holds up if you reduce to simple examples that are easily analyzed. If a billiard ball is pushed, it is the contact of the thing that pushed it which caused its motion, an equal and opposite reaction by Newton's 3rd law. If you did statistical analysis you'd discover that the chances of a ball moving depend very highly on whether the last ball to end up in a net was matching suit (colored or striped). But that's not a causal connection in the same sense. There's a causality in this level of physics that works perfectly well. It's not a coincidental combination of multiple spontaneous conditions that the ball moved -- it moved because it got pushed, and if it hadn't been pushed it wouldn't have moved.
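The observational/interventional distinction being drawn here can be simulated directly. A toy sketch (generic hypothetical variables, not the billiards setup): an indicator I is correlated with an event E only through a common cause C, so seeing I is informative but forcing I changes nothing:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

def simulate(force_i=None):
    c = rng.random(n) < 0.5                 # hidden common cause
    i = c if force_i is None else np.full(n, force_i)  # indicator, normally driven by C
    e = c & (rng.random(n) < 0.9)           # event: driven by C, never by I
    return i, e

# Observation: P(E | I=1) is high, because seeing I=1 tells us C happened.
i, e = simulate()
p_obs = e[i].mean()                         # about 0.9

# Intervention: setting I=1 by force leaves E at its base rate of about 0.45.
_, e_forced = simulate(force_i=True)
p_do = e_forced.mean()
```

In the ball analogy, conditioning on the suit of the last pocketed ball is the first calculation; pushing the ball is the second.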


That seems to be the whole core of these discussions: statistics, as a science, operates only with correlations and has nothing to do with causation. As long as one stays inside the domain of statistics, one can never say whether something is a causation or not. Causation is the subject matter of other sciences, which study the specific machinery of various causes.


I would suggest reading up on Judea Pearl’s Causality, which is a complete treatment of how to integrate a useful notion of causality into statistical analysis. It can be done; it’s just not common practice.




So has anyone looked into the issues involved in explaining this level of causation to AI systems? It seems this might be a lot of work, but interesting.



