Hacker News
The Mythos of Model Interpretability in Machine Learning (acm.org)
67 points by alanfranz on July 19, 2018 | 27 comments



It's funny, and characteristic of that part of the field, that mathematical provability is not even mentioned explicitly.

A model (and a learning algorithm) should count as interpretable if we can clearly state the assumptions, and prove that under these assumptions, we get what we claim to get.

We can predict the movement of the planets (short term). This is interpretable, because we only assume a model of 3D space and Newton's laws(1). The rest is mathematics, which gives us elliptical trajectories, etc. The resulting predictions may be wrong, due to wrong assumptions, but the model is still interpretable.

Linear regressions, histograms, decision trees(2) are all interpretable. We know exactly what they do, under proper assumptions. Yes, these models can sometimes be manipulated. But we know this precisely because we know exactly what they do. Whether we should use these models, or statistics in general, in courts etc., is an ethical decision, and perhaps a practical one, similar to whether we should use the death penalty. That has nothing to do with interpretability.
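
To make that concrete, here is a rough scikit-learn sketch (the iris data and the tiny tree are just stand-ins, not anything from the article): the fitted tree prints as explicit if/else rules, and a linear regression reduces to a handful of coefficients.

    # Rough sketch: a small fitted tree and a linear regression can be read off directly.
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LinearRegression
    from sklearn.tree import DecisionTreeClassifier, export_text

    iris = load_iris()
    X, y = iris.data, iris.target

    tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
    print(export_text(tree, feature_names=list(iris.feature_names)))  # explicit if/else rules

    reg = LinearRegression().fit(X[:, :3], X[:, 3])  # predict petal width from the other features
    print(reg.coef_, reg.intercept_)                 # one weight per input feature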

With CNNs, at the moment we have only a very vague understanding of both the assumptions and the models. When we do have this understanding, we will likely also have better and simpler models.

Mathematics has always been a tool for understanding reality. Proving is understanding. And it has been pretty successful so far.

(1) and some other things -- the rest of the planets are far enough away, etc. (2) given enough data relative to the size of the tree, under proper distributional assumptions.


I don’t agree with this. For example, you can set up a collection of assumptions to underpin frequentist statistics, and then create a set of theorems about consistent or unbiased estimators, and develop a theory like that of p-values.

But then in a practical setting, the model doesn’t correspond to something physical or to the inference goal of a practitioner. The p-value tells you something about the relative extremity of a certain statistic that, under certain assumptions, will have a particular distribution.

The practitioner wants to know the posterior probability of a particular model or hypothesis given the data, and the frequentist outcome literally can’t comment on it.

In this sense I think being able to state assumptions and connect them to outcomes with theorems is good, but not always necessary or even sufficient for applied work.

And nasty subjective aspects of the problem can creep in, aspects that are uniquely defined by the specific inference goals at hand. Proofs about how a model would behave under assumptions are often totally useless in these cases. Practitioners don’t use linear regression for complex financial models because the assumptions hold, or because of nice properties it would have if the assumptions held. They use it because it’s simple and easy and sort of “just works” despite glaring flaws.


I don't think there's a contradiction. My comment was about what it means to understand the model, or perhaps about what it means to do science, but not about what is necessary or sufficient for applied work.

But since we are talking about it: yes, science proceeds by modifying, or sometimes totally abandoning, the assumptions. There is no silver bullet, and the point is to assume as little as possible.

In general, if things "just work", that's not a reason to abandon attempts to understand them. Things "just work" until they don't, especially in finance.

Consider two guys. Guy A predicts that the sun rises every day and is there until dusk. It "works", and so, as the joke goes, he does not touch anything. Guy B knows physics and has telescopes, and he can predict solar eclipses. He knows the failure modes of Guy A's model. Consider the difference between the two.


I feel like you’re arguing against yourself here. Guy A’s shallow model would easily be considered more interpretable than Guy B’s sophisticated astronomical physics model, though Guy B’s model is clearly able to articulate more detailed predictions about what might happen. Guy B’s model is less interpretable, but because it “just works” in a vastly greater number of ways (can be applied to the moon, constellations, other planets), people use it or care about it.

If all that astronomy offered was a more verbose description of why the sun rises, yielding literally zero different predictions from Guy A’s super simple model, no one would care, and might call Guy B a witch!

I’d argue that what matters for science is pure predictive efficacy. That’s it. If you can explain something, it means you can accurately predict something about an unknown state of affairs that would falsify your model if your prediction is wrong. That’s it. Any other kind of explainability is just a matter of linguistic convenience.

Of course there is overfitting, etc., but that’s just part of refining the model to yield greater predictive accuracy on unseen data (less generalization error). It’s still all about putting your predictions where your mouth is.

If the differing theories of Guy A and Guy B cannot be separated by actually testing the predictions they make, then there is no “interpretability” — just linguistic hand waving.

Incidentally I think a lot of modern focus on interpretability is actually about how to be a political gatekeeper or taste-maker through linguistic hand waving, and is not about developing models that better survive the rigors of being required to make real predictions about states of affairs.


>But then in a practical setting, the model doesn’t correspond to something physical or to the inference goal of a practitioner.

How does this invalidate the notion of interpretability?

A model can be both wrong/misapplied and fully interpretable.


> “How does this invalidate the notion of interpretability?”

Who said that it did?

I’m saying the development of a model that yields scientific progress doesn’t have to have a connection to the parent comment’s proposed definition of interpretability, and may have competing interpretability concerns driven by specific inference goals that have nothing to do with proofs about the model under assumptions.


>I’m saying the development of a model that yields scientific progress doesn’t have to have a connection to the parent comment’s proposed definition of interpretability

Let's say someone used a neural network to predict planetary positions as a function of time, mass and so on. They made a network with several hundred parameters, trained it until the loss stopped improving, and published their final state vector.

Would that ever lead to gravitational theory?


What do you mean by “gravitational theory” apart from “correctly accounts for predicting the gravitational effects that we observe”? If a neural net can do that, then yes, a perfectly “elegant” theory could be the giant enumeration of a big ugly bunch of parameters. Of course we might suspect that’s overfitting and whittle it down to a compact description, and that’s great, but that doesn’t make the ugly enumeration of a big model any less scientific.

Another point of view would be to say how does the Wiles proof of Fermat’s Last Theorem give us “understanding” (or substitute virtually any non-constructive existence proof, or proofs by contradiction, such as for the Halting Problem).

Compact math descriptions are good, don’t get me wrong. When the mechanism of some data-generating process is actually oriented in such a way that we can get a concise math description, then we absolutely should. But that is largely unrelated to advancing understanding or scientific progress, which can perfectly well take the form of some ugly enumerated garble of parameters, or some extremely piecewise solution space. Not every data-generating process is required to admit a concise mathematical description.


>What do you mean by “gravitational theory” apart from “correctly accounts for predicting the gravitational effects that we observe”?

F = G (m1 * m2) / r^2

Newton's theory of gravitation identified the relevant variables and their relationships. It had no junk variables or parameters. You can do algebraic transformations on it and use the formula outside of the original context. It was not fully correct, but it made exact predictions. People were able to notice the discrepancies it had with the orbits of some celestial bodies, which helped in identifying the general theory of relativity.
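
To make "use the formula outside of the original context" concrete, here is a trivial sketch (using standard textbook values for the constants): plug the Earth's mass and radius into the same law and you recover the familiar surface gravity.

    # Newton's law reused outside the planetary-orbit setting: g = G * M / r^2
    G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
    M_EARTH = 5.972e24   # kg
    R_EARTH = 6.371e6    # m

    g = G * M_EARTH / R_EARTH**2
    print(round(g, 2))   # ~9.82 m/s^2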

All of this would have been lost if the formula were a state vector of a large neural network.


> “Newton's theory of gravitation identified the relevant variables and their relationships.”

No, it introduced a useful low-dimensional fiction that approximately works, with high accuracy at coarse scales, and doesn’t account for relativistic or quantum effects (and so, in an absolute sense, is only an approximation, not an interpretation of what is real).

Besides, what you glibly type out as a few characters of a formula unpacks into a deeply hard-to-explain theory in natural language. It’s vastly less interpretable than simpler (and less correct) models, because it introduces much more complexity in the meaning and physical concepts behind the formula’s terms. Just as quantum or relativistic extensions add even more complexity and counter-intuitive concepts and become even harder to interpret.

> “All of this would have been lost if the formula were a state vector of a large neural network.”

Why? I see no reason why that would be true. If there is a very low-dimensional manifold in parameter space that accounts for all the predictive power, that would be a very easy thing to diagnose from a neural network model. Lots of things have been discovered that way: graph centrality metrics coming from eigendecompositions, style transfer, GloVe vectors, eigenfaces.

Getting the high-dimensional representation that yields accurate predictions is often step 1, great scientific progress and often a huge leap forward. Then step 2 is often asking if there is some natural low-dimensional compression of the parameters that is approximately as good, and if so, does it correspond to any known quantities.
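
As a rough sketch of that step 2 (W below is just a random stand-in for some learned weight matrix, not a claim about any particular model), a plain SVD already tells you whether a handful of directions carries most of the structure:

    # Rough sketch: is a learned weight matrix W effectively low-rank?
    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(size=(512, 64)) @ rng.normal(size=(64, 512))  # stand-in for learned weights

    s = np.linalg.svd(W, compute_uv=False)        # singular values, largest first
    energy = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(energy, 0.99)) + 1    # directions covering 99% of the energy
    print(k)  # a small k hints at a low-dimensional structure worth interpreting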

Though sometimes this isn’t possible or even desirable, such as with non-parametric models, and yet such models can still represent “interpretable” scientific progress in the sense of allowing us to predict how states of affairs will develop when we previously couldn’t and solving inference goals.


Science (and especially mathematics) is all about increasing human understanding. A neural network that makes correct predictions would be very interesting to both engineers and scientists.

From a scientific perspective, the fact something can be modeled well using a neural network would provide interesting constraints on the system, which may eventually yield breakthroughs in understanding the original system. However, the neural network's parameterized equation by itself would never be accepted as good science, since its complexity would hardly contribute anything to our understanding.


> If there is a very low-dimensional manifold in parameter space that accounts for all the predictive power, that would be a very easy thing to diagnose from a neural network model.

People wouldn't waste time researching neural network visualization tools, coming up with adversarial examples and developing alternative (more explainable) machine learning models if that was the case.
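
(For what it's worth, a bare-bones FGSM-style sketch is all it takes to manufacture an adversarial example; here model, x and y are assumed to be a trained PyTorch classifier and one labelled input batch.)

    # Bare-bones fast gradient sign method (FGSM) sketch.
    import torch
    import torch.nn.functional as F

    def fgsm(model, x, y, eps=0.03):
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        # nudge every input a tiny step in the direction that increases the loss
        return (x + eps * x.grad.sign()).detach()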


Along the same lines, convergence is mentioned as a candidate for interpretability but convexity is not mentioned even once.


This is a very interesting notion. Could you elaborate on why convexity would be a candidate? Convexity defining a model class is clear, but I have not heard of it as a proxy for interpretability before. I would assume this would need some measure of the degree of convexity, different from the strong > strict > convex hierarchy.


You may prefer this link: https://arxiv.org/abs/1606.03490


extra points for linking to abstract, not pdf :)


I think interpretability is much more correlated with model size than model type.

A small neural net is much more interpretable than a decision tree with thousands of nodes.


I challenge that last statement. Have there been any neural nets that actually have solid interpretability? Usually the results are more along the lines of effectiveness on training and validation data, with no real clue as to what the driving features were.


https://distill.pub/2018/building-blocks/ This seems pretty good to me. And the nets are not exactly as small as I meant.

I work with random forests and build forests which have more than 80,000 nodes per tree. Other than some basic computation of feature importance, it is a black box on the same scale as modern neural nets, maybe even worse.
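
(The "basic computation of feature importance" I mean is essentially this scikit-learn sketch; the breast-cancer dataset here is just a placeholder for my data.)

    # Impurity-based importances: roughly the only cheap summary a big forest offers.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier

    data = load_breast_cancer()
    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(data.data, data.target)

    top = sorted(zip(data.feature_names, forest.feature_importances_), key=lambda t: -t[1])[:5]
    for name, imp in top:
        print(f"{name}: {imp:.3f}")   # top-5 features by mean impurity decrease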


Thanks for the link. Will take me some time to digest it. Last time I looked at one of these, it was less informative than I'd care to admit. I want to know why a classifier found a vase. The answer is typically some form of "because it was able to see the vase."

It would be more interesting to see a classifier that predicts something is going to happen. A predictor that can tell a person is about to step off of a curb, for example. Is it the pose of the person? Did it require seeing multiple frames of the person, such that it was an inertia predictor?

So, yes, comparing things to gigantic trees can make things tough. But I thought that was the beauty of boosting and building up smaller trees. Most of them are usually more interpretable than you might expect.


> So, yes, comparing things to gigantic trees can make things tough. But I thought that was the beauty of boosting and building up smaller trees. Most of them are usually more interpretable than you might expect.

Regarding your last paragraph, I found this paper https://arxiv.org/pdf/1504.07676.pdf worth reading. Excerpt from the abstract: "We conclude that boosting should be used like random forests: with large decision trees and without direct regularization or early stopping."


> A small neural net is much more interpretable than a decision tree with thousands of nodes.

Yes. For example, a simple logistic regression model.
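
A rough sketch of why (the breast-cancer data is just a stand-in): after standardizing, each feature gets exactly one weight, and exponentiating it gives an odds ratio you can read directly.

    # Logistic regression: one readable coefficient (log-odds) per feature.
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    data = load_breast_cancer()
    clf = make_pipeline(StandardScaler(),
                        LogisticRegression(max_iter=1000)).fit(data.data, data.target)

    weights = clf[-1].coef_[0]
    for name, w in sorted(zip(data.feature_names, weights), key=lambda t: -abs(t[1]))[:5]:
        print(f"{name}: weight {w:+.2f}, odds ratio {np.exp(w):.2f} per std-dev")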


An alternative proposal, put out by NYU's AI Now think tank, is to require a kind of "environmental impact statement" for any black box used in government or public-sector applications. Instead of rolling out an experimental predictive-policing agent en masse, for example, a "sandbox" is created first, and actual humans in the loop judge whether it is prone to bias, unfairness or harm.

Public accountability is designed into the algorithm at the outset.

https://ainowinstitute.org/reports.html


I thought this was a good post!

In fact, I had just read a CVPR '18 paper that did the kind of thing he mentioned--presented an 'interpretable neural network' that assumed what interpretability meant...


Anyone have other good references for this topic?


Interpretable ML [1] is a small online book on interpretable methods. Even if the content itself is rather shallow, it has links to a lot of more focused papers on the topic. Towards A Rigorous Science of Interpretable Machine Learning [2] is one of the most thorough papers on interpretable ML that I have come across. The main author, Finale Doshi-Velez, has done a lot of interesting work on interpretability [3].

[1] https://christophm.github.io/interpretable-ml-book/

[2] https://arxiv.org/abs/1702.08608

[3] https://finale.seas.harvard.edu/publications


Here is a great Caltech lecture about why "simple" models have better out-of-sample performance. Jump to 28:00 if you want to get straight to the point.

https://www.youtube.com/watch?v=EZBUDG12Nr0&list=PLD63A284B7...



