Interpretable Machine Learning (christophm.github.io)
192 points by skilled on July 10, 2019 | 23 comments



"Interpretability" is probably one of the most misunderstood topics in ML. The key with interpretability is that it's actually an Human Computer Interaction (HCI) problem, not a statistics or math problem.

The problem is that people have particular tasks they need to accomplish, and the goal of interpretability techniques is to help them solve those tasks. This is somewhat similar thematically to how a keyboard layout can help someone achieve their typing speed goals. The two main tasks that interpretability is usually geared towards are model error detection and hypothesis generation. In model error detection, the goal is to help a human better judge (i.e. with higher accuracy) whether or not a model's prediction is incorrect. Hypothesis generation is more about generating testable causal hypotheses that can then be acted on to change the outcome.

The real trouble in the interpretability field is that very few people actually evaluate on those end tasks. The only paper I know of that did such an evaluation had a negative result (https://arxiv.org/abs/1802.07810), in that the "interpretability" did not improve people's ability to detect incorrect model predictions. As it stands, there is currently no reason to believe that any interpretability techniques actually work in helping people achieve the tasks they care about.


This is plain wrong.

https://arxiv.org/abs/1602.04938

A simple counter-example: a really neat library called "eli5", and specifically its white-boxing functionality for NLP and CV applications, does a great job of helping one figure out whether a model is learning "incorrect" information in its prediction pipeline. The paper I link shows why this does a good job of improving user trust in models, which speaks to your point on human-computer interaction.

https://eli5.readthedocs.io/en/latest/tutorials/black-box-te...

For instance, one can run the same TextExplainer algorithm on a word-embedding-powered model, which encodes its features in some N-dimensional space that doesn't correspond to anything a human understands directly. TextExplainer can reverse-engineer which words were most important for a classifier predicting sample x as class y, when this would otherwise be difficult to do (as is the case for basically all state-of-the-art NLP models).
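For anyone curious, the workflow is roughly this. A minimal sketch in the spirit of the tutorial linked above; `pipe` is assumed to be some already-fitted scikit-learn text pipeline with predict_proba, and `doc` a raw document you want explained:

    from eli5.lime import TextExplainer

    # `pipe` is an assumed fitted black-box text pipeline (e.g. embeddings + classifier),
    # `doc` is one raw document whose prediction we want to explain.
    te = TextExplainer(random_state=42)

    # TextExplainer generates perturbed versions of `doc`, queries the black box,
    # and fits a local white-box surrogate model to those predictions.
    te.fit(doc, pipe.predict_proba)

    # Highlights which words pushed the prediction towards each class.
    te.show_prediction()

    # How faithful the surrogate is to the black box (effectively a
    # hyper-parameter of the explanation, as mentioned below).
    print(te.metrics_)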

All you need to do is comb through your model's erroneous predictions with TextExplainer and you can figure out what "incorrect" information is being learned. I found out this way that punctuation is pretty bad for generalizable NLP, and should be preprocessed out in most circumstances. I can take the really simple visualizations and show them to managers and suits to justify why a model was wrong somewhere. This is value gained from interpretability.

This absolutely is an interpretability technique which actually helps people achieve a task they care about. The only caveat of this method is that the white-box model which imitates the black-box model isn't 100% accurate, but its accuracy is a hyper-parameter.

P.S. If you look into who the developers of eli5 are: it's an agency which takes money from DARPA and works with SIGINT agencies. eli5 and TextExplainer can work on any type of black-box model. Spooks want to crack open the models. That should be your signal that there's valuable information to be gained from interpretable machine learning.


As I understand it, DARPA funds a lot of things and they aren't that stingy with money either, so I wouldn't view that as some sort of golden indicator of importance.

For instance, the DARPA grand challenge is an impressive source of footage of robots falling over for no apparent reason...


I was doing academic research on this exact topic, and Chris's is the only book I found that breaks down the techniques one at a time this way. This book is nothing short of brilliant.


Agreed, this is really a lovely, helpful book that's clear and easy to read, and introduces a lot of useful ideas in enough detail to decide which one to use in a given situation.


brilliant? really? who didn't already know that linear models and random forests were interpretable and that you can look at feature layers in nets?

I'm not usually a naysayer but I feel like https://youtu.be/FSubdmYGVEI?t=105


To be fair, linear models are statistical models, and many of those are explainable.

Random forest is not explainable per se... A decision tree is explainable, but an ensemble of them is not, at least not in the sense that statistical models are.

In linear models, the coefficients give you a numerical sense of the linear associations, whereas a random forest gives you OOB error and feature importances, which aren't as clear cut. It's also really dependent on the quality of the data, and most machine learning methods that aren't statistical models depend heavily on a large amount of data to overcome the model's weaknesses (for random forests, selection bias).
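To make the contrast concrete, a minimal sketch on toy regression data (nothing here is specific to any real dataset):

    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression
    from sklearn.ensemble import RandomForestRegressor

    # Toy data with a handful of informative features.
    X, y = make_regression(n_samples=500, n_features=4, noise=10.0, random_state=0)

    lin = LinearRegression().fit(X, y)
    rf = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=0).fit(X, y)

    print(lin.coef_)                # signed effects: direction and per-unit magnitude
    print(rf.feature_importances_)  # relative importances: no sign, no units
    print(rf.oob_score_)            # out-of-bag R^2, a rough generalization estimate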

Statistics, to toot my own horn a little, has vast breadth and depth in inference/explainability, with a huge literature spanning different fields (e.g. econometrics with time series, biostatistics with longitudinal data, survival analysis, etc.). There are also techniques for building parsimonious models, like logistic regression with purposeful selection by Dr. Lemeshow and Dr. Hosmer, and different ways to build inference models versus predictive/forecasting models.

I think many people know the general ideas but not in depth, because it's a field with many deep rabbit holes.


Nice perspective in "Interpreting AI Is More Than Black And White": https://www.forbes.com/sites/alexanderlavin/2019/06/17/beyon...


Nice book, and already included in this awesome list: https://github.com/r0f1/datascience.


Something that is often left out when talking about interpretability is how the relationships between the predictors and the dependent variable behave under simulation. For example:

- I fit a complex, difficult-to-interpret model to a dataset, attempting to forecast my sales (the structure of the dataset is largely irrelevant for this example)

- I take an entry from the training set and decrease the value of some price attribute by 15%, leaving everything else unchanged

- I try to predict the sales for the entry I just created using the trained model

- What happens if the model now predicts lower sales? The model has learned a clear relationship between price and sales volume, but it points in the opposite direction from what you'd expect. Would lowering my prices by 15% really lead to a decrease in my sales? How do you track what's happening in the model to create this forecast? Did I use the wrong model? Was my training data incorrect? How do you explain this to a client or to a product user? A quick perturbation check like the sketch below makes this failure mode easy to catch.
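A minimal sketch of that kind of sanity check; `model`, `X_train` and the `"price"` column are hypothetical placeholders for whatever you actually fit:

    # Take one training example and re-predict with only the price changed.
    row = X_train.iloc[[0]].copy()      # keep it as a one-row DataFrame
    baseline = model.predict(row)[0]    # forecast at the original price

    row["price"] *= 0.85                # 15% price cut, everything else unchanged
    perturbed = model.predict(row)[0]   # forecast at the lower price

    if perturbed < baseline:
        print("Model forecasts LOWER sales after a price cut -- worth investigating "
              "the model, the features, or the training data before trusting it.")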


Wow, this is great - I wish I'd had this book a year ago. I actually ended up making ICE plots (book chapter: https://christophm.github.io/interpretable-ml-book/ice.html) without knowing this was something people did.
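For anyone who hasn't seen them, an ICE curve is just the model's prediction for one individual as a single feature is swept over a grid while everything else is held fixed. A hand-rolled sketch, assuming a fitted `model` and a NumPy feature matrix `X` (recent scikit-learn versions also expose this via PartialDependenceDisplay with kind="individual"):

    import numpy as np
    import matplotlib.pyplot as plt

    j = 0                                              # index of the feature to sweep
    grid = np.linspace(X[:, j].min(), X[:, j].max(), 50)

    for row in X[:100]:                                # a subsample of individuals
        repeated = np.tile(row, (len(grid), 1))
        repeated[:, j] = grid                          # vary feature j, hold the rest fixed
        plt.plot(grid, model.predict(repeated), alpha=0.3)

    plt.xlabel(f"feature {j}")
    plt.ylabel("model prediction")
    plt.show()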


> Interpretable Machine Learning

...is called statistics. At least, truly interpretable machine learning is. The whole point of that 'machine learning' (or god forbid, 'data science') moniker, as opposed to 'computational statistics', is a way of saying "we have no idea what we're doing!" It's magic!


Not really. I get the dislike for the hyped up terms but there are many companies that work in ML/data science (and hire ML engineers or data scientists) that do know what they're doing. And there are distinctions, though sometimes subtle, between the terms.


Sure, I see what you mean, but he is trying to help people who are navigating "machine learning", whatever that is. Your argument is the same as "well, why did they call it gravity, shouldn't they have called it attraction?" It is what it is, and you teach based on the term already in the literature.


It's not really "magic". It's just that interpretation is not really that important sometimes. It's the approach of "Who cares what these coefficients mean, the model performs well on the test set". This is true for most of predictive modelling.


I think the paper "Explaining Explanations in AI" by Brent Mittelstadt (https://arxiv.org/abs/1811.01439v1) gives a really good overview of the different goals and approaches.


I cannot stress how important this stuff is for machine learning in production systems. If your ML problem interacts with a client, interpretation matters, and this book gives you a good collection of tools to get started with!


Tangential, but I like the website. Has a similar look and feel to the Rust book. Same backend?


GitBook. In this case it uses an R package to render it.

https://github.com/christophM/interpretable-ml-book#renderin...


Nice, except for the first short story about medical pumps! It was quite weird. :D


The "Interpretability" part is really just philosophy.


Philosophy, at least the real kind, isn't "just" anything. It is almost by definition the deepest and widest study of what's possible to study, with all its consequences and offshoots, like the sciences.


Considering the issues posed by adversarial examples, it's really not.



