Best Practices for Applying Deep Learning to Novel Applications

AndrewKemendo · on April 9, 2017

It's good general advice, but frankly I think it doesn't address some of the major pitfalls - namely the top one: personnel.

In the very beginning it's stated:

"In this report I assume you are (or have access to) a subject matter expert for your application."

In my experience this is where it goes off the rails for most of the crowd that the author is addressing. Not because they don't have someone, but because the person they have isn't really a "subject matter expert."

It's a muddy term anyway, especially in the field of Machine Learning. Excluding for a moment the hucksters and bald-faced liars, there is WIDE variance within ML in competence, domain specificity and application-specific capability.

The biggest capability gap that I have encountered when working with fantastic ML folks is that the ones who understand the mechanisms/algorithms/approaches best are actually pretty terrible at delivering production code. That's not for lack of capability, it's simply because the bulk of their time has been spent in research - so they approach things very differently than application-focused engineers. This is extremely relevant in this case because this is an application-specific paper.

There is a plethora of minefields in applications of ML, some of which are outlined here from a systems approach, but in my experience the majority are personnel issues and "culture" issues - not to be confused with the "culture fit" problems that exist elsewhere.

lnsmith · on April 9, 2017

I apologize for the confusion from my report. I intended it for people expert in their field with very little ML or DL expertise.

I am also guilty of focusing on my research but I am called upon frequently by others in DoD who are interested in applying DL to their problem. My report was an outgrowth of my consulting with these people who were interested in leveraging my expertise. I agree with jph00 who states he has seen non-ML experts successfully apply ML and get good results. So I wrote this report to give general/universal advice, even though it won't always apply and won't solve all their problems.

Leslie N. Smith

sndean · on April 9, 2017

> It's good general advice, but frankly I think it doesn't address some of the major pitfalls - namely the top one: personnel.

Just for some context [0]: Leslie works at a facility where there's (generally) no shortage of subject matter experts. Recently, within the last ~6 months, he's been pulled into at least a few groups where lots of biologists, chemists, and others who know essentially nothing about DL are trying to get projects off the ground. The shortage he's experiencing is probably on the ML/DL side.

[0] I work with Leslie and have received his advice on projects.

AndrewKemendo · on April 9, 2017

In fact, the USG broadly, and the DoD specifically, is taking a huge step forward to try and build ML into everything.

I was invited to talk with the Defense Innovation Board small group this last Tuesday with SecDef Mattis, Eric Schmidt et al. and it's actually becoming a HUGE deal, one that nobody inside the USG is prepared to address, so I'm not surprised Leslie is bombarded with these requests.

I'll go back to the point though that the USG is critically short of competent ML personnel.

sndean · on April 9, 2017

> I'll go back to the point though that the USG is critically short of competent ML personnel.

Definitely agree with that. And from what I've heard they're having a terrible time trying to hire. Especially where we're at, where a PhD is required, the USG won't be able to compete on pay (?).

AndrewKemendo · on April 9, 2017

It's a mix between these three things:

1. Being ethically opposed to working for the DoD

2. Terrible (relative) wages

3. Working environment dominated by bureaucracy

Some seriously big obstacles considering the options.

bagrow · on April 9, 2017

I think the author means a non-ML expert interested in using ML. For example, a geologist who wants to apply ANNs to her field data.

AndrewKemendo · on April 9, 2017

Well that's my point. A non-ML expert (your geologist) would need to find a SME to apply ML to their problem. The author seems to distinguish between these two roles rather than pushing the non-ML person into becoming a ML SME.

Said geologist would likely run into the problems I describe above if looking for that ML SME.

mindcrime · on April 9, 2017

I think everybody is kinda talking around each other here.

In this context, the geologist is the SME. The "subject matter" is the geological application, not Machine Learning. The whole point of this document is advice for people who are SMEs in some domain but not ML experts, and who are working on applying ML in their domain.

> This report is targeted to groups who are subject matter experts in their application but deep learning novices.

Of course it goes without saying that, in an ideal world, you'd have both domain SMEs and ML experts working on the problem together, but given that we don't live in an ideal world, the point here - at least as I read it - is to help the non-ML-expert folks get started applying Deep Learning.

And for all the stuff we can say about the difficulty of applying DL and the need for experts, the reality is that you don't actually always need a "deep learning expert". Let's say your problem just happens to be similar to, say, handwritten digit recognition. Given that anybody can download DL4J, go through the tutorials, and get a network going in an hour or two that gives something like 98%+ accuracy, there's a good chance that our "geologist who's a DL novice" can create a network that will yield useful results.

Maybe getting the last couple of percentage points of accuracy out will require a "real" DL expert, but hell, depending on the scenario, that might not even be needed.

AndrewKemendo · on April 9, 2017

I see what you're saying and upon re-reading I think it's ambiguous which SME is being referenced. I guess it depends on who the audience is, and I think we both assume the audience is a non-ML SME.

> the point here - at least as I read it - is to help the non-ML-expert folks get started applying Deep Learning.

If that is in fact the case, I would argue that this document is not really giving that user the best starting point - though they get points for trying.

As a practitioner perhaps I am biased, and our applications are on the boundaries of solved problems (though include some reasonably solved CV solutions) - however I would argue that unless your application is strictly the simplest and can use off the shelf solutions, for example simply implementing Google Cloud Vision API or the Microsoft Cognitive Computing API - you're going to need someone with years of study/practice with ML to get to a good outcome.

zodiac · on April 9, 2017

It is on the face of it a pretty weird phrase ("subject matter"), but I've always seen it used to mean what mindcrime means. The phrase is also used in general non-ML specific programming (eg among a team of people building a mobile app for a bank, the "subject matter expert" might be a non-programmer who has years of experience working for banks).

It's also disambiguated by the author saying "subject matter expert for your application" (application being an application of machine learning, eg to radiology, geology...), and saying "subject matter experts in their application but deep learning novices" in the paper's abstract.

jph00 · on April 9, 2017

I've seen dozens of examples of people with less than a year of ML experience get great outcomes from deep learning. It's no longer true that this is such an exclusive field.

gwern · on April 8, 2017

Looks like good advice to me. Like a more DL-focused version of "A Few Useful Things to Know about Machine Learning" http://www.datascienceassn.org/sites/default/files/A%20Few%2... , Domingos 2012

comicjk · on April 8, 2017

> Let’s say you want to improve on a complex process where the physics is highly approximated (i.e., a “spherical cow” situation); you have a choice to input the data into a deep network that will (hopefully) output the desired result or you can train the network to find the correction in the approximate result. The latter method will almost certainly outperform the former.

This aligns with my experience (computational chem PhD). When applying a strong, general-purpose mathematical patch to an existing model, use as much of the existing model as possible. Otherwise the patch will have a hard time fitting, and maybe be worse than what you started with. Philosophically, this also comports with my thinking (it's the modeling equivalent of Chesterton's Fence https://en.wikipedia.org/wiki/Wikipedia:Chesterton%27s_fence).
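To make the "learn the correction, not the whole thing" idea concrete, here is a minimal toy sketch. Everything in it is an assumption for illustration: `approx_physics` and `true_process` are made-up functions, and a degree-2 polynomial fit with NumPy stands in for the deep network as the general-purpose patch. It only shows the shape of the comparison, not a result about any real system.

```python
import numpy as np

def approx_physics(x):
    # Hypothetical "spherical cow" model: gets the oscillation right,
    # but misses a smooth drift term.
    return np.sin(3.0 * x)

def true_process(x):
    # Hypothetical ground truth = approximate physics + smooth correction.
    return np.sin(3.0 * x) + 0.1 * x**2

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 3.0, size=200)
y_train = true_process(x_train) + rng.normal(0.0, 0.01, size=200)
x_test = np.linspace(0.0, 3.0, 101)
y_test = true_process(x_test)

DEGREE = 2  # capacity of the stand-in "patch" model

# Option 1: fit the patch directly to the observed output.
direct_coeffs = np.polyfit(x_train, y_train, DEGREE)
direct_pred = np.polyval(direct_coeffs, x_test)

# Option 2: fit the patch to the residual left by the existing model,
# then add the existing model back at prediction time.
residual_coeffs = np.polyfit(x_train, y_train - approx_physics(x_train), DEGREE)
residual_pred = approx_physics(x_test) + np.polyval(residual_coeffs, x_test)

def rmse(pred, target):
    return float(np.sqrt(np.mean((pred - target) ** 2)))

rmse_direct = rmse(direct_pred, y_test)
rmse_residual = rmse(residual_pred, y_test)
print(f"direct fit RMSE:   {rmse_direct:.4f}")
print(f"residual fit RMSE: {rmse_residual:.4f}")
```

In this toy the residual is exactly the kind of smooth function the patch can represent, so the residual fit wins easily. With a real process the gap depends on how much structure the existing model already captures.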

DrNuke · on April 8, 2017

Physics is deterministic in states and always tends towards an equilibrium, so novel results from DL may still fit some continuum math model without being stable or even real. Domain expertise, on the other hand, helps prepare data for ML algos in such a way that results will come (or not) within the boundaries of reality and hopefully stability. I am trying both approaches for some materials science goals of mine and am curious to see what happens, now that powerful hardware is cheap enough on the cloud to put some ideas to work. Side point is all this was just impossible five years ago, so I am grateful and excited to have this opportunity.

bluetwo · on April 9, 2017

Was kind of hoping for some examples of novel applications no one has thought of. :-)

lngnmn · on April 9, 2017

Surprisingly good and sane paper, without all that hipster's bullshit.

The emphasis on the quality of the training data and, most importantly, on the evaluation and careful choice of the heuristics on which the model is to be built, is what makes the paper sane.

There is no shortage of models disconnected from reality and based on dogmas, while, it seems, there is an acute shortage of models that properly reflect some particular aspects of reality.

Data and proper, reality-supported heuristics (domain knowledge) are the main factors of success. Technical details and particular frameworks are of the least importance.

This, BTW, is why it is almost impossible to compete with megacorps - they have the data (all your social networks) and they have the resources, including domain experts, without whom designing and evaluating a model is a hopeless task.