Machine Learning Crash Course (developers.google.com)
1926 points by matant on March 1, 2018 | 222 comments



Looking through the topics covered, the standard AI-course caveats (https://news.ycombinator.com/item?id=16247629) apply.

Yes, AI/ML MOOCs teach the corresponding tools well, and the creation of new tools like Keras makes the field much more accessible. The obsolete gatekeeping by the AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience" is one of the things I really hate about the industry.

However, contrary to the thought pieces that tend to pop up, taking and passing a crash course doesn't mean you'll be an expert in the field (and this applies for most MOOCs, honestly). They're very good for learning an overview of the technology, but nothing beats applying the tools on a real-world, noisy dataset, and solving the inevitable little problems that crop up during the process.

Reviewing the Keras documentation (https://keras.io) and examples (https://github.com/keras-team/keras/tree/master/examples) is honestly a much better teacher of AI/ML than any MOOC, in my opinion.

(Of course, Keras is now a part of TensorFlow, so there's a neat Google vertical integration with this crash course!)


> The obsolete gatekeeping by the AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience" is one of the things I really hate about the industry.

It is absolutely true that you do not need a graduate degree to apply AI/ML to vanilla problems.

It is also absolutely true, in my experience, that you need a graduate-level education or years of hands-on experience to troubleshoot cases where AI/ML fails on a deceptively-simple problem, or to tweak an AI/ML algorithm (or develop a new one) so it can solve a novel problem.

That said, I think these MOOCs are good enough to get someone to a place where they can create nice /r/dataisbeautiful-style visualizations, or pair with a senior-level DS to deliver something.

(Edited to add folks who have worked on problems for years and add a final note.)


> It is also absolutely true, in my experience, that you need a graduate-level education or years of hands-on experience to troubleshoot cases where AI/ML fails on a deceptively-simple problem, or to tweak an AI/ML algorithm (or develop a new one) so it can solve a novel problem.

How much of that is critical domain-specific knowledge, and how much of that is just general engineering debugging/problem-solving experience, though? Certainly the person who does have the master's/PhD and a few years of applying that to real-world ML problems will have the edge, but an experienced developer who's got a knack for maths (though no direct ML experience) may be able to get up to speed quicker than you think. Part of that will be experience with knowing how and when to ask the right questions when you get stuck.


> How much of that is critical domain specific knowledge and how much of that is just general engineering debugging/problem solving experience though?

It's both, right? You pick up problem-solving techniques as a researcher or engineer; as the former, those techniques lean towards scientific problems. Your average engineer doesn't need to know about contrasting.

Again: it's possible to learn the necessary math in your spare time! I agree!! However, it's far easier to do it in a graduate program as a full-time job for 2-5+ years.


The knack for maths is the important bit.


The math necessary for ML/AI (statistics/vector calculus) is mostly taught at undergrad level though isn't it? So most engineers should already have it covered.


I can't help but think in 3-5 years how quaint our tools of the day will seem.


I think about this constantly.

Not to sound like I walked uphill in both directions back in my day or something, but I remember building models in numpy without pandas. It was tedious -- and that's just a nice API wrapping ndarrays!


> Not to sound like I walked uphill in both directions back in my day

Local minima?


Most likely, gradient descent with momentum.


Oh boy, that and perturbation.


I’m not so sure.

You can make an argument that current tools haven't really surpassed a Lisp Machine for developer productivity, or a Smalltalk environment.


Not really. I see things like leftpad and npm fails and CEOs mailing private keys they've stored from customers. I see the same lessons we have to re-learn year after year.


What's an example of a problem that needs that troubleshooting? (Curious)


Honestly? The exact problem I'm dealing with at work right now.

We're trying to re-write our recommender for artist music stations at iHeartRadio (aka "I'll listen to Drake or Kendrick Lamar's station at the gym today"). Just today, I tried adding negative sampling to the matrix I'm factorizing, hoping it encourages spread in the embeddings learned for artists in certain types of genres.

I have a MS, but not a lot of research experience. It would have taken me a while to find this solution on my own. However, the moment I described this problem to my manager - a PhD graduate with several years of research and industry experience - he immediately suggested negative sampling.

What I learned during my MS helped me grok the math immediately. We're adding noise to the training set and penalizing vector lengths to avoid overfitting. Easy! Identifying a solution worth exploring? Not easy, at least without a degree or significant experience.

(There's also the chance I should know this, in which case I have some reading to do. ¯\_(ツ)_/¯)
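For the curious, here's a rough sketch of the idea in plain numpy (a toy illustration of negative sampling plus an L2 penalty, not our actual recommender): for every observed user-artist pair, also sample an unheard artist as a negative example, and shrink the embedding vectors a little on each update.

    import numpy as np

    # Toy implicit-feedback matrix: rows = users, cols = artists (1 = listened).
    R = np.array([[1, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 0, 1, 1]], dtype=float)

    n_users, n_items, k = R.shape[0], R.shape[1], 2
    rng = np.random.default_rng(0)
    U = rng.normal(scale=0.1, size=(n_users, k))   # user embeddings
    V = rng.normal(scale=0.1, size=(n_items, k))   # artist embeddings
    lr, reg = 0.05, 0.01                           # learning rate, L2 penalty

    for _ in range(200):
        for u, i in zip(*np.nonzero(R)):
            j = rng.choice(np.where(R[u] == 0)[0])  # negative sample: an unheard artist
            for item, target in ((i, 1.0), (j, 0.0)):
                u_vec, v_vec = U[u].copy(), V[item].copy()
                err = target - u_vec @ v_vec
                # SGD step; the reg term is the "penalizing vector lengths" part.
                U[u] += lr * (err * v_vec - reg * u_vec)
                V[item] += lr * (err * u_vec - reg * v_vec)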


Aren't you sort of glossing over the fact that he is in high up machine learning position at a company that specializes in recommender systems? Doesn't that by itself increase the likelihood that he deeply understands implicit and explicit matrix factorization?

I am a good ways through my master's (second CS degree, first specializing in ML), and the more I learn, the more I realize that on any given topic, there is no guarantee the PhD in the room has the most expertise. Machine learning is a broad field that contains many subfields, methodologies, and applications. It is a bit like computer systems or software engineering: nobody knows it all; people who are experts have intimate knowledge of a specific subset of the field. Of course, you can move around over time, but it takes years to build up expertise in even two or three subfields of machine learning.

Side note: sounds like we do similar work. I work at Vevo, also do a lot of matrix factorization to learn latent factors of items such as artists, videos, etc.


> Aren't you sort of glossing over the fact that he is in high up machine learning position at a company that specializes in recommender systems? Doesn't that by itself increase the likelihood that he deeply understands implicit and explicit matrix factorization?

Sure thing, but someone in that position needs years of experience in recommender systems, as well as working with researchers.

Folks are hanging on to the PhD part of my claim, instead of the "PhD or experience" part. The fact is, a PhD or equivalent prior industry work means the person has close to a decade of relevant background, grad degree or not. They will unstick a co-worker far faster than an experienced backend developer with, say, a year of Keras experience.

> Side note: sounds like we do similar work. I work at Vevo, also do a lot of matrix factorization to learn latent factors of items such as artists, videos, etc.

Seems like it! Email me if you'd like to chat some more offline (it's in my profile).


That has little to do with a PhD, it's the kind of thing you get with experience leading to a deeper understanding.

3D programming started as a field where only PhDs had any deep understanding of what was going on, simply because they had experience when nobody else did. You see this pattern repeated frequently, in any complex domain.


Yeah, I expected this reply.

The PhD is sufficient but not necessary here, right? A PhD researcher's job description is basically "learn necessary math, become a domain expert, and publish papers advancing that domain." It's difficult (but possible) to gain the same experience in industry if you don't have a graduate degree. Which company would pay you to work through Bishop or Goodfellow for a few months? Even a principal DS doesn't get that deal, much less a junior/associate.

Also remember: my comment addressed non-vanilla cases. In your example, this is the difference between a researcher advancing 3D programming and someone using Unity or Unreal.

(Also, sorry for all the edits. Done now!)


I would say a PhD is sufficient to advance the field. That's no small thing, but it only really overlaps at the start, when just about anything advances the field and you need a broad focus.

Machine learning for sorting peas at high speed is a very well-trodden area at this point, with a lot of industry-specific domain knowledge. I expect self-driving cars, for example, to reach a similar state in ~10-25 years.

The risk with a PhD is you miss the specific wave. But if you want to stay on the bleeding edge, it's probably well worth it.


> I would say a PhD is sufficient to advance the field.

Yep! We’ve now made our way back to my initial point in response to OP. :)


You can spend many months working through papers and books without a company paying you for that. That's something that I continually do and have always done, in my own time (and many different fields). Sufficient and not necessary indeed.


It's definitely easier to do when it's your primary job.


ali rahimi alludes to the problem of google engineers simply needing to tweak models that were previously tuned by google researchers who do have well-developed intuition [0]. because the intuitions in explicit form are at best heuristic and not necessarily even consistent, signing up to improve a model without them might result in spending indefinite time and compute resources without guarantee of positive results. which is a terrible perf-theoretic strategy...

[0] http://www.argmin.net/2018/01/25/optics/


Model divergence, nonsense predictions. The whole black art of ML (specifically neural nets) is coaxing them into working.

If you take some sophisticated deep neural net and try to train it on a binary classification where tails occurs 99% of the time - unless you specifically take measures to correct for this bias - the net will just learn to predict tails.
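One common correction (of several) is to reweight the loss so the rare class counts for more; a minimal sketch in Keras, assuming made-up data and a toy network:

    import numpy as np
    from tensorflow import keras

    X = np.random.rand(10000, 20)
    y = (np.random.rand(10000) < 0.01).astype(int)   # ~1% "heads", 99% "tails"

    model = keras.Sequential([
        keras.layers.Dense(32, activation="relu", input_shape=(20,)),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

    # Weight each class inversely to its frequency so the net can't win
    # by always predicting the majority class.
    weights = {0: 1.0, 1: (y == 0).sum() / max((y == 1).sum(), 1)}
    model.fit(X, y, epochs=5, class_weight=weights)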


Fairness and fighting adversarial examples come to mind.


Unless you work for a company obviously known for their ML the "expertise" out there right now is brutal. People are building recommendation engines without knowing the very, very, very basics like Jaccard indexes, ROC Curves, or topic drift. I've even had to explain type two error to someone working on one of these before.

I agree with your general thrust, and you're right, messy data is often 95% of the problem, but even going through just the Google courses will put people in the top 15% in most cities.


I took a machine learning graduate-level course from Andrew Ng himself, and I don't recall learning about Jaccard indexes or topic drift. Maybe your sense of what counts as "very, very, very basic" is skewed toward your own experience. There's a phenomenon known to psychologists where people tend to think that the stuff that they know is very easy and basic, so they conclude that anybody who doesn't know what they know must be uneducated. But then it turns out that the person you think is uneducated knows about a bunch of surprising stuff that you don't. I can't remember the term for this phenomenon, but I often remember it whenever I find myself beginning to judge another person's expertise. This phenomenon is also super relevant to the failings of most technical interviews, in my opinion.


There's a bit of snobbiness in different areas of tech, although there also are in different areas of academia and research. At the end of the day, the most successful people are the ones who wouldn't dismiss a DS who didn't know "Jaccard index" or "the Halting Problem".


Are you referring to the Curse of Knowledge?

https://en.wikipedia.org/wiki/Curse_of_knowledge

What you are describing also sounds a little like the Dunning-Kruger effect:

https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect


Oh, wish I knew the name of that phenomenon as well.


You probably can't communicate effectively. If you are describing "Type two error", of course you will get eyes glossing over. A huge problem with research fields is their terse, banal labels. Confusion matrix, anyone?


Or you can just say "false negative", and every CS major will understand you.

I find people in Math and CS have often very different names for the same type of concepts and they could easy understand each other if they stuck to the more common terms.

In this case, saying: TYPE 2 ERROR, makes you look like you are trying too hard.


It's also extremely confusing because very few people remember type 1 vs. type 2, but false positive/negative has an intuitive meaning.


type ii error is statistics, not mathematics. there is no equivalent concept in CS because type ii error relates specifically to statistical inference and hypothesis testing.

that said, if you are just pointing to a box in a confusion matrix and saying "TYPE II ERROR," you are probably trying too hard.


Eh, but if you've taken a machine learning course, you should have seen the notion of false positive/false negative when you cover any kind of classification technique.


but they're not actually equivalent, in spite of tables like this [0]. type ii error is a false negative result in the context of a test, where you have to understand which hypothesis is which and exactly what you are accepting or rejecting (hypotheses are not always as simple as hotdog/not-hotdog); if your listener doesn't know what statistical tests mean or wasn't following the setup, they have to stop you and ask.

[0] https://en.wikipedia.org/wiki/Type_I_and_type_II_errors#Tabl...


Granted, Type II error and confusion matrices are covered in more basic statistical classes, and are indeed important for hypothesis testing.


I think the point the parent might have been making is that many people (or maybe just me) know "type II error" by the far more self-explanatory name of "false negative".


What's a "type two" error?

I had to google it. It's a false negative.

A "Type 1" error, is a false positive.

Is this like how people overuse the term "orthogonal"?


"Type I" and "Type II" errors are some of the stupidest and most obfuscatory academic terminology ever invented, and (as an academic) I absolutely refuse to make the effort to learn which way round they go. Just call the bloody things what they are: false positives and false negatives. (Getting seriously OT now, but Kahneman does something annoyingly similar with his talk of "System 1" and "System 2" in Thinking Fast and Slow).


To a computer scientist/programmer, there are three numbers: 0, 1, and infinity. If you're going to index your errors by the natural numbers, and you've got Type 1 Errors and Type 2 Errors, my next question is what a Type 3 Error is, and you know what my next question after that is.

Otherwise, please take this wisdom from programmers, who deal with this sort of thing all the time, and use an enumeration, in this case, {False Positive, False Negative} will do just fine.
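In code, the suggestion amounts to something like this (a toy sketch, names mine):

    from enum import Enum

    class ClassificationError(Enum):
        FALSE_POSITIVE = "predicted positive, truth was negative"   # a.k.a. Type I
        FALSE_NEGATIVE = "predicted negative, truth was positive"   # a.k.a. Type II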


When you are designing a hypothesis test, the terms positive and negative are not so clear. For example, you can test whether the mean weight of bags is greater than 5.0 kg or smaller than 5.0 kg; both tests are different, and sometimes you can accept both greater and smaller than 5.0 kg. The philosophy of hypothesis testing is not as clear as a standard pregnancy test. In other terms, in some cases the H0 hypothesis is symmetric (>= versus <=) and it is not clear what a positive result should be; you have to state clearly what the H0 hypothesis is. In a pregnancy test, everyone agrees that H0 is that you are not pregnant. That is, in my humble opinion, the semantic difference between a Type 1 error and a false negative.


What are some synonyms for Kahneman's System 1 and System 2, then? Because Type 1 and 2 errors seem to be completely equivalent to false positives and negatives. I think Kahneman motivates his decision to introduce the terms System 1 and 2 quite well in his book, and I don't know of any direct counterparts.


Jonathan Haidt proposed a similar system in his book "The Happiness Hypothesis". He called it the automatic and controlled sides. The automatic side/system 1 is also what's being described in the book "The Inner Game of Tennis". I would summarize the two sides as the reflexive and the deliberate sides.


If the title of his book is justified, maybe the fast and slow systems?


Ahaha, I agree so much with the Kahneman jibe. How could someone so smart pick names so f*cking dumb?!


So, to put this in human terms.

A false positive or false negative, can be like a pregnancy test.

A false positive, can be where the pregnancy test shows your wife is pregnant, but she is not. And the baby never arrives. Phew, dodged a bullet!

A false negative, can be where the pregnancy test shows your wife is not pregnant, but she really is. And 9 months later, a baby accidentally pops out. Oh crap!


Of course false-positives can also be bad. To use your example "you then spend $10,000 prepping for the baby, but it never comes, what a waste!"


> Jaccard indexes

Funny you mention Jaccard; I was looking up if IoU (Intersection over Union) has any other name known to ML people when I was preparing my self-driving car presentation (IoU is used in semantic segmentation), and found out it is called Jaccard index as well. To my surprise, all ML experts I know knew about IoU but nobody about Jaccard. I guess it might depend on which university you attended?


i think the term is more prominent on the NLP side, via information retrieval (IR) and clustering. i first saw it in IR, and you'd see it in stanford's CS 124 or CS 224N, for example. if the parent is talking about people who are working on a system that has text-understanding component, i can understand their surprise.


For anyone else who wondered what the Jaccard index is, it's also referred to as Intersection over Union.

...and if you haven't come across that either, see https://en.wikipedia.org/wiki/Jaccard_index for details.
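Concretely, it's the size of the intersection divided by the size of the union; a quick illustration:

    def jaccard(a, b):
        a, b = set(a), set(b)
        return len(a & b) / len(a | b)   # intersection over union

    jaccard({"rock", "pop", "rap"}, {"rap", "jazz"})   # -> 0.25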


Was the project the person was working on having any success? If so, you might be unfairly ignoring their positive contributions and focusing only on this one negative sign that you observed.


My ML model is 99.7% sure you're a gatekeeper. Might be a type 1 error, though.


What's topic drift?


It seems to be a specialized term referring to the change in focus of blogs [0] and online communities [1] over time. This strikes me as a very specialized concept, rather than a generally-important term in machine learning as a whole.

Edit: To add my perspective, with years of industry experience and graduate-level machine learning coursework, I have never before encountered this term.

[0]: https://link.springer.com/chapter/10.1007/978-3-319-16354-3_...

[1]: http://catb.org/jargon/html/T/topic-drift.html


Also seems closely related to the concept of stationarity in time series analysis [https://en.wikipedia.org/wiki/Stationary_process]


maybe he meant concept drift? https://en.wikipedia.org/wiki/Concept_drift


I did, yes.


We have to separate AI researcher and implementation engineer. These types of crash courses help get you to the point where you can reasonably work under PhD level people and write code to test, scale, and deploy their ideas.

For many current applications of ML this is acceptable because you're just stealing an idea from a paper or stealing ImageNet to recognize your problem. For anything else you really need to pay up and fight with Google for a real expert.


Exactly. Somewhat akin to graphics programming. There are much smaller groups that work on actually building 3D graphics engines, however, many developers take those engines and use them to build successful applications and games.


So we need to wait for Unity of ML? With Asset Store selling models and datasets.


Will we reach a state where ML is as accessible for implementers as SQL databases? I still remember the time when databases were only for experts.


Yes hopefully. Take a look at BayesDB (and the underlying crosscat algorithm) and probabilistic programming.


>you can reasonably work under PhD level people and write code to test, scale, and deploy their ideas.

Which PhD, though? All PhDs are not equal (see politics vs. computer vision). Also, PhDs are hardly the holy grail of demonstrating capability, accuracy or intellect; especially given the reproducibility crisis, PhDs as a measure of any of those things should be used carefully.


They are talking about PhDs in Machine Learning of course


They really don't. There is a link to the Wikipedia page for matrix multiplication. If these are the people you want to hire, you might as well start outsourcing or generating random numbers


Gatekeeping is only obsolete when it ceases to have an impact. The reality right now is that ML is extremely hard to enter even for a very knowledgeable and deeply experienced but non-credentialed (by degree) person.

It will be interesting to see how the situation evolves, but my own observations are that people trying to enter the space might be better off getting a quickie master's, if they can afford the time or cost, than trying to bootstrap it.


Even getting a quickie master's is hit or miss in my experience. At the end of the day, successful machine learning engineers require a whole suite of different skills: technical, communicative, and even life skills that don't really exist for software devs. Not all of those can be taught in 3 months, 2 years or even 6 years.


> technical, communicative, and even life skills that don't really exist for software devs

Not a fan of this "data scientist is a unicorn" style of thinking. The best people in any profession (especially software engineering) also use these skills in their day-to-day work.


Data science isn't yet as stratified as software engineering, so there's less room for those without those "unicorn" skills. 10 years ago, there was no room at all. 10 years from now, there will probably be plenty of undergrads hired as junior data scientists.


Life skills? Communicative skills? What?


IMO, ML experts essentially don't work in a bubble and may interface with potentially anyone at a company: C-level, engineering, product, marketing, ops, etc. What other tech employee needs that flexibility? So I grouped communication / life skills into being able to understand, read, interpret and ultimately provide value to potentially any team. Just having the technical skills will only get you so far.


Isn't this part of what most software engineering degrees teach though? Particularly surrounding project planning and requirements gathering?


I might be an outlier, but I interface with all of those on a daily basis in my role as a software engineer


IMO software engineering experts (leaders) need to do the same thing.


I agree with this comment. My experience has been that people don't really look at your resume unless it shows machine learning experience or one of the stats-type majors.


> "you can't use AI/ML unless you have a PhD/5 years research experience"

This hasn't been true for a few years now. But the fact that you can use it doesn't mean you understand what is happening and why it works in development but not in production. Everybody can copy a Jupyter notebook and train a TensorFlow model on ImageNet. Now go to a new domain with very little data, like 3D models, and create a new network to be trained on that dataset. How many people who can train ImageNet can do the latter? Even within deep learning, experts in image classification fail in reinforcement learning domains and need a couple of years to become completely productive.


I fully agree with you that after a MOOC you've barely scratched the surface, and that until you're implementing these things yourself you're not going to jump into an ML job.

However, personally I view the rest the opposite way round. Getting through a course on deep learning takes months [0]. Then reading through Keras code, once you understand the appropriate NNs, is easy.

For example, it takes a while of working through neural networks to understand ResNets. But if you understand ResNets, then looking through Keras code that creates a ResNet [1] is easy.

If I want to build an NN of any sort in Keras I can just Google for it. However, there's no simple Googling you can do to teach yourself NNs in an easy-to-follow, structured way.

[0]: https://www.deeplearning.ai/

[1]: https://github.com/Hyperparticle/one-pixel-attack-keras/blob...
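To illustrate the point about reading Keras code once the concept clicks: a minimal residual block looks roughly like this (my own toy sketch, not the code at [1]). The whole trick is the skip connection adding the block's input back to its output.

    from tensorflow import keras

    def residual_block(x, filters):
        shortcut = x
        y = keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        y = keras.layers.Conv2D(filters, 3, padding="same")(y)
        y = keras.layers.Add()([y, shortcut])          # the skip connection
        return keras.layers.Activation("relu")(y)

    inputs = keras.Input(shape=(32, 32, 16))
    outputs = residual_block(inputs, 16)               # channel counts must match
    model = keras.Model(inputs, outputs)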


Understanding NNs is easy. Understanding, collecting, and cleaning up data is the hard part.

Also, DL != ML.

Paraphrasing "The Tao of Network Protocols": If all you see is DL, you see nothing.


There are a tremendous number of people outside of programming who spend much or all of their work time collecting, cleaning up, and understanding data. Think teachers, accountants, traders - essentially everyone who spends a lot of time in spreadsheets.


The parent was referring to Keras, which is an NN API, hence why I responded talking about NNs.


Isn't this meant to be an introduction? I'm not sure who comes out of a crash course assuming they're an expert.


> taking and passing a crash course doesn't mean you'll be an expert in the field (and this applies for most MOOCs, honestly)

You're stating the painfully obvious here. I doubt anyone reading HN is under the impression that they'll be an expert after a single online course.

This is just a marketing stunt by Google to ensure their tooling is the de facto standard for AI/ML, so that Google can dominate the AI/ML market the way they dominated Internet Search.


> taking and passing a crash course doesn't mean you'll be an expert in the field (and this applies for most MOOCs, honestly)

Any field that you can become an expert in with a 6-week course or less is not a field that should be paying even high 5-figure salaries. Or, conversely, any field which pays 6-figure salaries is either not accessible via an MOOC, or is massively overinflated and probably in a bubble.


The real barrier to entry to ML is statistics. Most computer science degrees require an intro to statistics class, but if you really want to understand ML and where it should and can be applied appropriately you need a much deeper understanding.

IMO, it's much easier to pick up the programming required for ML than the statistics. This was reflected in the classes I took as a double statistics/computer science major. Most of the people in my CS department's machine learning course were statistics students looking to go into data science, not computer programmers looking to get in on the ML trend.


> Yes, AI/ML MOOCs teach the corresponding tools well, and the creation of new tools like Keras make the field much more accessible. The obsolete gatekeeping by the AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience" is one of the things I really hate about the industry.

The problem is that having a hammer makes one see everything as a nail. Sure, given a suitably clean set of images, anyone who's done a couple of tutorials will be able to apply a pre-trained neural net to them and get something.

The hard part is getting an understanding of what tweaks to use when, and when to give up on a method. Otherwise, it is very easy to get carried away and waste time/resources.

For that, one needs to develop a good understanding of the landscape of ML algorithms, why each of them works and how they could break. That typically takes (intensive) experience or an understanding of the theory. Otherwise you'll be doing a brute-force search through a list of possible algorithms. As they say, "a few days in the lab might save a few hours in the library..."

Yes, things can get painful during hiring because the process is broken as it is, with additional complications due to not knowing how to vet for quality in a nascent field. But the "ML elite" are not morons and they don't mean to be obnoxious gatekeepers.


> The obsolete gatekeeping by the AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience" is one of the things I really hate about the industry.

So what are they hoping to achieve with this course? I'm genuinely asking, because part of me wants to take the course, but another part of me feels like: what's the point if, even after many additional courses to build up a skill set, Google wouldn't hire you as an ML engineer unless you basically restart your career as a junior engineer, but in machine learning, at another company?


I have a feeling that they're trying to get people familiar with TensorFlow and thus very compatible with their cloud computing services.


I dunno... for this level of ML, scikit-learn/numpy is way more accessible than TensorFlow.


Look at it from an interview perspective. If I ask "are you interested in exploring ML", and you're enthusiastic, my next questions are: What have you done? Have you taken any courses? GitHub? Blog posts?

If the answer is that you're waiting for a special sign that it's worth doing before making an effort, then that really tells me that your enthusiasm for doing ML is not reality-based. Doing the ML thing is a pretty different mindset from other software jobs.


No but you'll ready yourself for the (geometric) programming of the next twenty years or so.


The other day I met with someone who was visiting my city to attend a big ML conference. In the course of our discussion, it transpired this person did not know the Halting Problem. He'd "heard of" Turing machines, but nothing more than "hearing" of them.

Gatekeepers shouldn't keep gates just for gatekeeping sake. But if so-called ML experts don't even know undergraduate computer science, that should really give you pause before you open up your wallet for them.


I could have the same reverse worldview:

"I attended a big software dev conference. Someone I met did not know about data bias. They heard of gradient boosting but nothing more than hearing them. If so-called dev experts don't even know undergraduate statistics, that should really give you pause before you open up your wallet for them."


Maybe I'm an iconoclast, but I'd respect that person more for not trying to bullshit his way out of it.


That's a great point. It shines a positive light on the gentleman I spoke with, and a negative light on the industry as a whole (if bs is so rampant that merely admitting not knowing something makes someone shine)


Why does an ML-expert need to know the halting problem?

Considering that ML is really a CS-oriented form of statistics, why would you expect a statistician to know CS theory?


Thinking more, it's the misleading names ("machine learning", "AI") that rustle my jimmies so much.

Sure, you don't need to know the halting problem to approximately solve MNIST by fitting a million-parameter curve to a dataset.

But you're misleading people if you're claiming to have any kind of insight into how computers can be made intelligent, or how computers can "learn", when you don't even know the halting problem.


I disagree. Frankly, for a lot of people and a lot of contexts, I don't think the halting problem is particularly important. You're using understanding of it as a shibboleth for exposure to common curricula about theoretical computation. But you can even know a lot about practical computation and not know anything about the halting problem. Curious: has your knowledge of the halting problem ever actually saved you time or effort in your work? If so, how?

Turing's work on the limitations of his machine is interesting, and I'm sure people with a deep understanding of it can advance the study of computation.

I think you're just being dismissive of skillsets which aren't your own. I think you're just bothered by the fact that AI and ML are being advanced more by people with more knowledge of linear algebra and statistics than computer science. And realize that it's the arrogant among them that will dismiss you as "just a technician."

Anyone who is looking down on either "scientists" or "technicians" should get over themselves.


> Curious: has your knowledge of the halting problem ever actually saved you time or effort in your work? If so, how?

Not OP, but I'm working a lot with ontologies. Some ontology representations are undecidable, while other languages are not very expressive but can be manipulated in polynomial time. Had I not known that, I would still be like "crap, why does it take so long? I must have a bug somewhere, maybe I should switch to C".

> AI and ML are being advanced more by people with more knowledge of linear algebra and statistics than computer science.

Just answered OP about that, but actually, symbolic AI is pure computer science. It does not get as much publicity as ML currently, but believe me, it's everywhere: at the core of almost all package managers, like debian's apt-get or maven, at the core of most advanced static code analyzers, etc.


This is a recent shift led by the ML trend. Traditionally (like 5 years ago), ML and AI were two different things, AI being the term for symbol manipulation: expert systems, inference engines, constraint programming, SAT solving, for instance. These domains are typical CS stuff: inference, complexity classes, low-level representation of data, etc. You don't need that much knowledge of math/statistics to be proficient in those fields, but you'd better know what the halting problem is.

I'm working in the symbolic AI field, and sometimes use ML techniques. They are complementary. To me, ML is about induction, AI is about deduction. They don't solve the same kinds of problems and they tend to work pretty well together.


I guess "dynamic programming" must really bother you. That field was named completely arbitrarily, to secure funding.

The more you look around, the more you find science concepts are named for marketing purposes.

Heck, "data scientist" is a bit of nonsense.


Does this just come down to a semantic idea that if something isn't in pursuit of AGI, it's not really AI? That feels unfair to most of these researchers, who absolutely disagree with that.

And to consider these algorithms to not "learn" is similarly unfair. They do. They learn to solve specific problems (at least right now), but they do learn.


would you not expect your hypothesized (theoretical) ML expert to understand boosting, which is generally explained in terms of PAC learning, which draws on computational complexity?

that said, i'd also expect a phd in statistics to be able to figure out boosting without taking an undergrad course that worked up from automata. so the halting problem test, while it does capture something, may not be quite right.


Why? Most of that cruft is abstracted away, computation only gets cheaper over time (a world-class AI rig costs ~$30k, a decent one $2k), and most applications of ML run on commodity hardware.


For one thing, it suggests that they are actually technicians, not the scientists they're selling themselves as.

That's fine if you want a technician (and if they're charging technician's rates).


I think when it comes to ML the CS experts with limited statistics knowledge are the technicians and the statistics experts with limited CS knowledge are the scientists, not the other way around.


And then that technician rate is X times what a technician rate would be for pure software dev.. what is your point?


> But if so-called ML experts don't even know undergraduate computer science

To be fair, machine learning seems more closely related to applied mathematics (statistics/optimization) than to computer science.


How does knowing that it's impossible to predict whether an infinite loop exists in a piece of code yield an actionable piece of wisdom that this ML expert should have?

I'd suppose that most developers, formal education or not, would have encountered an infinite loop at some point in their initial work with iteration or recursion.

How does knowing that Turing proved you can't predict this bug in a piece of code change anything?

I might genuinely be missing something important here - not trying to be snarky in my questioning.

It seems like obviously infinite loops are a disastrous bug for critical code - but what does knowing the formal name of the problem and background of its discovery give you?

I could understand if you were arguing in favor of test code or static analysis.


"turing machines" are cs 101?


Amended that to "undergraduate computer science"


> The obsolete gatekeeping by the AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience" is one of the things I really hate about the industry.

It's the main reason why I decided to present a talk at the next PyCon Italy, as a very junior data scientist, to inspire other Python developers to learn some practical machine learning. If I could do it (and already use it for a work project), many other people can too (and no, I don't even have a degree in CS, just years of work experience).


>> The obsolete gatekeeping by the AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience" is one of the things I really hate about the industry.

I'm going to have to ask who exactly are those AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience".


TLDR: Taking a class is fine, but nothing beats real-world practice.

Wonder where I've heard this one before. :)


Great to see they have a nice introductory section on feature engineering! Feature engineering is often the most impactful thing you can do to improve the quality of models, and a place where I often see beginners (and experts, for that matter) get stuck. Google walks through how to work with JSON files and categorical variables: https://developers.google.com/machine-learning/crash-course/....

If anyone is looking to go more in depth, I work on an open source Python library for automated feature engineering called Featuretools: https://github.com/featuretools/featuretools/. It can help when your data is more complex, such as when it comprises multiple tables.

We have several demos you can run yourself to apply it to real datasets here: https://www.featuretools.com/demos.


Your comment got me interested in this course. However, all I could find about feature engineering there is what you linked to, directly.

Given that entire scientific careers, books, and conferences are built around the topic of feature engineering, and at least IMO good ML tools live or die with good feature engineering (in its broadest sense, for you deep learning fanatics :-)) that doesn't seem like more than the bare minimum I'd expect from any ML "crash-course" that is to be taken serious (and I wouldn't expect an ounce less from Google... :-)).

Am I missing something, maybe?

In any case, nice work of your own, and thanks for sharing it!


Ten seconds into the video on feature engineering, they say that feature engineering takes up about 75% of the time: https://developers.google.com/machine-learning/crash-course/...

They understand the value, but if you keep watching, they don't seem to go beyond the basics.


Although I'm normally skeptical of AI/ML courses, that section on feature engineering do's and don'ts is new and surprisingly under-discussed. It's very useful even outside of AI/ML.


I agree.

I expect that as companies increase their focus on finding practical applications of ML / AI, the topic will start to get more attention in these tutorials, as well as from researchers. Right now, too many people assume you already have a feature matrix, which is rarely the case when working on real world problems.


OTOH, automating feature engineering is a thing. There are papers on using unsupervised methods to do it.

The first-place solution in Kaggle's Porto Seguro competition trained an autoencoder on raw data to extract features.
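Roughly, that approach means training an autoencoder to reconstruct the raw features and then using the bottleneck layer's activations as the new feature matrix. A Keras sketch with stand-in data (not the actual winning solution):

    import numpy as np
    from tensorflow import keras

    X = np.random.rand(1000, 57).astype("float32")     # stand-in for the raw features

    inputs = keras.Input(shape=(57,))
    encoded = keras.layers.Dense(16, activation="relu")(inputs)   # bottleneck
    decoded = keras.layers.Dense(57)(encoded)

    autoencoder = keras.Model(inputs, decoded)
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.fit(X, X, epochs=10, verbose=0)         # learn to reconstruct the input

    encoder = keras.Model(inputs, encoded)
    features = encoder.predict(X)                       # extracted features for a downstream model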


How do you select the features created with Featuretools? The problem with automated feature engineering is that you end up with too many irrelevant features, and I haven't found a good guide on feature selection.


For those interested in a deeper dive into just deep learning, "Tensorflow and deep learning - without a PhD" is really good, and covers a lot of material in a single 2hr talk.

https://www.youtube.com/watch?v=vq2nnJ4g6N0


+1 for this. Well worth the 2 hours.


the deep nets and conv nets stuff was excellent. i wish the explanation of rnns was a little better.


This looks like a well put-together course, and a good way to learn TensorFlow. Keras and TensorFlow are top of my list of technologies to explore in the very near future.

Is anyone here doing Andrew Ng's Machine Learning course [1]? I'm about half-way through and really enjoying it. I'm particularly appreciating that the programming exercises are done in MatLab/Octave, so I feel that I'm really understanding the fundamentals without an API getting in the way, and developing some good intuition. Obviously frameworks are the way to go for production ML work, but I wonder whether ML people here think this bottom-up approach is advisable or could it be misleading when I move on to Keras/TensorFlow/whatever?

[1] https://www.coursera.org/learn/machine-learning

Edit: brevity


I teach ML and am currently writing my 2nd book on it.

I always advocate learning the fundamentals. Machine learning is math, and neural networks in particular rely on linear algebra and vector calculus. (You can build an NN without using linear algebra directly, but it'll likely be slower, and besides, the concept still relies on linear algebra.)

Frameworks abstract away a lot of the mathiness, which is a net good for society (ie, exposing lots of developers to neural networks), but I consider that a net-negative for the individual developer.

When working on anything but trivial toy problems, you should make sure you understand your problem domain and implementation thoroughly. Is the activation function you've chosen ideal for your problem domain? If not, choose a better one. If no better one exists, you can invent it; but you'll also need to know how to design the backpropagation algorithm for that new activation function (which requires some vector calculus).
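To make that concrete, here's a toy sketch (mine, with a made-up activation I'll call "swishish") of what designing backpropagation for a new activation amounts to: you need its derivative so the chain rule can pass gradients through it.

    import numpy as np

    def swishish(x):                       # hypothetical custom activation: x * sigmoid(x)
        return x / (1.0 + np.exp(-x))

    def swishish_grad(x):                  # its derivative, required for backprop
        s = 1.0 / (1.0 + np.exp(-x))
        return s + x * s * (1.0 - s)

    # One gradient step through a single unit y = swishish(w * x), squared-error loss.
    x, w, target, lr = 1.5, 0.3, 1.0, 0.1
    z = w * x
    y = swishish(z)
    dloss_dy = 2.0 * (y - target)
    w -= lr * dloss_dy * swishish_grad(z) * x   # chain rule through the activation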

Learning the math, as you have, helps you tune your algorithm based on actual knowledge rather than guesswork. I don't think it will be misleading when you move on to a framework. The frameworks are built on the same math.

That said -- if all you're looking to do is play around, then you don't need the math as much.


Thanks for taking the time to write such a comprehensive reply - much appreciated. "ML is maths" is something that I'm getting used to now. I do have some real uses in mind for what I'm learning, both in my job and in some side projects, particularly image feature recognition, and I'm looking forward to seeing how it all works out. Thanks again!


Image feature recognition is not quite solved but I feel it's very close. It's easier, obviously, if the problem domain is very specific.

In the past, like when I started on ML, the best tip was to make sure to do some edge detection with a few convolutions before feeding an image to a neural network. Now, we have convolutional neural networks that kinda do that for you automatically.

Sometime in between those two dates, someone figured out how to get the convolutions trained via backpropagation -- and they did that by deriving the gradient of an arbitrary convolution (or more likely, looking it up). And that let us put convolutions right in the neural net and have the convolutions automatically train themselves along with the rest of the network. And we observe that the convolutions do things that we would do, like remove unnecessary detail and highlight edges or exaggerate colors.
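For a sense of what those convolutions end up doing, here's a hand-rolled edge-detecting convolution in plain numpy (a toy sketch; real CNN layers compute the batched, multi-channel version of this and learn the kernel values themselves via the gradient mentioned above):

    import numpy as np

    sobel_x = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=float)      # classic vertical-edge kernel

    def convolve2d(img, kernel):
        kh, kw = kernel.shape
        out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
        return out

    img = np.zeros((8, 8))
    img[:, 4:] = 1.0                                    # an image with one vertical edge
    edges = convolve2d(img, sobel_x)                    # responds strongly along the edge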

Anyways; I believe the current state-of-the-art for generic image feature recognition is an ensemble of convolutional neural networks. I believe Google leads the pack on the commercial side so maybe look into how they do it.


not quite solved is the right way to put it.

If you look at the capsule network papers, you will realize that convnets are not very good at recognizing transformations (e.g. 3D rotations) of the same object. That's probably why so many training examples are required to make them work well.

Also, if you look at errors made by state-of-the-art models, some of them are obvious (to a human) objects classified as something entirely different and unrelated. Which leads me to believe that object recognition is not completely solved until a model has some kind of common sense, either built in or acquired during training.


Andrew Ng's older machine learning MOOC was excellent. I took it once, and took it again a few years later. In the last 8 months I also took his new deep learning set of courses. All really good stuff! (I have been working in the field since the 1980s, but constantly refreshing helps me. And Andrew's lectures are great fun; he is a teaching artist.)


I used to download all ML videos for offline ad hoc watching. But these crash course videos cannot be downloaded using 'youtube-dl'. Any recommendations?


Where are the videos anyway? I don't see any play button or any console to play the videos.


Use the time you spend trying to find how to download the videos actually watching them, doing the readings and practising the exercises.


i completed the andrew ng course recently, and felt that the difficulty dropped a lot in the second half of the course (for example, he stops giving homeworks). im hoping for more in his new DL courses


I've taken the CNN and RNN (parts 4 and 5) classes of his new DL specialization, and they're both about as rigorous as you'd want. I do have to give a warning, though, that the last class starts to show some confusing mistakes in the HW. For example, the expected output given is from an outdated HW version.


sigh. well i hope the support is good.


The choice of TensorFlow is a bit disappointing for a beginner-focused course which looks really solid otherwise. Business seems to have gotten priority over pedagogy in that case.

I see TensorFlow as the Angular of machine learning: first on the market, powerful but unwieldy. Like Angular, it will ultimately get superseded by tools with a nicer API (scikit-learn, Keras) or more versatility (PyTorch). Like Angular, it's probably not the best choice for a beginner to invest time into.


Add to that that TensorFlow was practically a latecomer, not the first to market.


I love the prework section: https://developers.google.com/machine-learning/crash-course/... It's a very good mix of topics and skills that I think everyone should learn, even if not directly planning to do ML or DL. If y'all are looking for a compact (and inexpensive) textbook on linear algebra that comes with all the prerequisites, you can check out https://gum.co/noBSLA (disclaimer: I wrote it).



As someone who just did this internally:

Do it. It's worth your time. Very well paced exercises, and it walks you through the flow quite nicely.


I went through the first couple of topics. It seemed very disjointed: different people presenting, different exercises. Was it like this internally? Or is this heavily "annotated"?


Unless you have already invested a lot of time into learning (and building on top of) TF, I would advise to pick up PyTorch. It’s much easier to learn and use (imperative!), and has higher performance on common workloads.


Except there aren't many good resources to learn it and the documentation isn't very good. Hopefully this will improve soon.


On the positive side with PyTorch you don’t need nearly as much documentation as you would with TF. TF in general feels like it’s fighting you every step of the way. There’s a lot of cognitive overhead. Not so with PyTorch. Everything is straightforward, and can be run/examined in ipython.


I like this move from Google. Sure, it is targeted at getting you to use TensorFlow, but more courseware and MOOCs help everyone. I love doing self-study, and TensorFlow's tutorials are top notch. Since I can also use TensorFlow on my own hardware and anywhere else, I really love better docs and MOOCs in general. What I really want to do is understand enough TensorFlow to reproduce other people's experiments from their papers on GitHub, and I think this would be one of the best ways to do that. Of course, this may eat into a bunch of companies that have paid programs for ML, but it's Google's prerogative to make ML cheaper and easier to deploy and learn, so I am all for that.


> We recommend that students meet the following prerequisites: Mastery of intro-level algebra. You should be comfortable with variables and coefficients, linear equations, graphs of functions, and histograms.

Any book suggestions to getting up to speed in this area?


If you're looking for books, have a look at Schaum's Outline of Precalculus [0]. Khan Academy [1] is also good and there's this MOOC on coursera called Data Science Math Skills [2].

[0] https://www.amazon.com/Schaums-Outline-Precalculus-3rd-Probl...

[1] https://www.khanacademy.org/math

[2] https://www.coursera.org/learn/datasciencemathskills


Khan Academy probably has this covered.


A shame Google doesn’t just link to the Khan Academy course.


Not true. They do have Khan Academy links where applicable, for example for the algebra ones. Check the link below: https://developers.google.com/machine-learning/crash-course/...


I always liked the Saxon books[1], since they involved so much spaced repetition if you did the problem sets that it beat the symbolic manipulation into your long-term memory.

[1] http://amzn.to/2FH3bXL


Shameless plug: Lambda School (YC S17) is also putting on a free Machine Learning crash course (we call it a mini bootcamp), followed by an optional 6-12 month course that you pay for once you get a job in data science (it’s free until then, and always free if you don’t get a job in ML).

https://lambdaschool.com/machine-learning-bootcamp/


is there a more fleshed out outline for what will be covered here? sounds interesting


As someone who is trying to learn ML, all the courses available are hugely helpful. One thing I wish I had easy access to is the process that someone goes through while trying to build a model on a real dataset.

Specifically following questions are the ones I struggle with:

1. How did you figure out what features would be useful?

2. How did you figure out what algorithm(s) are appropriate?

3. how and why did you massage the data in a specific way?


> How did you figure out what features would be useful?

There are various feature engineering and feature extraction techniques. Filter methods, wrapper methods, and embedded methods. Principal component analysis, autoencoding, variance analysis, linear discriminant analysis, Gini index, genetic algorithms, etc. -- the feature selection process will depend on the dataset, the problem domain, the analysis algorithm you ultimately use, etc.
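As one concrete instance of the above (my own example, not a prescription): a filter-method selection step followed by principal component analysis, in scikit-learn:

    from sklearn.datasets import load_wine
    from sklearn.decomposition import PCA
    from sklearn.feature_selection import SelectKBest, mutual_info_classif

    X, y = load_wine(return_X_y=True)
    X_filtered = SelectKBest(mutual_info_classif, k=8).fit_transform(X, y)   # filter method
    X_reduced = PCA(n_components=3).fit_transform(X_filtered)                # principal components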

> How did you figure out what algorithm(s) are appropriate?

Also depends on the problem domain. Discrete or continuous data? Categorical features, numeric features, features as bitmasks. Do you need a probabilistic outcome? Etc.

Generally you start with the easiest algorithms in your toolbox to see how viable they are. For a classification task I'll almost always start with a naive Bayes classifier (if the data allows) and/or a random forest and see how they perform. If the problem domain is highly non-linear you might start with a support vector or kernel method. Neural network is a last resort for me, as I find most classification problems can be solved to a high accuracy much more simply.
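A sketch of that "start simple" step, assuming a stand-in scikit-learn dataset:

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB

    X, y = load_breast_cancer(return_X_y=True)

    # Try the cheap baselines first; only reach for heavier models if these fall short.
    for model in (GaussianNB(), RandomForestClassifier(n_estimators=100)):
        scores = cross_val_score(model, X, y, cv=5)
        print(type(model).__name__, round(scores.mean(), 3))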

> how and why did you massage the data in a specific way?

This relates back to #1 -- you should only massage data based on what your feature engineering tells you to do. Sometimes you might want to remove outliers or clean up the training data, but only if the outliers really should be removed from consideration entirely.


Thanks for the response!

> There are various feature engineering and feature extraction techniques. Filter methods, wrapper methods, and embedded methods. Principle component analysis, autoencoding, variance analysis, linear discriminant analysis, Gini index, genetic algorithms, etc -- the feature selection process will depend on the dataset, the problem domain, the analysis algorithm you ultimately use, etc.

Obviously that's a big toolbox, and I'm sure it takes time to develop an intuitive understanding of all these techniques. What I hope for is some sort of guidebook on what to look for when I stumble across problems. So let's say you try out an algorithm and your accuracy (or whatever evaluation criterion you might have) is low. How do you figure out whether that's due to the algorithm, or due to (or due to the lack of) feature selection?

An analogy that might be useful is, when I see my database queries are slow, I can use EXPLAIN to guide what knobs to tune. Obviously it requires understanding what indexes are, what a full table scan is etc. etc. but the EXPLAIN plan provides a guidebook of sorts.


Every problem is different, so the only advice I can give is: research research research! Do the hard work up-front; figure out how to describe your problem in a mathematical sense, and identify the right tools to use for the shape of your input, output and problem dimensions. What's the distribution of each dimension. Are the relationships linear, nonlinear, clustered, dispersed, logarithmic, etc. Once you know those things, you're able to narrow in on the right tools and analyses to use.


If you are willing to do the work, Frank Harrell's Regression Modeling Strategies is a pretty good introduction to a lot of this.

It's written for a very different set of problems than typical ML, but it has lots of really good advice for practical problems in data analysis and prediction (which is another term for ML).

Mostly people learn this stuff by experience. Find a dataset, choose a predictor, filter, clean and massage your data till you get better metrics/understanding (preferably both). Rinse, repeat on many different datasets and problems, and you'll know how to do this.


Georgia Tech has a graduate course on Machine Learning, CS-7641. There are four major projects in that course where the students must analyze (and re-analyze) a chosen dataset. Here is an example of the code one student used: https://github.com/JonathanTay/CS-7641-assignment-1 Unfortunately, all the plotting code was intentionally removed. Sometimes the project reports make it online (http://www.dudonwai.com/docs/gt-omscs-cs7641-a3.pdf?pdf=gt-o...). Having spent several months of my life on the assignments, I'd say that the only way to learn it is to try a whole bunch of different things and try to figure out why some work and why some don't. Sometimes you learn from the failures, sometimes from the unexpected successes.


Take a look at some of the highly rated kernels on Kaggle - they’re often well annotated with the types of things you’re looking for, including actual experimentation to test ideas.

Edit: fix autocorrect


I've compiled this into a Todoist template you can import - it's got links to each module + times.

To preview: http://todotemplates.com/posts/HRtYanEq8zMgRL5fz/google-ml-c...

To import directly: https://todoist.com/importFromTemplate?t_url=https%3A%2F%2Fd...


Cool tutorial, but I'm not entirely sure what makes this ML -- aside from neural nets, this is more or less the material you'd encounter in a basic applied statistics or regression analysis course, minus material on estimating uncertainty, modeling survival or time-series data, and causal inference. I suspect you'd benefit more from a 50 minute tutorial on those than neural nets.


I want to ask people who know ML well if the hype is warranted?

Billions of courses, web sites, job applications and HN posts. The subject seem to have taken off massively in the last two years. I mean image and speech recognition is pretty cool (when it works!), but hardly that earth shattering, is it?


Deep nets are deservedly big because they've managed to improve upon most of the decades-old state-of-the-art methods in the world of signal processing (DSP): voice, image, video, game play, and a significant amount of natural language. No other single computational/algorithmic method has achieved so much in so many domains, ever. That's revolutionary.

The rate of advance using deep nets in signal processing will likely slow down now, but they aren't going away, not in the foreseeable future.

The hype around DNNs arose when we took our unbridled enthusiasm for what they've achieved in DSP and extended it to other domains where the data is less 'dense' and thus isn't as amenable to fast de/convolution in N-D space or time.

Will DNNs revolutionize or introduce all the techniques needed to achieve AGI/Strong AI? I very much doubt it. As yet, there's little sign that DNNs can perform relational operations on interdependent symbols, like the transforms available via type theory, Bayesian nets, or predicate logic.

The multitude of disparate facts and semantics in a rich knowledgebase can't be organized into dense matrices the way that continuous signals can, so the SIMD operations that are so effective in DSP won't implement the rich transformations needed in a relational fact-based knowledge space equally as well, if at all. Thus DNNs almost surely aren't going to take us to the heights of logical or compositional thinking that human level intelligence requires.

But how far up relational mountain will DNNs take us? I suspect that won't be known for a decade or longer. But even if we don't reach the summit, we'll be considerably closer than we were before.


This is a really fantastic and interesting look at ML for someone who's just taking their first steps. Any recommendations on where else I can read about DNNs and associated techniques (GANs, RL, etc.) in terms of what they'll likely not be capable of in the near to mid term?


just commenting to thank you for a very accessible assessment of a simple question "is the hype warranted". please do more of these on HN (or elsewhere!)


The hype is and isn't warranted.

ML is a much broader field than just neural networks. The hype for ML in general, I think, is warranted. We hit an inflection point when AWS launched and scalable processing power became cheap. It became cheap to process tons of data and generate insights. I don't have hard numbers on this, but probably 90-95% of machine learning used in practice is NOT neural networks, and it has accuracies in the 90%+ arena. So ML in general -- sure, hype warranted.

Neural networks are the new hot topic, and the hype isn't fully warranted yet. TensorFlow made them very popular in the developer community; this is a good thing because it's spurring more investment and research in ANNs. But for any given problem, odds are that a neural network is not the best (ie, most accurate or cheapest) way to solve it. Neural networks do have specific problem domains where they are the state of the art, but for most other problem domains there exists a better solution. So I'd say that neural networks are a little over-hyped right now, but with a new generation of developers learning about and experimenting with ANNs, that will change in a few years. I think we're about to see an explosion of ANN usefulness over the next few years.

TLDR: ML is very useful but is more than neural networks; neural networks need a little more progress to catch up to the TensorFlow hype.
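
To make the "a neural net often isn't the best first tool" point above a bit more concrete, here is a minimal, hedged scikit-learn sketch comparing a plain logistic-regression baseline against a small neural network on a toy dataset (the numbers prove nothing by themselves; the point is to always run the cheap baseline first):

  from sklearn.datasets import load_digits
  from sklearn.model_selection import cross_val_score
  from sklearn.pipeline import make_pipeline
  from sklearn.preprocessing import StandardScaler
  from sklearn.linear_model import LogisticRegression
  from sklearn.neural_network import MLPClassifier

  X, y = load_digits(return_X_y=True)

  baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000))
  small_net = make_pipeline(StandardScaler(),
                            MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                                          random_state=0))

  # Compare the cheap baseline against the neural net before reaching for anything deeper.
  print("logistic regression:", cross_val_score(baseline, X, y, cv=5).mean())
  print("small neural net:  ", cross_val_score(small_net, X, y, cv=5).mean())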


Computers are automation tools that increase human efficiency by doing the grunt work for you - but they are limited to automating the tasks that can be captured as a set of rules in code. When we figure a new way to model more complex tasks in code, a whole new set of things can be automated.

Here's a concrete example: Before spreadsheets existed, there used to be legions of accountants who created complex ledgers on paper and added up all numbers to track how a business was doing. You'd literally mail off your sales numbers to an accounting team somewhere and wait three days to get the latest report generated and sent back. Sure they had calculators to add numbers, but the computers of the day didn't understand how those numbers related to each other. The human still had to do most of the work to create the reports.

The big idea of spreadsheets was to make the computer manage the more complex task of knowing how different numbers in a report related to each other. It made most ledger tasks totally automatic once the initial report was defined. Now a single accountant could do the work of the entire accounting team - and more accurately and in less time! There were stories of the first spreadsheet testers having to delay mailing back their financial reports by a few days because their clients would be suspicious if they mailed them back too fast.

Nearly overnight, accounting got a lot more efficient and companies made more money. "What if" modeling that used to be too slow and cost-prohibitive to do became quick and easy. Companies could plan more intelligently. The spreadsheet was a true game changer.

This same pattern happens every time the bar is raised on the complexity of what can be automated, and Machine Learning raises the bar one giant notch. Previously we were limited to automating tasks that a smart coder could describe as discrete steps in code. But with ML, the computer can figure out its own rules just by looking at data. That means in many cases you can solve very hard problems just by collecting a lot of data. Lots and lots of things that used to be done by large groups of people will now be able to be done with a single computer.

In that sense, ML is a total game changer. Don't focus on the specific applications thus far. Focus on the idea that all kinds of tasks that used to require humans can now be automated with a little bit of applied ML. The opportunities are literally everywhere.

In a few years, ML won't be some esoteric technique used by a few people. It will be a core skill that everyone uses or touches in some way. It's going to creep into everything everywhere because it's just so darn useful.


I've been reading about it "getting popular during the last two years" for at least 6 years.


I've been itching to learn a bit about the industry and to be able to create & train ML models myself; I'm glad Google decided to put out a course where I wouldn't have to worry about the quality of instruction.


On a side note, can someone talk about what kind of tools are being used to integrate the subtitles and scrolling behavior over a YouTube video in this course? Is there an open-source implementation?


Thanks Google! Now I know that I am an ML guy, as an economist and econometrician. Yes, we apply this to all kinds of stuff, though with clear business acumen or economic-policy thinking.


I have a new project at work: I need to take in a free form text of recipe ingredients (e.g. "1/2 cup diced onions", "two potatoes, cut into 1-inch cubes", etc.) and build a program that identifies the ingredient (e.g. onion, potato), as well as the quantity (e.g. 0.5 cup, 2.0 units). Would machine learning be an applicable approach to solving this? Right now I'm just planning on using an NLP library to parse out the various parts of the ingredient text.


I did the same a while back, and I suggest using an NLP library to extract parts of speech and parse trees, then building a quick-and-dirty solution. In my case the strong solution (a week+ of work) wasn't much better than the hacky manual one based on specific keywords like "teaspoon" plus parts of speech/parse trees (a few hours).


It's not very sexy, but I think you might find it easier and more robust just to use an NLP library.

I built something similar (albeit for a relatively limited database of recipes) for a hackathon a couple of weeks back. I didn't even use a proper NLP library, just some simple hand-rolled pattern-matching, and got pretty good results.

Good luck!


I think you're right. Did you happen to open-source your code from the hackathon? I'd love to take a look at your approach if you don't mind.


Sorry, I normally would but one of the other team members is considering taking the hack forward and wanted to keep it closed for now. (It's hard to see how much competitive advantage he'd have from 48 hours of very-hacked-together code, but so few hackathon projects get taken forward that I didn't want to discourage him!)

The approach was to tokenize the input and then do basic pattern-matching on it, with separate dictionaries of quantity units (e.g. cup, oz, pound), ingredients, processing words (e.g. "chopped") and throw-away words (e.g. "of"). In fact, possibly the most complicated part was parsing "2.5", "2 and a half" and "2½" all to the same thing.
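
In that spirit, here is a minimal, hedged Python sketch of the tokenize-plus-dictionaries approach (the word lists are tiny placeholders, and real recipe text needs far more cases -- e.g. number words like "two", mixed numbers like "1 1/2" and "2 and a half" aren't handled here):

  import re
  from fractions import Fraction

  UNITS = {"cup", "cups", "oz", "pound", "pounds", "teaspoon", "teaspoons", "tablespoon"}
  PROCESSING = {"diced", "chopped", "minced", "cubed", "sliced"}
  THROWAWAY = {"of", "and", "a", "into"}
  UNICODE_FRACTIONS = {"½": 0.5, "¼": 0.25, "¾": 0.75}

  def parse_quantity(token):
      # Map '2', '2.5', '1/2', or '2½' to a float; return None for non-quantities.
      if token in UNICODE_FRACTIONS:
          return UNICODE_FRACTIONS[token]
      m = re.fullmatch(r"(\d+)([½¼¾])", token)
      if m:
          return int(m.group(1)) + UNICODE_FRACTIONS[m.group(2)]
      try:
          return float(Fraction(token))
      except (ValueError, ZeroDivisionError):
          return None

  def parse_ingredient(text):
      qty, unit, words = None, None, []
      for token in re.findall(r"[\d½¼¾/.]+|[a-zA-Z]+", text.lower()):
          q = parse_quantity(token)
          if q is not None and qty is None:
              qty = q
          elif token in UNITS:
              unit = token
          elif token not in PROCESSING and token not in THROWAWAY:
              words.append(token)
      return {"quantity": qty, "unit": unit, "ingredient": " ".join(words)}

  print(parse_ingredient("1/2 cup diced onions"))
  # -> {'quantity': 0.5, 'unit': 'cup', 'ingredient': 'onions'}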


Whether you end up using a machine learning approach or hand-crafting the solution, I recommend you work in an ML-like manner, dividing up the data you have into test and training sets and using cross-validation to evaluate your work.

For your actual question: yes, as others have said it might be just an NLP/regexp problem. Otherwise, you could look at ingredient identification as a classification problem. I recommend checking out FastText and NLTK, and familiarizing yourself with the word dictionaries and pre-trained vectors that are available; these tools might help generalize your work beyond the data you have at hand.

(E.g. if it works well on your data using pre-trained word vectors from Wikipedia, chances are it might work on examples you don't even have.)
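
A minimal sketch of that classification framing plus the cross-validation advice above, using scikit-learn in place of FastText and a tiny made-up training set (real use would need many labelled lines and probably the pre-trained vectors mentioned above):

  from sklearn.feature_extraction.text import TfidfVectorizer
  from sklearn.linear_model import LogisticRegression
  from sklearn.model_selection import cross_val_score
  from sklearn.pipeline import make_pipeline

  # Tiny, made-up labelled examples; a real project would label far more lines.
  texts = [
      "1/2 cup diced onions", "one large onion, sliced", "2 red onions, quartered",
      "two potatoes, cut into 1-inch cubes", "3 russet potatoes, peeled", "1 lb baby potatoes",
      "1 cup shredded carrots", "two carrots, julienned", "a handful of baby carrots",
  ]
  labels = ["onion", "onion", "onion",
            "potato", "potato", "potato",
            "carrot", "carrot", "carrot"]

  model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                        LogisticRegression(max_iter=1000))

  # Cross-validation keeps you honest about generalizing beyond the exact training lines.
  print("cv accuracy:", cross_val_score(model, texts, labels, cv=3).mean())

  model.fit(texts, labels)
  print(model.predict(["half a cup of chopped onion"]))  # predicted ingredient label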



This is an NLP problem if all you're trying to do is extract nouns.


The ulterior motive behind this is to increase usage of Google Cloud (according to this answer on Quora: https://www.quora.com/Why-did-Google-release-their-machine-l...)


Cool! Does anybody also know a good blockchain crash course of similar kind so one could grok all the major buzzwords of today?


I loved this one a lot: https://anders.com/blockchain/


This is what got me a proper intuition for why a blockchain can be useful.


They do have prerequisite training material with reference links: https://developers.google.com/machine-learning/crash-course/...


While I like the idea, in principle, that you don't need a CS education to use AI/ML, I doubt it. Here's a problem that cropped up today: our instance ran out of hard drive space on a training set of ~400,000 images. The individual images were only 375 GB in total, but took up 1.5 TB when converted to NumPy matrices. Why? The arrays were converted to standard int arrays (32-bit x 3 channels) when they could've fit into unsigned 8-bit integers (8-bit x 3 channels). Each image was 4x as large as it needed to be.
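
For anyone following along, a minimal NumPy sketch of the dtype issue described above (synthetic image size, not the actual dataset):

  import numpy as np

  # One hypothetical 1000x1000 RGB image.
  img_uint8 = np.zeros((1000, 1000, 3), dtype=np.uint8)  # 8 bits per channel
  img_int32 = img_uint8.astype(np.int32)                 # what an accidental upcast costs

  print(img_uint8.nbytes / 1e6, "MB")  # 3.0 MB
  print(img_int32.nbytes / 1e6, "MB")  # 12.0 MB -- 4x larger, identical pixel values

  # Across ~400,000 images, that 4x factor is the difference between
  # a few hundred GB and well over a terabyte.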

You can certainly use high-level ML tools (like Keras), but it takes a great deal of work to wrangle your data into a usable format, and even more knowledge to debug an ineffective network.


I wonder if they are doing this to compete with course.fast.ai


IMO teaching people ML is good in general for Google: 1) it spreads the use of TensorFlow; 2) it increases not only TensorFlow's but also Google's mindshare; 3) it trains people who may become future Google employees, and/or serves as a useful resource for existing employees.


Also 4) it will increase the usage of TPUs on Google Cloud Platform and, subsequently, the revenue of their cloud offerings.


This has been an internal course at Google for many years.


I think that helps to prove the point.


The Google AI courses are also good for old seasoned ML practitioners who want to learn more about more recent deep learning techniques


Having done this course 6 months ago, I can say it's a fantastic introduction to the major concerns of practical machine learning.


Can't view. Requires a Google account.


Have you tried to create one?


How does this compare to the CS229 lectures by Andrew Ng? (the recorded lectures, not the MOOC)


CS229 @ Stanford is very math/proof intensive. This might be somewhat similar to CS221 or the Coursera course.


Serving videos from YouTube without alternatives doesn't make it accessible. Proxying for identified videos, especially the videos from this crash course, would be useful.


>Serving videos from YouTube without alternatives doesn't make it accessible.

Why is that?


This site is so broken on Firefox mobile that they should be embarrassed.


In the course, the lecture "Reducing Loss: Gradient Descent" says:

"Convex problems have only one minimum; that is, only one place where the slope is exactly 0. That minimum is where the loss function converges."

The first sentence is flatly wrong. E.g., for a positive integer n and the set of real numbers R, take the function f: R^n --> R where f(x) = 0 for all x in R^n. Then f is convex, concave, and linear, and every x in R^n is both a minimum and a maximum of f.

Can there be uncountably infinitely many alternative minima for the Google ML problems? Yes, e.g., just enter one of the independent variables twice.

The second sentence is nonsense.

Grotesque, outrageous incompetence!!!!

It has long been known that minimizing a convex function, even a differentiable convex function, with just gradient descent can be just horribly inefficient. A LOT is known about how to do much better than just gradient descent. E.g., there is Newton iteration (right, that Newton, hundreds of years ago) and quasi-Newton. And there's more.

Why so inefficient? Well, draw a picture like Google did except use just two independent variables instead of just the one in the Google picture. Then see that the resulting, convex "bowl" can be like a long, narrow boat with a very gentle slope in one direction and a very steep slope in an orthogonal direction. Yes the cross section of the bowl can be, first cut, an ellipse with one short axis and one long one. Sure, the axes are eigenvectors, etc. and the ellipse is part of a local quadratic approximation. Well, gradient descent keeps going back and forth nearly parallel to the short axis of the ellipse and making nearly no progress on the long axis. People have known this and known good things to do about it for, uh, at least half a century.

For the Google ML problems, one might (1) tweak Newton iteration to improve the rate of convergence to a minimum, or (2) at each iteration, instead of taking a gradient step, get a supporting hyperplane of the epigraph of the convex function; as the iterations proceed, accumulate these hyperplanes, notice that they lead to an approximation of the full epigraph, and use linear programming (or some tweak of it) to minimize the hyperplane approximation of the convex function -- there is much more that can be said here. By the way, the convex function for that ML problem is quite special, e.g., quadratic.
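
Here is a minimal NumPy sketch of that ill-conditioning point on a toy quadratic, one gentle and one steep direction (only an illustration, not anything from the Google course):

  import numpy as np

  # f(x) = 0.5 * x^T A x, with condition number 1000 (gentle axis 1, steep axis 1000).
  A = np.diag([1.0, 1000.0])
  grad = lambda x: A @ x

  x = np.array([1.0, 1.0])
  step = 1.0 / 1000.0  # a fixed step sized for the steep direction
  for _ in range(1000):
      x = x - step * grad(x)
  print("gradient descent, 1000 steps:", x)  # roughly [0.37, 0.0]: still far along the gentle axis

  # Newton's method uses the curvature (here, A itself) and hits the minimum in one step,
  # since the objective is exactly quadratic.
  x = np.array([1.0, 1.0])
  x = x - np.linalg.solve(A, grad(x))
  print("Newton, 1 step:", x)  # exactly [0.0, 0.0]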

A lot has long been known and well polished in regression and classification, e.g., with TeX markup:

N.\ R.\ Draper and H.\ Smith, {\it Applied Regression Analysis,\/} John Wiley and Sons, New York, 1968.\ \

Leo Breiman, Jerome H.\ Friedman, Richard A.\ Olshen, Charles J.\ Stone, {\it Classification and Regression Trees,\/} ISBN 0-534-98054-6, Wadsworth \& Brooks/Cole, Pacific Grove, California, 1984.\ \

C.\ Radhakrishna Rao, {\it Linear Statistical Inference and Its Applications:\ \ Second Edition,\/} ISBN 0-471-70823-2, John Wiley and Sons, New York, 1967.\ \

A good start on convexity is

Wendell H.\ Fleming, {\it Functions of Several Variables,\/} Addison-Wesley, Reading, Massachusetts, 1965.\ \

Total, fun, dessert ice cream on convexity is Jensen's inequality; right away can use it to prove a lot of classic inequalities.
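
For reference, the finite form of that inequality, in the same TeX spirit as the citations above (a standard statement quoted from memory, not from any one of those texts):

  % For convex f: R^n --> R, points x_1, ..., x_k, and weights a_i >= 0 with sum_i a_i = 1:
  f\Bigl( \sum_{i=1}^{k} a_i x_i \Bigr) \le \sum_{i=1}^{k} a_i f(x_i)
  % The two-point case k = 2 is exactly the usual defining inequality of convexity;
  % taking f = -\log with equal weights gives the AM-GM inequality as one of those classics.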

Gee, look on the upside!!!! From this sample, the claims of machine learning (ML) revolutionizing the economy are nonsense!!!! And for startups, don't much have to worry about serious competition from Google!!!!


i would request that you stop critiquing "machine learning" based on the presentation in introductory online materials like this and the ng coursera course. you provide a lot of signal in general but i think these critiques do decrease your SNR.

i am certain that you are familiar with the "usual" statistics sequence. (for others: there are lower-division courses that use calculus in a few places but otherwise avoid it, focusing instead on memorizing procedures. there is the upper-division probability/math stat sequence that uses calculus heavily but avoids analysis. and there is an intro phd sequence that finally gets into measure theory.) if you look at a coursera course that gives a high-level overview of practical statistics and simplifies its presentation to be accessible to people who never took calculus, you can criticize the very idea of a course that does not explain the measure-theoretic issues, but it makes no sense to use it to criticize the field of statistics, or to criticize the competence of others in the institution that produced it.

here, google is producing introductory training materials for developers. many developers have never taken calculus, let alone optimization, statistics, or analysis, and when i took MLCC internally, you were supposed to go through this whole thing (lectures and coding) in two days. it's supposed to give you enough understanding of the concepts to understand the API and apply it.


If you find something wrong mathematically or otherwise with something I write, then by all means let me know. So far you have found nothing. Details:

The Google statement I quoted was flatly wrong. It is really important for students to be told that.

I gave some references to more in statistics.

> the measure-theoretic issues

I didn't mention measure theory, and the statistics references I gave don't mention measure theory either. I referenced Fleming only as background on convexity, and there is no measure theory in that part of Fleming.

You mentioned the role of calculus for the math of regression: That use of calculus, for deriving the normal equations, is neither necessary nor, really, sufficient. There is another derivation, nicer, fully detailed mathematically, with no calculus at all. The core of the idea is that the minimization of the squared error has to be an orthogonal projection and, then, presto, bingo, get the normal equations. And there are more advantages to that derivation. To keep my post simple, I omitted that derivation, but a good treatment of regression without calculus could use it.
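
A sketch of that calculus-free route in TeX form (standard least-squares notation, not quoted from any of the references):

  % Least squares: choose b to minimize || y - X b ||^2.
  % The set { X b : b in R^p } is the column space of X, and the closest point of a
  % subspace to y is the orthogonal projection of y onto it; so at the minimizer the
  % residual y - X b must be perpendicular to every column of X:
  X^T ( y - X b ) = 0
  % which rearranges directly to the normal equations:
  X^T X \, b = X^T y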

I omitted the standard definition of convexity, but apparently in this discussion we need that. By omitting the definition, the Google material was not so good.

Definition (convex function): For the set of real numbers R and a positive integer n, a function f: R^n --> R is convex provided for any u, v in R^n and any a in [0,1] we have

f(au + (1-a)v) <= af(u) + (1-a)f(v)

So, for a picture, on the graph of f(x), we have two points (maybe not distinct) (u, f(u)) and (v, f(v)). Then we draw the line between these two points. The number a determines where we are on that line. With a = 0, we are at point (v, f(v)). With a = 1, we are at point (u, f(u)). Then as we move a from 0 to 1, we move along that line. We also have on the graph the point, say, P

(au + (1 - a)v, f( au + (1-a)v ) )

And on the line we drew, we have point, say, Q

(au + (1 - a)v, af(u) + (1-a)f(v))

Well, we are asking that point Q be the same as point P or directly above point P. That is, the line we drew is on or above the graph of (x, f(x)). The line is sometimes called a secant line and is said to overestimate the function.

Definition (concave): The function -f is concave if and only if the function f is convex.

As in Fleming, a convex function is continuous. Intuitively, the proof is based on two cones, and as we approach a point we get herded between the two cones. The cones are from the convexity assumption. Draw a picture.

IIRC, there is a result in Rockafellar that a convex function is differentiable almost everywhere with respect to Lebesgue measure, but this is the only connection I would make between convexity and measure theory.

For convex f, the set of all (x, y) where y >= f(x) is the epigraph of f, that is, the region on or above the graph of f.

Definition (convex set): A subset C of R^n is convex provided for any u, v in C and any a in [0,1] the point

au + (1-a)v

is also in set C.

Well, as in Fleming, the epigraph is convex. It is also closed in the usual topology of R^n.

Definition (closed set): A subset C of R^n is closed (in the usual topology of R^n) provided for any sequence x_n, n = 1, 2, ... in C that converges to y in R^n, y is also in C.

In particular, if we define the boundary of C, set C contains its boundary. So, the interval [0,1] is closed and the interval (0,1) is not closed.

Well, for any closed convex set C and point x on its boundary, there exists a hyperplane that passes through point x such that set C is a subset of the closed half space on one side of the hyperplane.

Such a hyperplane is said to be supporting for set C at point x.

Intuitively, in R^3, push convex set C to be in contact with a wall. Suppose point x on the boundary of set C is in contact with the wall. Then the wall is a supporting plane for set C at x, and set C is a subset of the room side of the wall.

Or think of a big, solid, irregular piece of cheese and John Belushi as his Samurai Tailor swinging his sword: John keeps swinging his sword in arcs that are in flat planes and cuts down the irregular cheese to a convex hunk. So, the convex C has been determined (formed) from the supporting hyperplanes from Belushi's sword.

For another way to make a convex set, take a piece of wood and press it against a belt sander until the boundary consists of only flat sides.

Let a solid rock roll around in a stream of water for a few thousand years, and you may end up with a smooth, shiny, convex rock.

Faster: a chicken egg is convex.

In general, a closed convex set is the intersection of its supporting hyperplanes.

Then, we can approximate a convex set with some of its supporting hyperplanes. In particular, we can approximate a convex function with some supporting hyperplanes of its epigraph. At times, this can be useful -- it's the main idea behind Lagrangian relaxation in constrained optimization (I used that once).

In particular, the epigraph of a convex function is the intersection of the supporting hyperplanes. In that case, a supporting hyperplane is called a subgradient. The function is differentiable at the point of contact if and only if the subgradient is unique. If the subgradient is unique, then it is just the tangent hyperplane from the gradient of the function.
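
Stated in the usual vector form (a standard restatement of the supporting-hyperplane picture, not a quote from Fleming or Rockafellar): g is a subgradient of the convex function f at x provided

  f(y) \ge f(x) + g^T (y - x) \quad \text{for all } y \in R^n
  % The corresponding supporting hyperplane of the epigraph at (x, f(x)) is the graph of
  % the affine function y |--> f(x) + g^T (y - x); it is unique exactly when f is
  % differentiable at x, and then g is the gradient.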

In R^3, a cube is a convex set. Each of its sides is part of a supporting hyperplane. For a point on the boundary of the cube that is not on an edge, the supporting hyperplane at that point is unique. The corners and edges of the cube also have supporting hyperplanes, but they are not unique.

In R^3 a sphere is convex. Then it is also the intersection of its supporting hyperplanes. At each point on the boundary of the sphere, the supporting hyperplane is unique.

Similarly for epigraphs.

So, the function f: R*n --> R where for each x f(x) = 0 is convex, concave, and linear, and each x is both a maximum and a minimum of f. So, the minimum of a convex function need not be unique. In this case, the epigraph is just a closed half space.

"Look, Ma! No calculus!" And no measure theory.

Exercise: Derive the regression normal equations via perpendicular projections and without calculus.

Exercise: Argue the role of perpendicular projections in the minimization in regression.


my only goal in mentioning the statistics sequence at all was to give a familiar example where the standard sequences vary in depth depending on audience. a trivial point, yes, but i wanted to be concrete because it's the internet. apparently that was a terrible choice, as it was far too close to the topic at hand; my apologies for making you search so hard for a connection.

i made my request because it's jarring for me as a reader when you punctuate your (often delightful) expository writing with conclusions about entire fields and large organizations that seem (on the face of it) to be justified by old and/or very limited data.

but that's a selfish request, and you are of course free to tell me to get lost and post whatever you want (and i'll still read it); i'm certainly not going to pursue this further, aside from the apology and clarifying comment above.


You do understand that for all the talk and new terminology and claims of "learning" in the "machine learning" (ML) in the Google OP, what is in the OP is a poor introduction to some highly polished material in "regression analysis" in 50 year old books. So, the ML stuff is adulterated old wine in new bottles with new labels. That is essentially intellectual theft and corruption, and without references essentially academic plagiarism. You should be offended.

If they are going to plagiarize, even just teach, regression analysis, then at least they shouldn't make a mess out of it, and a mess is what they made. Google should "get that MESS OFF the Internet".

Students should be told the truth: Regression is powerful stuff. Sometimes the results can be valuable. The Google OP is an introduction to regression and does have some value. But the Google material is a MESS, and students should be informed that they are getting really low quality material and should see some references to some beautifully polished material.

So, I helped any students who would be the target audience for the Google OP.

You should know this; I believe you do.

I'm offended by the mess and passing that out to students trying to learn. You should also be offended.


Google dumbs everything down because they think everyone is dumb. I have learned to avoid their documentation and attempts to teach the populace.


Correction of a typo:

> So, the function f: R*n --> R where for each x f(x)

should read

So, the function f: R^n --> R where for each x f(x)

Excuse: Just now I'm using a keyboard on a laptop, and I'm not used to the keyboard yet.


http://p.migdal.pl/2017/04/30/teaching-deep-learning.html -> "What mathematicians think I do"

(Full disclaimer - I did theoretical physics, so I understand both sides. :))


You should recheck your definitions on convexity.

>function f: R^n --> R where for all x in R^n f(x) = 0

This hyperplane is not convex. A convex curve by definition can not be equal to its tangent at any point.

Edit: I should specify, I mean a convex curve cannot be completely equal to any of its tangents, obviously it will equal each tangent at a single point.


You don't want to consider just "tangents" and, instead, consider what I defined as supporting hyperplanes of the epigraph and subgradients of the function. If the gradient exists, that is, if the function is differentiable, then the subgradient really is a tangent. Otherwise can have many different subgradients supporting at one point on the curve and its epigraph.

It's simple: A cube has supporting planes at each point that is an edge or corner, but those points do not have tangents.


It sounds like you are describing curves that are strictly convex. Curves that are convex, but not strictly convex, can intersect their tangents at more than one point, or even at every point.

I'm going by the definition of convex function given in Rudin's "Principles of Mathematical Analysis", Apostol's "Calculus", Wikipedia, and MathWorld.


Fair enough. I suppose pointing out that the authors merely omitted "strictly" wouldn't have served GP's point as well.


Strictly convex need have no role in this Google ML material. Just convex is enough.


> This hyperplane is not convex. A convex curve by definition can not be equal to its tangent at any point.

No, my math is fully correct, and your claim is wrong.

For a lecture Convexity 101, see my

https://news.ycombinator.com/item?id=16498564


Is there something like this for Java programmers?


ML concepts are independent of programming language. I suggest not letting language preferences stand in the way of listening to the lectures and understanding the subject.

For implementing exercises using Java, you have a bunch of good options:

1) The most direct equivalent to Pandas+TensorFlow I can think of is DL4J. They have a good, comprehensive set of concept and implementation tutorials [1].

2) The TF APIs have a Java port and can be used from Java desktop and console applications [2]. So a second but slightly more difficult option is using the TF Java port + Spark APIs.

[1]: https://deeplearning4j.org/documentation

[2]: https://www.tensorflow.org/install/install_java


I think Weka gets used in the java world. https://www.cs.waikato.ac.nz/ml/weka/


Which languages and libraries does it use? Is it based on Google libs, like TensorFlow?


Clicking through to the homepage, the full title is: "Machine Learning Crash Course with TensorFlow APIs".



