Ask HN: How do I start with Machine Learning?

wcsun · on June 12, 2011

First, finish lectures by Professor Gilbert Strang. http://web.mit.edu/18.06/www/

As I recall, the session notes of CS229 are good enough for understanding SVMs and Gaussian distributions. Also watch the YouTube videos. http://www.stanford.edu/class/cs229/materials.html http://www.youtube.com/watch?v=UzxYlbK2c7E

If you just want to use the libraries, you can stop here.

If you want to know more, read chapters 1-3 of nonlinear programming by Professor Dimitri Bertsekas before convex optimization. http://www.athenasc.com/nonlinbook.html

Then, you can try to finish EE364 and watch the videos. http://www.stanford.edu/class/ee364a/ http://www.youtube.com/watch?v=McLq1hEq3UY

If you want to roll your own algorithms, you have to know some optimization tools. http://cvxr.com/cvx/

And there is some statistics knowledge you have to fill in. I used these: http://www.stat.umn.edu/geyer/5101/ http://www.stat.umn.edu/geyer/5102/ R is used in the courses.

phektus · on June 12, 2011

Thanks! For an achievable short term goal I just wish to use the libraries first so I can roll my own simple apps. This way I get to learn the basics while making use of what I already know (build web apps). An integration of sorts, should keep me motivated all throughout. Eventually I'll go deeper, and will definitely work on the advanced topics you posted.

Wump · on June 12, 2011

Programming Collective Intelligence (http://www.amazon.com/Programming-Collective-Intelligence-Bu...) is a great resource. It serves as a practical introduction to several different machine learning algorithms. Although they are presented from a specific perspective (collaborative filtering), many of the techniques are general and are used across machine learning. The explanations of the algorithms are clear, simple, and the author does a nice job of building up the level of complexity over the course of the book. Also, you will get much more out of it if you follow along with the provided python-based implementations.

phektus · on June 12, 2011

This is great! Python is actually my favorite language as of the moment, using it on freelance work as well as personal projects. Thanks for the link

dvse · on June 12, 2011

Start with the MIT linear algebra course (18.06) by Gilbert Strang and Stanford course on linear dynamical systems (EE263) by Stephen Boyd. Then move on to Boyd's course on convex optimization (EE364). Lectures for all of these are on youtube.

Do not try to read any books on "machine learning" (most of which are a total mess) before you have this background or you will just end up hopelessly confused.

phektus · on June 12, 2011

Great! thanks for the tip, didn't realize youtube could be a better help to me than most books about the subject at my level

siddhant · on June 12, 2011

Take a look at Machine Learning video lectures by Professor Andrew Ng (Stanford). Highly recommended. http://www.youtube.com/view_play_list?p=A89DCFA6ADACE599

You'll also find a bunch of resources on this page - http://www.quora.com/Machine-Learning/What-are-some-good-res...

vecter · on June 12, 2011

I highly recommend Andrew Ng's lectures also. I was going through them myself, and he does a good job of explaining the intuition behind many things (of which the math, although important, is only a formalization of).

One piece of advice I would give from my own experience is that you have to play with data to get practical experience applying the methods. Machine learning is not a set of plug and play blackboxes that you can feed random input into and get clean output. You have to spend a lot of effort understanding your data and how they relate to the specific method you're using. For example, if you use linear regression as your learning model, you have to understand what kind of relationship is assumed between the inputs and outputs (in this case, that the output is a linear combination of the inputs).
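
To make the linear regression point concrete, here is a minimal sketch in plain Python. The helper names (fit_line, r_squared) and the toy data are made up for illustration, not taken from any library. It fits a straight line to data that really is linear and to data that is quadratic, then compares how much of the variation the line explains.

```python
# Ordinary least squares for a single input, standard library only.
# fit_line and r_squared are hypothetical helper names for this sketch.

def fit_line(xs, ys):
    """Return (slope, intercept) that minimize the squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


xs = [x / 2 for x in range(-10, 11)]   # -5.0, -4.5, ..., 5.0
linear_ys = [2 * x + 1 for x in xs]    # matches the model's assumption
quadratic_ys = [x * x for x in xs]     # violates it

r2_linear = r_squared(xs, linear_ys, *fit_line(xs, linear_ys))
r2_quadratic = r_squared(xs, quadratic_ys, *fit_line(xs, quadratic_ys))

print("R^2 on linear data:   ", r2_linear)
print("R^2 on quadratic data:", r2_quadratic)
```

On the quadratic data the best straight line is flat, so it explains essentially none of the variation, even though the relationship between input and output is perfectly deterministic. Transforming the input (for example feeding in x squared) would fix that, which is the kind of thing you only notice by looking at the data.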

I know this because when I started, I would just toss unclean, unfiltered, and untransformed data into a method and hope for good results. Of course I fed garbage in, so I got garbage out.

Another word of advice is to watch out for overfitting. Often, you'll find that your training gives you good in-sample statistics (for example, with linear regression you'll get great R^2 with low p-values). However, when you test out of sample, you'll realize quickly that most of the models you've fit are overfit to the data that you trained on. Just something to be aware of.
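
A rough sketch of what that looks like in practice, again in plain Python with made-up names and synthetic data. As a deliberately overfit model it uses a nearest-neighbour lookup that simply memorizes the training set (my choice for illustration, not a regression), and compares it with a straight-line fit on noisy linear data.

```python
import random

# Synthetic data: y = 2x + 1 plus Gaussian noise. The seed, sizes and
# helper names are arbitrary choices for this sketch.
rng = random.Random(0)


def sample(n):
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Least-squares slope and intercept for one input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(predict, xs, ys):
    """Mean squared error of a prediction function on a data set."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


train_x, train_y = sample(300)
test_x, test_y = sample(1000)   # out-of-sample data, never used for fitting

slope, intercept = fit_line(train_x, train_y)


def line_model(x):
    return slope * x + intercept


train_pairs = list(zip(train_x, train_y))


def memorizer(x):
    """Return the y of the closest training x: fits the noise exactly."""
    _, nearest_y = min(train_pairs, key=lambda pair: abs(pair[0] - x))
    return nearest_y


line_train_mse = mse(line_model, train_x, train_y)
line_test_mse = mse(line_model, test_x, test_y)
memorizer_train_mse = mse(memorizer, train_x, train_y)
memorizer_test_mse = mse(memorizer, test_x, test_y)

print("line:      train", line_train_mse, "test", line_test_mse)
print("memorizer: train", memorizer_train_mse, "test", memorizer_test_mse)
```

The memorizer looks perfect in sample because it reproduces every training point, noise included, and then does worse than the plain line on data it has not seen. Holding some data back and checking on it is the simplest way to catch this.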

I guess both of these may be very abstract and useless for you right now, but hopefully one day you'll look back and be able to find a use for them.

phektus · on June 12, 2011

Yeap they are indeed abstract concepts to me right now, but it's great to have them mentioned, because they'll definitely come in handy later. thanks!

phektus · on June 12, 2011

Thanks! actually saw the first A.Ng lectures already, and he framed all concepts by way of linear algebra, which made me stumble on just the second video

dvse · on June 12, 2011

Definitely give the courses I've mentioned in another comment a try - the material will make a lot more sense. Also keep in mind that Andrew Ng is a rather poor lecturer and many of his explanations are unclear or incomplete.

phektus · on June 12, 2011

will do! thanks!

earl · on June 12, 2011

Then you know where to start. Read a good linear algebra text; I'd suggest Strang. He also has lectures online.

If you stumble on those, do what I just suggested for whatever you stumble on.

phektus · on June 12, 2011

a recursive solution (or is it?), thanks!

kubrickslair · on June 12, 2011

It depends on what you want to do as a researcher. If you want to prove theorems in machine learning theory, courses like convex optimization make sense. I would also add statistical machine learning (CMU 10-702) for a broader view.

But a large part of ML research is not theoretical stuff, and involves building real systems. And in that case you should get a cursory overview of ML and then focus on the subdomain you may be interested in. By cursory overview I mean getting a gist of things like graphical models, SVMs etc.- a typical entry level grad ML course. This should be enough when you delve deep into your domain of choice.

dvse · on June 12, 2011

Not sure this is good advice - without background I have mentioned there is no chance of actually making sense of the material in the CMU course - note how linear algebra is a prerequisite and they cover convex optimization at the very beginning of the class (even though it is not possible to do the subject justice in 2 lectures).

kubrickslair · on June 12, 2011

Yes, you are correct. Sorry for the confusion; I meant that courses like 10-702 should be the star course he should work at, to have some shot at understanding and contributing to the theory-heavy side of ML.

The first basic course for the latter approach can be 10-601, it's a machine learning course meant for senior undergrads. Look at schedule and assignments.

http://www.cs.cmu.edu/~roni/10601-s10/

Also since you have interest in text mining etc. and if you want to focus on that, you can skip a lot of courses, and go directly to language and statistics-1. It's a great course if you wish to work primarily with text, and it covers the prerequisite ML.

http://www.cs.cmu.edu/~roni/11761/

phektus · on June 12, 2011

cool, thanks! bookmarking those, especially great for the text mining related course

kubrickslair · on June 12, 2011

If you are interested in a course, save its slides and assignments. They will be down in a few months. They keep recycling the course sites whenever a new semester starts.

phektus · on June 12, 2011

thanks! thought of asking what an SVM is but I guess google gave it up as first result:

http://en.wikipedia.org/wiki/Support_vector_machine

tumanian · on June 13, 2011

Check out this: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122...

tumanian · on June 12, 2011

A great textbook as an intro is by Duda and Hart, Pattern Classification. http://www.amazon.com/Pattern-Classification-2nd-Richard-Dud... It's pretty well written and gives a good overview of the main techniques. If you want a bit more theory, try Cherkassky and Mulier, "Learning from Data". http://www.amazon.com/Learning-Data-Concepts-Theory-Methods/ It has a good overview section on statistical learning theory. Also, take WEKA and just play with it. It's nice to just check what works and what doesn't.

phektus · on June 12, 2011

These are great books, thanks! Haven't heard of WEKA but it sure looks pretty nice, much like MLDemos?

http://www.cs.waikato.ac.nz/ml/weka/

bravura · on June 12, 2011

What are the best resources to use when starting machine learning for an experienced programmer?

http://metaoptimize.com/qa/questions/334/what-are-the-best-r...

As other commentators have said, try to build something and ask for help along the way, unless your goal is to be a theoretician (which I assume it is not).

phektus · on June 12, 2011

Thanks for the link! You're mostly correct, I wish to build stuff around, but I also want to get deep into the theory. Just deep enough to be able to try out stuff I guess.

ynn4k · on June 12, 2011

Machine learning consists of parametric optimization to reduce some error function on training data, then using the learnt parametric model on unseen/held-out data and evaluating it. The first step is to construct the right parametric model by studying the data domain, and then iterate until performance reaches an acceptable level. Machine learning research is highly mathematical, but you can start by using some open source ML tools and tweaking the models to get a feel for the capacities of different models.
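
As a minimal sketch of that loop (plain Python, noise-free toy data, with a learning rate and step count picked by hand for this example): the parametric model is y = w*x + b, the error function is mean squared error on the training split, and the learnt parameters are then evaluated on points that were held out.

```python
# Toy data generated from y = 3x - 2. Every other point is held out.
# All names and constants here are illustrative assumptions.
all_x = [i / 20 for i in range(20)]
all_y = [3 * x - 2 for x in all_x]

train_x, train_y = all_x[0::2], all_y[0::2]
held_x, held_y = all_x[1::2], all_y[1::2]


def mean_squared_error(w, b, xs, ys):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


w, b = 0.0, 0.0          # initial parameters
learning_rate = 0.1
n = len(train_x)

for step in range(2000):
    # Gradient of the training MSE with respect to w and b.
    errors = [w * x + b - y for x, y in zip(train_x, train_y)]
    grad_w = 2 * sum(e * x for e, x in zip(errors, train_x)) / n
    grad_b = 2 * sum(errors) / n
    # Move the parameters a small step against the gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

train_mse = mean_squared_error(w, b, train_x, train_y)
held_out_mse = mean_squared_error(w, b, held_x, held_y)

print("learnt w, b:", w, b)
print("train MSE:", train_mse, "held-out MSE:", held_out_mse)
```

Here the model family matches how the data was generated, so the held-out error ends up tiny. With real data the interesting part is what happens when it does not, which is where studying the data domain and iterating on the model comes in.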

Some topics you should familiarize yourself with are: Probability Theory, EVD/SVD, ANN, ML/MAP estimation, Minimum classification error training, SVM, LMS fitting, PCA/ICA, FSM and HMM.

phektus · on June 12, 2011

Thanks! Once I've brushed up on the math, I'll be diving deeper into those topics, which seem to be math-intensive

dstein64 · on June 12, 2011

I have not checked these out, but I know that there is a Hacker Dojo course that posts course material to the following site.

http://machinelearning101.pbworks.com/w/page/32890312/FrontP...

I have seen some of the lectures and notes posted on the following Stanford CS229 site. However, they will probably be hard to follow prior to learning some linear algebra.

http://see.stanford.edu/see/courseinfo.aspx?coll=348ca38a-3a...

phektus · on June 12, 2011

going to my bookmarks, and, yeah, got to get that linear algebra problem sorted out, could take me months

dstein64 · on June 12, 2011

The Khan Academy site has linear algebra videos for free: http://www.khanacademy.org/#linear-algebra

Also, Octave is a free software package similar to Matlab that I imagine could be useful when learning linear algebra, to see instant results for problems that you are trying to solve without a computer.
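
The same kind of check can be sketched in plain Python if that is what you already have installed. This is only an illustration with a made-up 2x2 system, not Octave code: it solves the system with Cramer's rule using exact fractions and then substitutes the answer back in.

```python
from fractions import Fraction

# Example system (chosen arbitrarily for this sketch):
#   2x + 1y = 5
#   1x - 1y = 1
a11, a12, b1 = Fraction(2), Fraction(1), Fraction(5)
a21, a22, b2 = Fraction(1), Fraction(-1), Fraction(1)

# Cramer's rule for a 2x2 system; requires a non-zero determinant.
det = a11 * a22 - a12 * a21
x = (b1 * a22 - a12 * b2) / det
y = (a11 * b2 - b1 * a21) / det

print("x =", x, "y =", y)

# Substitute back to confirm both equations hold exactly.
print(a11 * x + a12 * y == b1, a21 * x + a22 * y == b2)
```

Working a problem by hand first and then comparing against the computed x and y is a quick way to catch arithmetic slips.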

pgbovine · on June 12, 2011

MLDemos is a great free cross-platform GUI program that you can use to play around with various algorithms and visualize their effects: http://mldemos.epfl.ch/

phektus · on June 12, 2011

This looks sleek! thanks!

helwr · on June 12, 2011

some good toy projects to start with: http://www.quora.com/Programming-Challenges-1/What-are-some-...

for more see http://www.quora.com/Machine-Learning/What-are-some-good-lea...

phektus · on June 12, 2011

bookmarked, Thanks!

sameep · on June 12, 2011

For a broad(er) perspective on AI, check out UC Berkeley's CS188 http://inst.eecs.berkeley.edu/~cs188. I think it's a good entry course into AI. One advantage is that the math requirement is not as high as Stanford's CS229 or EE263 (both of which are fantastic courses to be clear, but are easier to appreciate with the correct background).

phektus · on June 12, 2011

Bookmarked! Thanks!

I know Machine Learning is a subset of AI, but lately I'm beginning to see it more as falling under Statistics. That's just my impression, I could really be wrong given my very limited knowledge.

sameep · on June 12, 2011

Yep your intuition is good -- a lot of the mathematical techniques used in Machine Learning are motivated/derived/understood from Statistics. I'd recommend starting with a broader approach to this topic. In particular, there are often simpler heuristic approaches to a problem that are reasonably good and worth trying before trying to build a full-blown ML system. What I've learned is that knowing where ML systems fail/are overkill is just as important as knowing when to use them/build them.

disgruntledphd · on June 12, 2011

If you have already brushed up on linear algebra and calculus (which you're gonna need if you want to do any serious ML) take a look at Hastie et al's Elements of Statistical Learning. The PDF is free, and the book is both extremely well written and super comprehensive. http://www-stat.stanford.edu/~tibs/ElemStatLearn/

You might also want to check out R, as it's an amazing statistics language which has hundreds of packages available for ML. There's a large user community, and the really obscure error messages you get will teach you a lot about statistics. http://cran.r-project.org/

Also, a lot of machine learning is getting the data into a usable form, so learn how to use Unix command line tools such as sed, awk, grep et al. They are absolute lifesavers.

derrida · on June 12, 2011

On the other issue, re the 'horrible at math', if you can manage, work through 'Project Euler'(google it), say one a day or 3-4 a week. If you can do more, great, but the aim is to get a long-term habit going. Get into the habit of doing a few of those problems a week and you will have a working capacity with 'real math', which is more than remembering formulas.

phektus · on June 12, 2011

Yeah I already played around on that site when learning Python. I actually got as far as (I think) problem #34. Then I sort of stopped, because I couldn't solve it without resorting to endless nested loops, which made me think I need to brush up on my mathematics because it is somehow limiting me when dealing with some problems. Kind of a chicken and egg situation as far as I can tell, now that you mention it.

Tichy · on June 12, 2011

Wish I would finally follow my own advice, but I think it would be best to pick a problem and start hacking. There are lots of data sets available on the net, and competitions like the Netflix prize, too.

A while ago there was a story on HN about somebody who built a recommender system for boardgamegeek.com, for example. I thought that was inspiring.

phektus · on June 12, 2011

I believe this to be sound advice, so what I'm actually thinking of right now is to go and pace slowly on the math/theory, while starting a super beginner project to get my feet wet.

alantrrs · on June 12, 2011

MIT OpenCourseWare. It's like going to college, but free xD http://ocw.mit.edu/courses/electrical-engineering-and-comput...