Ask HN: How do I start with Machine Learning?
51 points by phektus on June 12, 2011 | 43 comments
My last company dabbled in the usual web data mining, but with a text mining / NLP twist for ads. It later tried to apply machine learning, but it closed down in our locality.

I wish to continue learning and researching it. I am horrible at math, but I'm trying to change that, so I'm re-learning high school algebra as a starting point. I know some machine learning resources on the web lean heavily on linear algebra, so that would be my long-term goal.

What I wish to be is an active researcher in the field of machine learning. An independent researcher, which I know is possible since, from what I understand, one doesn't need government approval or very large funds to start.

Going to university is out of the question. I have to do the research myself with my own resources.

Any help would be highly appreciated ;-)




First, finish the lectures by Professor Gilbert Strang. http://web.mit.edu/18.06/www/

To my memory, the session notes of CS229 are good enough for understanding SVMs and Gaussian distributions. Also watch the YouTube videos. http://www.stanford.edu/class/cs229/materials.html http://www.youtube.com/watch?v=UzxYlbK2c7E

If you just want to use the libraries, you can stop here.

If you want to know more, read chapters 1-3 of Nonlinear Programming by Professor Dimitri Bertsekas before moving on to convex optimization. http://www.athenasc.com/nonlinbook.html

Then, you can try to finish EE364 and watch the videos. http://www.stanford.edu/class/ee364a/ http://www.youtube.com/watch?v=McLq1hEq3UY

If you want to roll your own algorithms, you have to know some optimization tools. http://cvxr.com/cvx/

And there is some statistics knowledge you have to fill in. I used these: http://www.stat.umn.edu/geyer/5101/ http://www.stat.umn.edu/geyer/5102/ R is used in the courses.


Thanks! For an achievable short-term goal, I just wish to use the libraries first so I can roll my own simple apps. This way I get to learn the basics while making use of what I already know (building web apps). An integration of sorts, which should keep me motivated throughout. Eventually I'll go deeper, and will definitely work on the advanced topics you posted.


Programming Collective Intelligence (http://www.amazon.com/Programming-Collective-Intelligence-Bu...) is a great resource. It serves as a practical introduction to several different machine learning algorithms. Although they are presented from a specific perspective (collaborative filtering), many of the techniques are general and are used across machine learning. The explanations of the algorithms are clear, simple, and the author does a nice job of building up the level of complexity over the course of the book. Also, you will get much more out of it if you follow along with the provided python-based implementations.


This is great! Python is actually my favorite language at the moment; I use it for freelance work as well as personal projects. Thanks for the link!


Start with the MIT linear algebra course (18.06) by Gilbert Strang and Stanford course on linear dynamical systems (EE263) by Stephen Boyd. Then move on to Boyd's course on convex optimization (EE364). Lectures for all of these are on youtube.

Do not try to read any books on "machine learning" (most of which are a total mess) before you have this background or you will just end up hopelessly confused.


Great! Thanks for the tip; I didn't realize YouTube could be a better help to me than most books about the subject at my level.


Take a look at Machine Learning video lectures by Professor Andrew Ng (Stanford). Highly recommended. http://www.youtube.com/view_play_list?p=A89DCFA6ADACE599

You'll also find a bunch of resources on this page - http://www.quora.com/Machine-Learning/What-are-some-good-res...


I highly recommend Andrew Ng's lectures also. I was going through them myself, and he does a good job of explaining the intuition behind many things (which the math, although important, only formalizes).

One piece of advice I would give from my own experience is that you have to play with data to get practical experience applying the methods. Machine learning is not a set of plug and play blackboxes that you can feed random input into and get clean output. You have to spend a lot of effort understanding your data and how they relate to the specific method you're using. For example, if you use linear regression as your learning model, you have to understand what kind of relationship is assumed between the inputs and outputs (in this case, that the output is a linear combination of the inputs).

I know this because when I started, I would just toss unclean, unfiltered, and untransformed data into a method and hope for good results. Of course I fed garbage in, so I got garbage out.
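
To make the linear regression point concrete, here is a minimal sketch in plain Python. The data is made up for illustration, and the helper names fit_line and r_squared are hypothetical, not from any library. It fits a straight line to an output that is actually the square of the input, then fits again after transforming the input so that the linear assumption holds.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b for a single input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def r_squared(xs, ys, a, b):
    """Fraction of the variance in ys explained by the line a*x + b."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Made-up data: the output is the square of the input, not a linear function.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

# Raw input: the best straight line is flat and explains nothing.
a, b = fit_line(xs, ys)
r2_raw = r_squared(xs, ys, a, b)

# Transformed input (x squared): now the linear assumption holds.
xs_sq = [x * x for x in xs]
a2, b2 = fit_line(xs_sq, ys)
r2_transformed = r_squared(xs_sq, ys, a2, b2)

print(r2_raw, r2_transformed)
```

On this toy data the raw fit has an R^2 of 0 and the transformed fit has an R^2 of 1. The method is the same in both runs; only the understanding of how the input relates to the output changed.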

Another word of advice is to watch out for overfitting. Often, you'll find that your training gives you good in-sample statistics (for example, with linear regression you'll get a great R^2 with low p-values). However, when you test out of sample, you'll quickly realize that most of the models you've fit are overfit to the data you trained on. Just something to be aware of.

I guess both of these may be very abstract and useless for you right now, but hopefully one day you'll look back and be able to find a use for them.
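
A small sketch of the in-sample versus out-of-sample gap, in plain Python with made-up numbers. To keep it short, the overfit model here is a nearest-neighbor "memorizer" rather than an overfit regression; that swap is an assumption of this example, not something from the comment. The names make_memorizer, make_line, and mse are illustrative.

```python
def mse(model, xs, ys):
    """Mean squared error of a model (a function x -> prediction)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_memorizer(train_x, train_y):
    """Overfit model: return the y of the closest training point."""
    def predict(x):
        nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
        return train_y[nearest]
    return predict


def make_line(train_x, train_y):
    """Simple model: least-squares straight line."""
    n = len(train_x)
    mean_x = sum(train_x) / n
    mean_y = sum(train_y) / n
    sxx = sum((x - mean_x) ** 2 for x in train_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


# Made-up data: roughly y = 2x, with a fixed +1/-1 wobble standing in for noise.
train_x = [0.0, 2.0, 4.0, 6.0, 8.0]
train_y = [1.0, 3.0, 9.0, 11.0, 17.0]
test_x = [0.5, 2.5, 4.5, 6.5]
test_y = [0.0, 6.0, 8.0, 14.0]

memorizer = make_memorizer(train_x, train_y)
line = make_line(train_x, train_y)

for name, model in [("memorizer", memorizer), ("line", line)]:
    print(name,
          "train MSE:", round(mse(model, train_x, train_y), 2),
          "test MSE:", round(mse(model, test_x, test_y), 2))
```

The memorizer is perfect on the data it was trained on and noticeably worse than the plain line on the held-out points, which is the pattern to watch for.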


Yep, they are indeed abstract concepts to me right now, but it's great to have them mentioned, because they'll definitely come in handy later. Thanks!


Thanks! I actually saw the first A. Ng lectures already, and he framed all the concepts in terms of linear algebra, which made me stumble on just the second video.


Definitely give the courses I've mentioned in another comment a try - the material will make a lot more sense. Also keep in mind that Andrew Ng is a rather poor lecturer and many of his explanations are unclear or incomplete.


will do! thanks!


Then you know where to start. Read a good linear algebra text; I'd suggest Strang. He also has lectures online.

If you stumble on those, do what I just suggested for whatever you stumble on.


a recursive solution (or is it?), thanks!


It depends on what you want to do as a researcher. If you want to prove theorems in machine learning theory, courses like convex optimization make sense. I would also add statistical machine learning (CMU 10-702) for a broader view.

But a large part of ML research is not theoretical, and involves building real systems. In that case you should get a cursory overview of ML and then focus on the subdomain you are interested in. By cursory overview I mean getting the gist of things like graphical models, SVMs, etc., as in a typical entry-level grad ML course. This should be enough when you delve deep into your domain of choice.


Not sure this is good advice - without the background I have mentioned there is no chance of actually making sense of the material in the CMU course - note how linear algebra is a prerequisite and they cover convex optimization at the very beginning of the class (even though it is not possible to do the subject justice in 2 lectures).


Yes, you are correct. Sorry for the confusion; I meant that courses like 10-702 should be the star course he works toward, to have some shot at understanding and contributing to the theory-heavy side of ML.

The first basic course for the latter approach can be 10-601, a machine learning course meant for senior undergrads. Look at the schedule and assignments.

http://www.cs.cmu.edu/~roni/10601-s10/

Also, since you are interested in text mining: if you want to focus on that, you can skip a lot of courses and go directly to Language and Statistics-1. It's a great course if you wish to work primarily with text, and it covers the prerequisite ML.

http://www.cs.cmu.edu/~roni/11761/


Cool, thanks! Bookmarking those; the text mining course is especially great.


If you are interested in a course, save its slides and assignments. They will be down in a few months. They keep recycling the course sites whenever a new semester starts.


Thanks! I thought of asking what an SVM is, but Google gave it as the first result:

http://en.wikipedia.org/wiki/Support_vector_machine



A great textbook as an intro is Pattern Classification by Duda and Hart. http://www.amazon.com/Pattern-Classification-2nd-Richard-Dud.... It's pretty well written and gives a good overview of the main techniques. If you want a bit more theory, try Cherkassky and Mulier, "Learning from Data". http://www.amazon.com/Learning-Data-Concepts-Theory-Methods/ It has a good overview section on statistical learning theory. Also, take WEKA and just play with it. It's nice to just check what works and what doesn't.


These are great books, thanks! Haven't heard of WEKA but it sure looks pretty nice, much like MLDemos?

http://www.cs.waikato.ac.nz/ml/weka/


What are the best resources to use when starting machine learning for an experienced programmer?

http://metaoptimize.com/qa/questions/334/what-are-the-best-r...

As other commentators have said, try to build something and ask for help along the way, unless your goal is to be a theoretician (which I assume it is not).


Thanks for the link! You're mostly correct: I wish to build stuff, but I also want to get deep into the theory. Just deep enough to be able to try out stuff, I guess.


Machine learning consists of parametric optimization to reduce some error function on training data, then using the learnt parametric model on unseen/held-out data to evaluate it. The first step is to construct the right parametric model by studying the data domain; then you iterate until performance reaches an acceptable level. Machine learning research is highly mathematical, but you can start by using some open source ML tools and tweaking the models to get a feel for the capacities of different models.

Some topics you should familiarize yourself with are: Probability Theory, EVD/SVD, ANN, ML/MAP estimation, Minimum classification error training, SVM, LMS fitting, PCA/ICA, FSM and HMM.
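
The loop described above (pick a parametric model, reduce an error function on training data, then evaluate on held-out data) can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the model is a straight line y = w*x + b, the error function is mean squared error, the optimizer is plain gradient descent, and the data, learning rate, and step count are made up.

```python
def mean_squared_error(w, b, xs, ys):
    """Error function: average squared gap between prediction and target."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, learning_rate=0.05, steps=2000):
    """Gradient descent on the two parameters (w, b) of the model y = w*x + b."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up data generated from y = 3x + 1, split into training and held-out sets.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [3.0 * x + 1.0 for x in train_x]
held_out_x = [5.0, 6.0]
held_out_y = [3.0 * x + 1.0 for x in held_out_x]

w, b = train(train_x, train_y)
held_out_error = mean_squared_error(w, b, held_out_x, held_out_y)

print("w =", round(w, 4), "b =", round(b, 4))
print("held-out MSE =", held_out_error)
```

With noise-free data the learnt parameters land very close to 3 and 1 and the held-out error is close to zero. Real data is rarely this kind, which is where the "iterate" step comes in: change the model or the features and measure on the held-out set again.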


Thanks! Once I've brushed up on the math, I'll be diving deeper into those topics, which seem to be math-intensive.


I have not checked these out, but I know that there is a Hacker Dojo course that posts course material to the following site.

http://machinelearning101.pbworks.com/w/page/32890312/FrontP...

I have seen some of the lectures and notes posted on the following Stanford CS229 site. However, they will probably be hard to follow prior to learning some linear algebra.

http://see.stanford.edu/see/courseinfo.aspx?coll=348ca38a-3a...


going to my bookmarks, and, yeah, got to get that linear algebra problem sorted out, could take me months


The Khan Academy site has linear algebra videos for free: http://www.khanacademy.org/#linear-algebra

Also, Octave is a free software package similar to Matlab that I imagine could be useful when learning linear algebra, to instantly check the results of problems that you are trying to solve by hand.
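
As an illustration of that kind of instant check, here is a minimal sketch in plain Python rather than Octave, since Python comes up elsewhere in this thread. The 2x2 system and the helper names det2, solve2, and matvec are made up for the example. Octave can solve the same system with its left-division operator, A \ b.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def solve2(a, b):
    """Solve the 2x2 system a * [x, y] = b using Cramer's rule."""
    d = det2(a)
    if d == 0:
        raise ValueError("matrix is singular; no unique solution")
    x = det2([[b[0], a[0][1]], [b[1], a[1][1]]]) / d
    y = det2([[a[0][0], b[0]], [a[1][0], b[1]]]) / d
    return x, y


def matvec(a, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [a[0][0] * v[0] + a[0][1] * v[1],
            a[1][0] * v[0] + a[1][1] * v[1]]


# The system:  2x + y = 5
#               x - y = 1
A = [[2, 1], [1, -1]]
b = [5, 1]

hand_answer = (2, 1)          # what you worked out on paper
computed = solve2(A, b)       # what the computer says

print("computed:", computed)
print("hand answer reproduces b:", matvec(A, hand_answer) == b)
```

Substituting the answer back into the system, as matvec does, is a check that works even without a solver: if A times your answer does not give b, the slip is in your hand calculation.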


MLDemos is a great free cross-platform GUI program that you can use to play around with various algorithms and visualize their effects: http://mldemos.epfl.ch/


This looks sleek! thanks!



bookmarked, Thanks!


For a broad(er) perspective on AI, check out UC Berkeley's CS188 http://inst.eecs.berkeley.edu/~cs188. I think it's a good entry course into AI. One advantage is that the math requirement is not as high as for Stanford's CS229 or EE263 (both of which are fantastic courses to be clear, but are easier to appreciate with the correct background).


Bookmarked! Thanks!

I know Machine Learning is a subset of AI, but lately I'm beginning to see it more as falling under Statistics. That's just my impression; I could well be wrong given my very limited knowledge.


Yep your intuition is good -- a lot of the mathematical techniques used in Machine Learning are motivated/derived/understood from Statistics. I'd recommend starting with a broader approach to this topic. In particular, there are often simpler heuristic approaches to a problem that are reasonably good and worth trying before trying to build a full-blown ML system. What I've learned is that knowing where ML systems fail/are overkill is just as important as knowing when to use them/build them.


If you have already brushed up on linear algebra and calculus (which you're gonna need if you want to do any serious ML) take a look at Hastie et al's Elements of Statistical Learning. The PDF is free, and the book is both extremely well written and super comprehensive. http://www-stat.stanford.edu/~tibs/ElemStatLearn/

You might also want to check out R, as it's an amazing statistics language which has hundreds of packages available for ML. There's a large user community, and the really obscure error messages you get will teach you a lot about statistics. http://cran.r-project.org/

Also, a lot of machine learning is getting the data into a usable form, so learn how to use Unix command line tools such as sed, awk, grep et al. They are absolute lifesavers.


On the other issue, re the 'horrible at math': if you can manage, work through 'Project Euler' (google it), say one problem a day or 3-4 a week. If you can do more, great, but the aim is to get a long-term habit going. Get into the habit of doing a few of those problems a week and you will have a working capacity with 'real math', which is more than remembering formulas.


Yeah, I already played around on that site when learning Python. I actually got as far as (I think) problem #34. Then I sort of stopped, because I couldn't solve it without resorting to endless nested loops, which made me think I need to brush up on my mathematics because it is somehow limiting me when dealing with some problems. Kind of a chicken-and-egg situation as far as I can tell, now that you mention it.


Wish I would finally follow my own advice, but I think it would be best to pick a problem and start hacking. There are lots of data sets available on the net, and competitions like the Netflix prize, too.

A while ago there was a story on HN about somebody who built a recommender system for boardgamegeek.com, for example. I thought that was inspiring.


I believe this to be sound advice, so what I'm actually thinking of right now is to pace myself slowly on the math/theory, while starting a super beginner project to get my feet wet.


MIT OpenCourseWare. It's like going to college, but free xD http://ocw.mit.edu/courses/electrical-engineering-and-comput...



