
I didn't imagine Julia had the maturity to have a book like this written about it. Dives in.

Does anyone have anything to say about Julia's benefits vs. R or even Python (SciPy, NumPy, etc.)? I'm in a machine learning course this semester and we have a choice of language, and I'm wondering if it's worth it to try to use Julia rather than Python since it's so hip. (Just kidding about it being hip, but it would be interesting to learn something with increasing developer support.)




In my opinion, as a contributor to Julia and someone who teaches machine learning with R: start with R. Things will "just work" for the most part and you won't have to worry about whether your packages will work while you are learning ML. I recommend the "caret" package in particular: it puts all the ML packages behind a nice common interface and has goodies like cross-validation and train/test splits built in.

Python with Scikit-learn could be a good choice too from everything I hear (possibly even better, by some accounts).

To be clear, Julia is more than capable of doing ML, but I'd say that interface-wise it's not quite there yet. Most of the pieces are there, everything from DataFrames to wrappers for GLMNet to random forests, and even the deep learning library Mocha.jl (check it out, it's fantastic!). If you were to implement a new ML algorithm, I'd want to be doing it in Julia - it'll perform great without having to get into a multi-language scenario (like R+Rcpp or Python+??? [Numba?]).
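For a rough idea of what that looks like in practice, here's a minimal sketch of training a random forest in pure Julia with the DecisionTree.jl package (argument order and defaults are from memory and vary between versions):

    using DecisionTree  # third-party package

    # toy data: 100 samples, 4 features, two classes
    features = rand(100, 4)
    labels = rand(["a", "b"], 100)

    # build_forest(labels, features, n_subfeatures, n_trees) in older releases
    model = build_forest(labels, features, 2, 10)
    predictions = apply_forest(model, features)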


Cython is usually the best option for maintainable, high-performance numerics with Python these days -- it's the option of choice for scikit-learn and pandas. Numba can be easier in some cases (no manual typing necessary) but is still pretty limited in some ways.


Is numba poised to overcome these limitations at some point?


There was one time I created a random forest in Julia because I was frustrated with the training time in R. It completed training in less than 1/20th of the time that R took.

And then 2 weeks later, it wouldn't compile. Ah, the joys of the cutting edge.


I have run into several such issues with Python and R as well. Especially with CRAN packages, which are often poorly written, black-boxy, and don't have intuitive syntax.


How can an R package be "black-boxy"? Just look at the source. All the random forest packages I am aware of (randomForest, party, and randomForestSRC) are well documented.


As it happens, I recently completed a machine learning class, and I used Julia for almost all of the exercises. The core language is outstanding for this use case; IMO, where it suffers is the lack of third-party libraries. I wasn't able to find a visualization library I really liked, and for the more advanced work, I wound up using PyCall[0] to call out to Python in order to use scikit-learn.
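Roughly, the PyCall route looks like this (a sketch from memory, not my exact code; the member-access syntax differs between PyCall versions, e.g. clf.fit vs. clf[:fit]):

    using PyCall

    # import scikit-learn's ensemble module through the Python interpreter
    ensemble = pyimport("sklearn.ensemble")

    X = rand(100, 4)      # toy features
    y = rand(0:1, 100)    # toy binary labels

    clf = ensemble.RandomForestClassifier(n_estimators=100)
    clf.fit(X, y)         # older PyCall needs clf[:fit](X, y)
    preds = clf.predict(X)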

I'd say give it a shot. Julia is a really impressive language. I find it as easy and expressive as Python, but it's blazingly fast, offering near-native performance.

[0] https://github.com/stevengj/PyCall.jl


I'm not in a position to rate Python vs. Julia, but the lecture notes were originally published using Python.

You can access the Python lectures at quant-econ.net

It even has a comparison: http://quant-econ.net/python_or_julia.html


I took a machine learning course last spring, and I also had the choice of language to use. I thought about Julia, but I just felt I was going to be taking on too much in a short amount of time.

I ended up using Weka for parts of the class, and Python with scikit-learn for the other parts.

However, I would say I regret not trying R. The other students in my course really liked it.



