
I didn't imagine Julia had the maturity to have a book like this written about it. Dives in.

Does anyone have anything to say about Julia's benefits vs. R or even Python (SciPy, NumPy, etc.)? I'm in a machine learning course this semester and we have a choice of language, and I'm wondering if it's worth it to try to use Julia rather than Python since it's so hip. (Just kidding about it being hip, but it would be interesting to learn something with increasing developer support.)




In my opinion, as a contributor to Julia and someone who teaches machine learning with R: start with R. Things will "just work" for the most part and you won't have to worry about whether your packages will work while you are learning ML. I recommend the "caret" package in particular: it puts all the ML packages behind a nice common interface and has goodies like cross-validation and train/test splits built in.

Python with Scikit-learn could be a good choice too from everything I hear (possibly even better, by some accounts).

To be clear, Julia is more than capable of doing ML, but I'd say that interface-wise it's not quite there yet. Most of the pieces are there, everything from DataFrames to wrappers for GLMNet to random forests, and even the deep learning library Mocha.jl (check it out, it's fantastic!). If you were to implement a new ML algorithm, I'd want to be doing it in Julia - it'll perform great without having to get into a multi-language scenario (like R+Rcpp or Python+??? [Numba?]).
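For a rough idea of what that looks like in practice, here's a minimal sketch of training a random forest in pure Julia with the DecisionTree.jl package (argument order and defaults are from memory and vary between versions):

    using DecisionTree  # third-party package

    # toy data: 100 samples, 4 features, two classes
    features = rand(100, 4)
    labels = rand(["a", "b"], 100)

    # build_forest(labels, features, n_subfeatures, n_trees) in older releases
    model = build_forest(labels, features, 2, 10)
    predictions = apply_forest(model, features)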


Cython is usually the best option for maintainable, high-performance numerics with Python these days -- it's the option of choice for scikit-learn and pandas. Numba can be easier in some cases (no manual typing necessary) but is still pretty limited in some ways.


Is numba poised to overcome these limitations at some point?


There was one time I created a random forest in Julia because I was frustrated with the training time in R. It completed training in less than 1/20th of the time that R took.

And then 2 weeks later, it wouldn't compile. Ah, the joys of the cutting edge.


I have run into several such issues with Python and R as well. Especially with CRAN packages, which are often poorly written, black-boxy, and don't have intuitive syntax.


How can an R package be "black-boxy"? Just look at the source. All the random forest packages I am aware of (randomForest, party, and randomForestSRC) are well documented.


As it happens, I recently completed a machine learning class, and I used Julia for almost all of the exercises. The core language is outstanding for this use case; IMO, where it suffers is the lack of third-party libraries. I wasn't able to find a visualization library I really liked, and for the more advanced work, I wound up using PyCall[0] to call out to Python in order to use scikit-learn.
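Roughly, the PyCall route looks like this (a sketch from memory, not my exact code; the member-access syntax differs between PyCall versions, e.g. clf.fit vs. clf[:fit]):

    using PyCall

    # import scikit-learn's ensemble module through the Python interpreter
    ensemble = pyimport("sklearn.ensemble")

    X = rand(100, 4)      # toy features
    y = rand(0:1, 100)    # toy binary labels

    clf = ensemble.RandomForestClassifier(n_estimators=100)
    clf.fit(X, y)         # older PyCall needs clf[:fit](X, y)
    preds = clf.predict(X)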

I'd say give it a shot. Julia is a really impressive language. I find it as easy and expressive as Python, but it's blazingly fast, offering near-native performance.

[0] https://github.com/stevengj/PyCall.jl


I'm not in a position to rate Python vs. Julia, but the lecture notes were originally published using Python.

You can access the Python lectures at quant-econ.net

It even has a comparison: http://quant-econ.net/python_or_julia.html


I took a machine learning course last spring, and I also had the choice of language to use. I thought about Julia, but I just felt I was going to be taking on too much in a short amount of time.

I ended up using Weka for parts of the class, and Python with scikit-learn for the other parts.

However, I would say I regret not trying R. The other students in my course really liked it.



