As a card-carrying Python-stack scientist who has worked at the intersection of machine learning and physical sciences for the last decade, and who is now working on an R package (pro tip: go where the money is, don't bring the money to you), can someone make a convincing argument for me to learn Julia? I would like to hear more than the typical "the code auto-differentiates" or "it's faster" or whatever else people have said in the past. I am really not trying to be flippant; I just don't see the added value of learning a new language unless it has interesting packages/functionality that my current toolset does not (e.g., this is why I am working on an R package).
I'll put it this way: I'm just an idiot engineer, not a programmer really, but I've written some blazingly fast code in Julia that would have taken me way, way longer to write, and would have run way, way slower, in other languages I've played in.
I have to shout out Chris Rackauckas for being such a badass, helpful person too. He'll probably be in this thread any minute, because he's the best damn advocate for Julia there is. :-)
Not so sure about that last part. He's definitely an incredible force in terms of coding, project management, stuff like that. The Julia package ecosystem wouldn't be half of what it is without him.
But "best damn advocate" is probably the opposite of how I'd describe him in terms of interactions here (and generally with people outside the core Julia circle). He very often comes across as dismissive, overly defensive, and passive aggressive in comments here. All of that is dwarfed by his package contributions tbh, in terms of impact on Julia. But still, probably half of the negative perception about Julia community that people have, come from reading these interactions.
One reason to keep an eye on Julia, if you are envisioning a long and varied career in the broader computational / data science world, is that it's not at all clear which will be the leading platform going forward (Python, R, Julia, or something else altogether).
This ambivalence might sound absurd in the face of the recent exponential growth of Python (which has apparently enticed even some people with serious mojo to get into the act), but take two steps back with me and look at the big (if still hazy) picture:
We are going through a remarkable period in which complex algorithmic applications have left academia and research labs and are diffusing into mainstream society and the economy like never before. This process carries enormous risks and opportunities, which are currently basically... ignored (well, the risk side).
Despite its undeniable strengths and loveability, Python is actually a poster child of the move-fast-and-break-things phase. It is not necessarily best placed for the next phase. The next phase will invariably see a re-examination of all aspects of the stack, and the qualities that will be prized will be those that eliminate the frictions and risks associated with the large-scale deployment of algorithms. The stakes are high, which means there will be plenty of resources seeking to create reliable platforms. The future need not look like the past.
None of the usual suspects ticks all the boxes. In fact, we don't even know all the boxes yet; it depends how fast and how seriously models and algorithms get deployed at scale. Python, Julia, and R have been propelled forward by circumstances as the main algorithm-centric platforms, and they each have their various warts and blessings, but the near- and mid-term future will test how well they can deliver on aspects they may not have been designed for.
> it's not at all clear which will be the leading platform going forward (Python, R, Julia, or something else altogether). This ambivalence might sound absurd in the face of the recent exponential growth of Python
It sounds absurd because trends don't reverse overnight. You can be fairly confident that Python will be the top language in this space for a while and that R will never be the top choice for most applications.
Irrespective of whether Julia ends up the winner of the shift, or whether the shift happens at all, it is quite possible for trends to reverse very fast. See Perl, or Objective-C for that matter.
In 2006 (for Perl) and 2014 (for Objective-C) it was clear they had the momentum in their particular space. However, their limitations were well known, and as soon as a better language came along the momentum flipped in an equally dramatic manner. Python is much more widespread, so it will remain strong in some areas, but you could see the flip in ML/DS given the challenges of productionizing across broad capabilities (not just doing NNs).
As the joke goes, Python is the second-best language for everything, if you know only two languages. With ML expanding beyond narrow big-tech domains, there will be a need for specialized languages like Julia (and perhaps others like Mojo, etc.).
It (Objective-C) was also boosted by Apple in the first place, so there is nothing natural about these kinds of trends. If Google and FB hadn't picked up Python for ML it wouldn't have taken off as much, which is also to say that if they (or another large player) back another language, you could see a similar decline in Python usage.
I think so too. In the short term (1-2 years at least), Python will gently move into the last stage of its adoption curve (even doing nothing).
But now is a time when, in various high places, people will say: "OK, you've got my attention. What is this snake language you are talking about, and why should I bet the house on it?"
The corporate world will want to do $x because $x is in the news; they won't be making nuanced arguments about tradeoffs in a domain they don't understand, least of all arguments that go entirely against the trends in the industry.
> society and the economy like never before. This process carries enormous risks and opportunities, which are currently basically... ignored (well, the risk side)
Take the entire stack (including all dependencies, toolchains, etc.) and think about scenarios of accidental or malicious malfunction, but also reproducibility, auditability of outcomes, that sort of stuff. The overall ability to provide locked-down, performant, safe, secure deployments of high-quality, validated algorithms without breaking the bank. In other words, the risks (but also the frictions/costs) in the "productionising" of algorithms.
I do get where you are coming from. Indeed, it makes little sense to use Julia for lots of machine learning when PyTorch and Jax are just so good. And it sounds like you don't want to use Julia, so who am I to try and convince you? Python/R are capable languages.
But, there are still reasons I reach for Julia.
Interesting packages where I prefer Julia over Python/R: Turing.jl for Bayesian statistics; Agents.jl for agent-based modelling; DifferentialEquations.jl for ODE solving.
I would much rather data-munge tabular data in Julia (DataFrames.jl) than Python, though R is admittedly quite nice on this front.
Personally, I reach for Julia when I want to use one of the previous packages, or when there is something I want to code up from scratch, where base Julia is much preferable to me over numpy.
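For a flavour of the data-munging point, a minimal DataFrames.jl sketch (the column names and numbers are made up for illustration):
> using DataFrames, Statistics
> df = DataFrame(group = ["a", "a", "b"], value = [1.0, 2.0, 3.0])
> filter(:value => >(1), df)                         # keep rows where value > 1
> combine(groupby(df, :group), :value => mean)       # mean of value per group
> transform(df, :value => (v -> v .^ 2) => :value2)  # add a derived column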
Three reasons: Julia feels more like math, there's a huge long-term commitment to the language because it's used for climate modeling, and package management is completely painless.
I love Python, but I can also see eventually doing everything in Julia over the longer term. Mind you, it's entirely possible that AI continues to improve and in 5 years any package will be available in any language, and you'll look at code mainly for verification purposes, in whatever language you happen to prefer.
For me personally, I just think it's really fun to write Julia code. Granted, I'm in neither machine learning nor physical science, but the fact that I can go through the whole stack, choose an abstraction that's right for the problem at hand (metaprogramming? regular struct-based abstractions? an external program? LLVM optimization? inline assembly?), and still be able to understand what's going on while getting good performance at the same time is just magical to me. Maybe that's not for everyone, but to me the ratio of dev time to run time is just really, really good.
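To make the "whole stack" point concrete: from the REPL you can inspect the same one-liner at every level Julia goes through (a quick sketch; the function is arbitrary):
> using InteractiveUtils  # already loaded in the REPL
> f(x) = 2x + 1
> @code_lowered f(3)      # Julia's lowered IR
> @code_typed f(3)        # after type inference
> @code_llvm f(3)         # the LLVM IR
> @code_native f(3)       # the machine code that actually runs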
I think the main idea behind Julia is to minimize the burden of doing the necessary but wasteful software engineering parts of scientific computation.
Let the language optimize more so you don't have to write a C++ library or figure out how to use it optimally. Don't waste as much time setting up your environment or worrying about platform compatibility. Don't worry about using multiple languages for different types of computation. And make the on-ramp fairly painless by being a convenient glue language.
It fills the gap of otherwise not having a managed and JITed language for general mathematical computation. If it's more burdensome for you to switch, then don't switch.
IMO, the biggest reason for me is that the code looks a lot more similar to the math than in Python/R. This comes from a number of places (multiple dispatch, the ability to use Unicode symbols, the fact that you don't have to vectorize everything, etc.), but the end result is code that looks a lot like the math you are trying to do (for examples, see https://discourse.julialang.org/t/from-papers-to-julia-code-...)
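A toy illustration of what I mean (the function is arbitrary): Unicode names plus dot-broadcasting make the code read like the formula, with no vectorized API in sight.
> σ(x) = 1 / (1 + exp(-x))    # the logistic function, written as you'd write it on paper
> ∇σ(x) = σ(x) * (1 - σ(x))   # and its derivative
> xs = range(-5, 5; length = 11)
> ys = σ.(xs)                 # applied elementwise with a single dot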
If you work partially in the physical sciences, and TFA doesn't entice you to try Julia (someone with no GPU programming experience implementing serial, parallel-CPU, and parallel-GPU Navier-Stokes solvers, all without touching C, C++, or Fortran, in mostly similar code size/LOC, and achieving a 30x speedup), I can't imagine what would.
If you're writing code that is fundamentally based on mathematical principles and models, even if you aren't personally using mathematics every day, it's going to feel a lot better in Julia. That is: Julia looks a lot more like mathematics than Python does.
__
Longer version:
Obviously some people are mostly writing websites or GUIs or whatever in Python and won't see the beauty in this.
But if the problems you are working on have, at their base, a mathematical foundation (even if you don't actively practice the math), it's much more beautiful IMO. So, simulation, data analysis/science and machine learning, statistics, etc...
Once you get used to using it for that, though, you'll realize it's actually quite nice for a lot of other things as well, and the "mathematical mindset" it somewhat pushes results in cleaner solutions for other problems too. In general, the syntax and patterns are just nice.
Here are some quick things using randomness in Julia that would be a bit slower and more verbose in Python:
Generate a random number:
> rand()
Pick a random message:
> rand(["First message", "Hello", "Foo"])
Generate a random 3x3 matrix of booleans:
> rand(Bool, (3,3))
Define a function and run it elementwise on a random matrix of bools:
> myprint(x) = x > 0 ? "Happy" : "Sad"
> B = rand(Bool, (3,3))
> myprint.(B)
Returns:
> 3×3 Matrix{String}:
> "Happy" "Happy" "Sad"
> "Sad" "Happy" "Happy"
> "Sad" "Happy" "Happy"
And many, many more nice features... but the Julia design, which means functions like rand() just apply how you expect regardless of the input type, is quite nice. rand(list of strings) *should* give me a random string, and rand(range of numbers) *should* give me a random number in that range! No one writing an academic paper would define a new rand function for each input, because, well... it's clear what the user wants: rand of something.
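And to round out the range case, in base Julia it really is just:
> rand(1:100)      # a random integer drawn from the range
> rand(1:100, 5)   # five of them
> rand(Float64)    # a random float in [0, 1)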
If you have to solve mathematical programming/convex optimization problems, JuMP as a frontend for free or commercial solvers is hugely better than any alternative.
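For a sense of scale, a minimal JuMP sketch (using the open-source HiGHS solver; the little LP itself is just a placeholder):
> using JuMP, HiGHS
> model = Model(HiGHS.Optimizer)
> @variable(model, x >= 0)
> @variable(model, y >= 0)
> @constraint(model, x + 2y <= 4)
> @objective(model, Max, 3x + y)
> optimize!(model)     # hand the model to the solver
> value(x), value(y)   # inspect the solution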
Likewise, if you are solving differential equations, DifferentialEquations.jl is hugely better than any free alternative I know of and arguably better than paid packages. The broader SciML ecosystem that has built up around it has a lot of cool stuff in it too.
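The canonical first DifferentialEquations.jl example is about this short (a sketch, not production code):
> using DifferentialEquations
> f(u, p, t) = 1.01 * u                  # du/dt = 1.01 u
> prob = ODEProblem(f, 0.5, (0.0, 1.0))  # u(0) = 0.5 on t in [0, 1]
> sol = solve(prob, Tsit5())             # Tsit5 is a solid default explicit RK method
> sol(0.5)                               # the solution object is callable (dense output)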
Other than this, it seems like you wouldn't care about the other potential advantages, and you might be more put off than average by the disadvantages and occasional rough edges.
If you frequently want to develop and maintain publicly used functionality that requires writing some of it in a faster compiled language and then binding it to an interactive one like R or Python, that's another reason. Test coverage and multi-user maintenance are way easier when it's all just one language with a solid package manager.
Context: Coming from a statistics background, I learned a bit of R, then a bit of Python for data analysis/science, then found Julia as the language I invested my time in. Over time I keep up with R and Python enough to know what's different since I learned them, but don't use them daily.
What I always tell people is the following:
If you are writing code using existing libraries, then use whichever language has those libraries. The NN stack(s) in Python are great; the statistical ML stack(s) in R are simple and include SOTA techniques.
If you are writing a package yourself, then I assume you know the core of the idea well enough to be able to write your code from the "top down", i.e. you're not experimenting with how to solve the problem at hand; you're implementing something concretely defined.
In this case, and tailored to your use, I would argue that Julia has more advantages than disadvantages, especially compared to R or Python. Here are a few comments:
1. Environments, dependencies, and distribution can all be handled by Pkg.jl, the built-in package manager. There is no 3rd-party tool involved, and there is no disagreement in the community over which one is better. This is my biggest pain point with Python.
2. Julia's type system both exists and is more powerful than that of Python (types or classes) and R (even Hadley's new S7(?) system). By powerful I mean generics/parametric types and overloading/dispatch built in (see the sketch after this list). You can code without them, but certain problems are solved elegantly by them. Since I started working heavily with types in recent years, I find this to be my biggest pain point in R, and I wouldn't want to write a package in R, although I like to use it as an end user.
3. New developments in scientific programming, programming ergonomics, hardware-generic code (as in this post), and other cool features happen in Julia. New developments in statistics happen in R (and increasingly in Julia); new developments funded by big companies happen in Python.
4. The Python and R interpreters start up faster than Julia. The biggest problem here is when you are redefining types, which is the only thing in Julia that can't currently be "hot reloaded", i.e. you need to restart Julia to redefine types.
5. Working with tabular data is (currently) far more ergonomic and effortless in R than Python and Julia.
6. Plotting is not a solved problem in Julia. Plots.jl is pretty easy and pretty powerful; Makie.jl is powerful but very manual. Time to first plot is longer than in R or Python.
7. Julia has almost zero technical debt; R and Python have a lot. Backwards compatibility is guaranteed for Julia code written for v1.0 or later, and Pkg.jl handles package compatibility. If I send you code I wrote 4 years ago along with a Project.toml containing [compat] information, then you can run the code with zero effort. (This is the theory; in practice Julia programmers are typically scientists first and coders second, ymmv.)
8. You can choose how low-level you want your code to be. Prototyping can be done in Julia, rewriting to be faster can be done in Julia, and production code can be done in Julia. Translating Python to C++ for production might mean thinking about types for the first time in the dev process. In Julia, going to production just means making sure your code is type-stable.
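To make point 2 concrete, a toy sketch of parametric types plus dispatch (the Interval type is made up purely for illustration):
> struct Interval{T<:Real}   # generic over any Real element type
>     lo::T
>     hi::T
> end
> width(i::Interval) = i.hi - i.lo   # one method, works for every T
> Base.:+(a::Interval, b::Interval) = Interval(a.lo + b.lo, a.hi + b.hi)
> width(Interval(1, 3))      # Interval{Int}     -> 2
> width(Interval(0.5, 2.5))  # Interval{Float64} -> 2.0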
You can have a nice foreign function interface between R -> Julia and Julia -> R. If you're already happy pulling slow functions out into Rcpp, then maybe there's no speed benefit. But there are some very nice, very fast libraries in Julia, so if you have a tight inner loop, it could be worth looking into.
It reads and writes a lot like Python (but nicer, IMO), so I don't think the learning curve is immense if you try it for small optimizations. And it's also not unreadable, so other people can verify your code.
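For the Julia -> R direction, RCall.jl keeps the gluing fairly painless (a minimal sketch; the variable names are made up), and the JuliaCall R package plays a similar role going the other way:
> using RCall
> x = randn(1000)
> @rput x          # expose the Julia vector to the embedded R session as `x`
> R"m <- mean(x)"  # run ordinary R code against it
> @rget m          # pull the result back into Julia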
I mean, I never would have signed up to develop an R package on my own. But I was in the right place at the right time to work on an interesting, funded project, and it just happens to be in R. It's nice to learn a new tool (no matter what it is), but I would not have chosen R if it had been my choice.
Moreover, I think sometimes people get their PhD and think they deserve to use the tools they put in their toolbox, on the problems they focused on, and don't see that all they really did was get a ticket to the game. Most scientists have a PhD. Most scientists don't work on the thing their PhD was about ten years later. The sooner you open up to that, the sooner you will get out of the postdoc chase and get a job that is a lot more rewarding (both intellectually and financially). All this means that at some point there may be problems you are going to learn about and focus on that you never thought you would, and being open to that and seeing it as an opportunity will carry you further than not.
I think it's fine if you don't learn Julia. When I was in university, some of the coursework had to be done in MATLAB. I think Julia could definitely be used instead nowadays; simply being free is reason enough. You could argue that Python/numpy would be an option as well.