Julia: A Fast Language for Numerical Computing

samuell · on March 12, 2016

If Julia had Go-style concurrency primitives (channels and lightweight threads), and M:N thread multiplexing, it would be the perfect language to implement my upcoming data flow based scientific workflow language (http://github.com/samuell/scipipe).

Now I'm instead trapped with Go, which lacks a REPL, and leaves a lot to wish in terms of metaprogramming capabilities needed to create a nice programmatic API.

The lack of the mentioned features is my biggest concern with Julia.

StefanKarpinski · on March 12, 2016

Threading is already an experimental feature and Go-style concurrency will be a standard feature in the future. Well, technically it may be more like Cilk or TBB but it will be very similar to Go.

kkylin · on March 13, 2016

Cilk-style concurrency would be a really nice addition to Julia.

samuell · on March 13, 2016

Interesting.

elcritch · on March 12, 2016

I implemented a test of a micro-benchmark posted on HN several weeks back. It took some extra tries but eventually the performance using the existing 'Task' feature was acceptable. Not Go level but decent and give the rate of development on Julia it'd probably be worthwhile to experiment.

samuell · on March 15, 2016

That is interesting to know. But, so the Tasks (which is implemented with light-weight threads IIUC) will all run in the same os process/thread, unless you manually create new ones?

agentgt · on March 13, 2016

I don't have knowledge of the "Common Workflow Language" but I am not sure Go's concurrency is really that much of a selling point for composition. Particularly for composition of evaluation since I would imagine how Go does concurrency would bleed through into your implementation code (probably not what you want.. or maybe it is... or maybe it doesn't for you library?). That is I'm not sure Go is really good at composition compared to functional programming languages.

For example one could use a monadic style data structure such .NET's reactive Observable (which has an analog in many different languages) and allow you to compose a stream like definition independent of how it runs. You can then feed this definition (Observable) to an evaluator which could run it on Go using channels or it could run it on Go using a cluster of machines with ZeroMQ.

I think the selling point to your library is that it compiles to native code but my question is with out changing code can I switch to running it on a cluster instead of local (something I could do with a language with an abstract form of concurrency using monads or Observables or even just using the Actor model)? It looks like I can right?

samuell · on March 13, 2016

The idea for cluster support has so far been to implement connectors for resource managers like SLURM, and basically keeping scipipe as an orchestrator.

That is, for multi-node jobs. As long as you can stay within the 16-32 or so cores on a typical HPC node (in our cluster at least), scipipe should be great for that. I think some means of simple resource management (to not overstress these 16-32 cores) is needed, but that can be done in a simple way by e.g. using a central go-routine that lends out "the right to use a cpu core" on demand.

Thanks for interesting feedback. I will think about this!

pjmlp · on March 13, 2016

You don't need language specific primitives, when a language is feature rich enough, that those features can be done in a library.

samuell · on March 13, 2016

Yep, but this adds complexity and fear of maintenance problems, depending on how "close to the core" the library is.

samuell · on March 15, 2016

For example, I looked into implementing this with Python 3.5's coroutines and async/await syntax, but this seems to add an enourmous amount of complexity. For example you need specialized versions of many of the standard library methods, just to make them usable in the async setting.

In either case I couldn't get my head around how to implement this.

In Go, the implementation is conceptually extremely simple (although the code might not always be the most readable).

tanlermin · on March 15, 2016

Have you tried dask? It already has dataflow programming built in for numpy arrays, out of core and custom structures.

http://dask.pydata.org/en/latest/

samuell · on March 16, 2016

Interesting. Do you have a link to description of their "dataflow" implementation or API?

From what I can see in the docs I'm getting afraid Dask does the same mistake as so many other recent tools: Allow only task dependencies, while what is needed in general scientific workflows is data dependencies (connect outports and inports of process). I have explained this difference in a blog post earlier: http://bionics.it/posts/workflows-dataflow-not-task-deps

(UPDATE: in all fairness, they seem to be doing something in-between, a little like Snakemake, in that they allow to specify data inputs based on a naming scheme. What we want is a totally naming scheme independent declarative way of explicitly connecting one output to one input, as that is the most generic and pluggable way you could do it.)

If they allow true data dependencies though, that would be very interesting.

tanlermin · on March 16, 2016

Very interesting I didn't realize the difference.

How about this? https://github.com/shashi/ComputeFramework.jl

But I suspect it has the same problem.

Edit: There is also this https://github.com/JuliaDB/DataStreams.jl

tanlermin · on March 17, 2016

Okay, check this out: DataFlow programming for Julia https://github.com/MikeInnes/Flow.jl

wyager · on March 12, 2016

Sounds like Haskell may be appropriate for your use case. It has green threads, a great REPL, and extremely powerful polymorphism (which provides a safe and clean alternative to e.g. macro/template-based metaprogramming, although Haskell has that too if you need it).

samuell · on March 13, 2016

Interesting, didn't know about green threads in haskell (I'm not too familiar with it over all).

SixSigma · on March 13, 2016

You may be interested to read this section on Julia's parallel computing facilities.

http://docs.julialang.org/en/release-0.4/manual/parallel-com...

Which to me as Limbo programmer, look like channels, even across machines.

freyr · on March 13, 2016

Go is really great for "workflow," but I find it completely lacking for "scientific." Have you attempted this latter part of the project yet? I don't see many examples in your project that attempt scientific or numerical operations.

samuell · on March 13, 2016

The thinking is to use external tools for the main scientific parts - hence the big focus on shell support.

This is common practice in bioinformatics already, because of the large plethora of tools which would be too much to rewrite for any particular language.

Then, Go will probably be OK for more mundane tasks such as data pre-processing, filtering, etc etc.

Otherwise, there are in fact some scientific Go libraries already, including BioGo [1] and GoChem [2].

[1] https://github.com/biogo/biogo

[2] http://gochem.org

erikpukinskis · on March 13, 2016

What kind of API do you need that requires metaprogramming?

samuell · on March 13, 2016

For example I'd be happy if I could generate structs dynamically, based on string input.

This would mean that we could automatically create true struct-based components with channel fields from the shell like syntax used in the examples in the README (like "echo > {o:foo}" to write to an out-port named "foo"), so that connecting an out-port to an in-port would go like:

Process2.InFoo = Process1.OutFoo

This is not possible with Go's reflection though, so right now, based on these shell like patterns, we can only populate maps (InPorts and OutPorts) of the process, such that the above code example becomes:

Process2.InPorts["foo"] = Process1.OutPorts["foo"]

erikpukinskis · on March 14, 2016

> For example I'd be happy if I could generate structs dynamically, based on string input.

I don't understand why you'd want to do that. That sounds like an architecture you are excited about, not a problem you are trying to solve. Can you give me some context?

samuell · on March 15, 2016

The reason is a practical one: Struct fields will show up in auto-completion. This is in our experience surprisingly important when doing iterative workflow development, to lower the amount of silly typo errors, which can waste a lot of cluster compute hours etc.

sbinet · on March 14, 2016

generating structs at runtime might land for go-1.7: https://go-review.googlesource.com/#/c/9251/4

samuell · on March 15, 2016

That would be awesome.

_vya7 · on March 12, 2016

See also, Julia's home page: https://julialang.org/

I toyed around with it for a while, and really liked it. I wondered to myself "what makes this language specific to numerical computing? it seems generic enough to be used wherever I'd want to use Ruby or Haskell..."

But then I tried using it. And even though the language is generic enough, the standard library just plain assumes you're only doing numerical computing, and makes everything else needlessly hard by trivial omissions.

That was like 2013 though, so maybe things are different now?

noam87 · on March 12, 2016

Interesting, I've had the opposite reaction. I find it to be about the nicest programming language to work with. I even started writing a little toy database in Julia, and find it really easy to reason about.

Package management system is its only weak point IMO. It makes a lot of counter-intuitive decisions (having to build a package from the packages directory? that's just strange). Cargo / Bundler to me feel like the gold standard in package management... most alternatives I've seen would be better off just copying those workflows unless there's a really good reason to try something new.

jernfrost · on March 13, 2016

I haven't used Julia for any big projects but I routinely use it for small projects or utilities. I have used it for: 1) Obfuscating source code 2) Reading and presenting iOS provisioning profile data 3) Simple social science related simulations ported from Ruby or Python. 4) Code completion plugin for TextMate etc

So no huge projects, but I found Julia to be a very nice and versatile language to work in. I find I can write code which is clearer than its corresponding Python and Ruby version and which has less bugs because type info is a bit more explicit, but without getting too much in the way as is typical with statically typed languages.

I tried Haskell for some of the same projects, but I got to say Haskell takes way longer time to get proficient in. For the stuff I did I also found the way the Julia type system worked, made code reuse much easier than Haskell.

E.g. with the Julia Union type it was possible to define individual function across multiple specific types. E.g. I could make a particular implementation of f apply to type A, B and C, while another particular implementation of f applies to D and E. Function g could have a specific implementation of A and E. While another implementation of g is used for B and C.

KenoFischer · on March 12, 2016

While numerical computing is the first target, that's definitely not the sole goal of the language, so if there's something simple that you feel is missing, I would be very curious to know what that is.

gravypod · on March 12, 2016

I was really REALLY interested in the language due to the clustering ability and their JIT they build into it. But it does seem like the language was build without any systems-usage in mind. How sad. Nothing will ever quench my thirst for clustering.

vegabook · on March 12, 2016

so... it was built for clustering but it doesn't quench your thirst for clustering? Please could you clarify.

rspeer · on March 12, 2016

The meaning was clear to me. It doesn't quench his thirst because he has chosen not to use it.

gravypod · on March 12, 2016

Yes. It is that I cannot use it for things that I'd like.

skybrian · on March 12, 2016

It seems to be offline.

KenoFischer · on March 12, 2016

HTTP site is up. It's hosted on Github Pages so HTTPS may not work.

kafkaesq · on March 12, 2016

BTW is it just me, or do other people find that the interpreter is very slow to start under older versions of OS X?

That's one of my current obstacles to diving further into Julia.

KenoFischer · on March 12, 2016

It used to be a big problem. It's still not super fast (~1s or so), but we've been doing quite a bit of working at making it faster. If it's significantly slower than that, something else must be going on.

kafkaesq · on March 12, 2016

Thanks -- I'll definitely be giving it another shot, at some point.

SixSigma · on March 13, 2016

I recently started using Julia as my main platform as a replacement for Octave in my engineering classes.

I have found it an excellent language, as easy as Python to pick up and some great numerical primitives.

Using Juliabox in class means no installation woes and means whatever I do during lessons is available wherever I can get online. The ability to mix in markdown and code on the same page is brilliant for keeping notes. I realise that is not a feature of Julia but still, it is a benefit.

I've not done any heavy work in it but I'm really looking forward to doing so. I can dump Python, Octave and Limbo all at the same time.

irremediable · on March 12, 2016

What's the best way to run a debugger through Julia code? I value an interactive debugger much more for this kind of nitty gritty numerical programming than I do for lots of other tasks.

ced · on March 13, 2016

Work on the debugger is underway, it should be in the 0.5 release in a few months.

bitL · on March 12, 2016

Just a question - I completely lost interest in Julia when I read it adopts many of the ugly language hacks of Matlab - why should I care about Julia when I can feel frustrated with Matlab already?

CyberDildonics · on March 12, 2016

It was made to be specifically better than matlab and was made in part by people frustrated with matlab. Have you taken a look at the language yourself?

kafkaesq · on March 12, 2016

Such as? I'd be curious.

rspeer · on March 12, 2016

The literal syntax that you'd think would make an array of arrays concatenates the arrays instead. Sometimes. It depends on types. (Or have they fixed that now?)

Also, although this is of course a preference, 1-based indexing. I would rather have compatibility with C and Python and I don't care one bit about compatibility with Matlab.

These decisions make it seem that Julia just aspires to be "open source Matlab" when that already exists. It could instead aspire to be the good programming language that Matlab isn't.

KenoFischer · on March 12, 2016

> The literal syntax that you'd think would make an array of arrays concatenates the arrays instead. Sometimes. It depends on types. (Or have they fixed that now?)

Yup, we fixed that (deprecated in the latest release, will be switched over). It was super annoying.

There's been lots said on the 1-based indexing discussion, so I won't get into it here, but I can guarantee you that "open source Matlab" is not the ambition. Our ambition is to be the best possible programming language for technical computing and a great language for programming overall.

kafkaesq · on March 14, 2016

There's been lots said on the 1-based indexing discussion, so I won't get into it here.

It's not a deal-breaker, I suppose. Still, the rationale I've heard -- "We have chosen 1 to be more similar to existing math software"[1] (as opposed to basically every modern programming language out there) just seems... weird. Perhaps "compatibility with traditional sequence notation in math and physics" (upon which presumably most software packages are based) would be a better statement of the case for 1-based indexing.

[1] https://github.com/JuliaLang/julia/issues/558

CyberDildonics · on March 12, 2016

Julia is actually a very well designed programming language. 1 based indexing is really very insignificant when it comes to all the difficult aspects of writing good software that it becomes a non-issue.

chrispeel · on March 13, 2016

> Also, although this is of course a preference, 1-based indexing.

Advanced Julians actually go for 2-based indexing [1].

See [2] for more on this...

[1] https://github.com/simonster/TwoBasedIndexing.jl [2] https://groups.google.com/forum/?hl=en#!topic/julia-dev/tNN7...

goerz · on March 13, 2016

I think of it more of a compatibility with Fortran, which is a definite plus to me.

knlje · on March 12, 2016

Have they already fixed the need to "devectorize" the code to make it fast? This was the major drawback for me since I wish the code to be concise AND fast.

chrispeel · on March 13, 2016

I agree, this would be good. Intel's ParallelAccelerator.jl [1] is making some progress in this area.

[1] http://julialang.org/blog/2016/03/parallelaccelerator

elcritch · on March 12, 2016

Found this to be pretty handy at times: [Devec](https://github.com/lindahua/Devectorize.jl).

KKKKkkkk1 · on March 12, 2016

Does Prof. Edelman have a financial stake in Julia? He is such an enthusiastic booster of the language, that either it must be truly amazing, or there is an undisclosed motive.

KenoFischer · on March 13, 2016

Alan is one of the people who started the language (which is mentioned in the article). He is also one of the cofounders of Julia Computing (the company we formed to provide support and other services for the company). So from my biased viewpoint, the language is both truly amazing and Alan has a motive to promote the language, if only since he has a significant chunk of time and effort into making it a reality. I don't think it's fair to call it undisclosed however.

KKKKkkkk1 · on March 13, 2016

I think it's great that Prof. Edelman is dedicating his time to Julia and I hope he will be rewarded for it. However, I think that a statement of the sort you provided above should have appeared below the SIAM News article. This is something that would have been done in any major newspaper or scientific journal.

KenoFischer · on March 13, 2016

Yes, I don't disagree. I don't know why the SIAM news editors didn't. I do hope my statement here makes clarifies the facts at least for people coming from here.

vonnik · on March 12, 2016

This is super. Fwiw, Java and Scala aren't bad for numerical computing either: http://nd4j.org/ https://github.com/deeplearning4j/nd4j/

ubasu · on March 12, 2016

You seem to be conflating "numerical computing" with machine learning. However, numerical computing typically involves solving PDEs via e.g. finite elements or finite differences, or solving large systems of linear equations associated with such methods.

The difference between the two is that when solving PDEs, accuracy is paramount, so even using single precision is a bit of a compromise, whereas in machine learning, the trend seems to be to use half-precision or lower, sacrificing accuracy for speed.

For classical numerical computing, e.g. solving PDEs or linear equations, Java may not be the best choice, e.g. see the following paper:

How Java's Floating Point Hurts Everyone Everywhere:

https://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf

agibsonccc · on March 12, 2016

FWIW, we use blas just like matlab,numpy,r, and julia.

The speed depends on the blas implementation.

Nd4j has a concept of "backends". Nd4j backends allow us to sub in different blas implementations as well as different ways of doing operations.

We have a data buffer type that allows people to specify floating point or double. Those data buffers then have an allocation type that can be javacpp pointers,nio byte buffers (direct/offheap) or normal arrays

We are also currently working on surprassing the jvm's memory limits ourselves via javacpp's pointers. That allows us to have 64 bit addressing which people normally have access to in c++.

Every current jvm matrix lib that uses net lib java or jblas is going to have problems with jvm communications as well. The reason for this is passing around java arrays and byte buffers is slower than how we handle it which is via passing longs (raw pointer addresses) around that are addressed via unsafe or allocated in jni where we retain the pointer address directly. We expose that to our native operators.

We are solving this by writing our own c++ backend called libnd4j that supports cuda as well as normal openmp optimized for loops for computation.

We also offer a unified interface to cublas and cblas (which is implemented by openblas as well as mkl) FWIW, I more or less agree with you, but it doesn't mean it shouldn't exist.

JVM based environments can bypass the jvm just like python does now.

The fundamental problem with the jvm is no one just took what works on other platforms and mapped the concepts 1 to 1.

Our idea with nd4j is to not only allow people to write their own backends, but also provide a sane default platform for numerical computing on the jvm. Things like garbage collection shouldn't be a hindrance for what is otherwise a great platform for bigger workloads (hadoop,spark,kafka,..)

In summary, we know the jvm has been bad till now - it's our hope for fixing that.

tjl · on March 13, 2016

Looking at it, I don't see anything dealing with sparse matrices or factorization (e.g., LU, QR, SVD). All the Java libraries for SVD are pretty bad. Plus, none of your examples mention double precision. Does the library support it?

I find it interesting in the Numpy comparison no mention of the BLAS Numpy is linked to is mentioned, but it is for Nd4j. Numpy is highly dependent on a good BLAS and the basic Netlib one isn't that great.

agibsonccc · on March 13, 2016

Those are implemented by lapack as part of an nd4j backend.

Yes we have double precision - we have a default data type with the data buffer.

If you're curious how we do storage: https://github.com/deeplearning4j/nd4j/blob/master/nd4j-buff...

We have allocation types and data types.

Data types are double/float/int (int is mainly for storage)

Allocation types are the storage medium which can be arrays,byte buffers or what have you.

If you have a problem with the docs - I highly suggest filing an issue on our site: https://github.com/deeplearning4j/nd4j/issues

We actually appreciate eedback like this thank you.

For net lib java, it links against any blas implementation you give it. It has this idea of a JNILoader which can dynamically link against the fallback blas (which you mentioned)

or typically openblas or mkl. The problem there can actually be licensing though. The spark project runs in to this: https://issues.apache.org/jira/browse/SPARK-4816

If we don't mention on the site, it's probably because we haven't thought about it or haven't gotten enough feedback on something.

Unfortunately, we're still in heavy development mode.

FWIW, we have one of the most active gitter channels out there. You can come find me anytime if you're interested in getting involved.

tavert · on March 13, 2016

Lapack doesn't implement any sparse linear algebra. If you think the landscape of "Java matrix libraries" is fragmented, when really they're all just different takes on wrapping Blas and Lapack or writing equivalent functionality in pure Java, wait until you look into sparse linear algebra libraries. There's no standard API, there are 3ish common and a dozen less common different storage formats, only one or two of these libraries have any public version control or issue tracker whatsoever, licenses are all over the map. The whole field is a software engineering disaster, and yet it's functionality you just can't get anywhere else.

agibsonccc · on March 13, 2016

I'm aware of the different storage formats. However there are quite a few sparse blas and lapack implementations now.

I'm aware the software engineering logistics that go into doing sparse right which is why I held off

We are mainly targeting deep learning with this but sparse is becoming important enough for us to add it.

As for disparate standards I managed to work past that for cublas/blas.

I'm not going to let it stop me from doing it right. If you want to help us fix it we are hiring ;).

tavert · on March 13, 2016

> However there are quite a few sparse blas and lapack implementations now.

There's the NIST sparse blas, and MKL has a similar but not exactly compatible version. These never really took off in adoption (MKL's widely used of course, but I'd wager these particular functions are not). What sparse lapack are you talking about?

> If you want to help us fix it we are hiring ;).

We were at the same dinner a couple weeks ago actually. I'm enjoying where I am using Julia and LLVM, not sure if you could pay me enough to make me want to work on the JVM.

namelezz · on March 12, 2016

How is nd4j compared to Breeze and jblas, which are used in Spark MLlib[1]?

[1] - http://spark.apache.org/docs/latest/mllib-data-types.html

agibsonccc · on March 12, 2016

See my post here: https://news.ycombinator.com/edit?id=11275071

We have this concept of a backend where we had net lib java and jblas, but both are horribly slow/limited and need some updating which is why we added our approach to this.

If a new matrix framework comes out, I will just write a backend for it and allow people to keep the same dsl.

Breeze also isn't usable from java: http://stackoverflow.com/questions/27246348/using-breeze-fro...

FWIW, we also support both row and column major (you can specify the data as well as ordering) very similar to numpy.

If you come use nd4j, it will mainly be for a dsl that encourages vectorization just like any other numerical language.

vonnik · on March 12, 2016

ND4J supports n-dimensional arrays, while Breeze does not. jblas is great but it's not fast enough. We've relied on netlibblas in the past, and we're moving to faster computation libs now. We'd love for Spark MLlib to plug into ND4J at some point.

noobermin · on March 12, 2016

Java certainly has obtained a bad reputation in the sciences, however.

agibsonccc · on March 12, 2016

I agree. A lot of that is because no one has taken the lessons hard learned from every other numerical language and just replicated it (hardware acceleration,BLAS,ndarrays,..)

The JVM has a lot of matrix library fragmentation.

vonnik · on March 12, 2016

Just curious: Why is that?

tavert · on March 13, 2016

Without easy access to SIMD or value types (and JNI does not count as easy), the JVM's performance for regular computations on large datasets is still pretty lacking. Quoting http://cr.openjdk.java.net/~jrose/values/values-0.html, "Numeric types like complex numbers, extended-precision or unsigned integers, and decimal types are widely useful but can only be approximated (to the detriment of type safety and/or performance) by primitives or object classes."

And the JVM's costs in terms of deployment and startup times aren't very attractive for scientific workloads. You wouldn't gain as much in a migration from Python or Matlab to Java (or any other JVM language) as you would to C++ (or increasingly Julia), since you'll likely still need access to the same native libraries and JNI is kind of a pain.

elcritch · on March 12, 2016

Having done both Java and scientific programming I'd mostly say that the verbosity of Java really distracts away from the task of writing scientific code. An academic really doesn't usually care to learn about singletons and classes to write really fast for-loops. Also while the speed of Java now is awesome, floating point performance is key for those scientists who do know more systems level programming (memory management is a big issues). Actually GC in Julia can be a pain also.

noobermin · on March 13, 2016

I have a passing knowledge of Julia (I've written a simple Lorentz Force[0] stepper in it) is there a way to avoid GC in the middle of the calculation, ie., a performance critical loop or other?

[0] https://github.com/noobermin/jelo

ffriend · on March 13, 2016

You can turn garbage collection on and off using `gc_enable` function.

agibsonccc · on March 12, 2016

Which is why a c++ underbelly is so important. The computations shouldn't happen at the java level if a scientific lib is being done right. I'd say scala isn't a bad alternative to that (which we have nd4s which is a wrapper over nd4j)

tanlermin · on March 13, 2016

Ugh.

But in julia, its written in julia. There are even julia native tensor and matrix ops.