This is going to be extremely useful! I just recently settled on OCaml as my language of choice for most projects, after working with dozens of others, from C to Haskell.
I was actually looking for a good numerical library, supporting complex numbers, matrix operations, and multi-dimensional arrays. I'm looking forward to using and contributing to this one!
Here are the factors that made OCaml my language of choice, after thousands of hours spent learning and using other languages.
Strong typing with type inference makes it quick and easy to write code (no verbose declarations) that is safe to execute, meaning that the compiler catches 99% of bugs that you usually make, which other languages let through.
Algebraic data types and also a more flexible (if a bit more dangerous) kind of typing that is still type-safe and compiler-checked, but dynamic, called polymorphic variants. These features make it a breeze to define a solid foundation of types for you to write your algorithms on.
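For anyone who hasn't seen these two features, here is a minimal sketch. The names `shape`, `area` and `colour_name` are made up for illustration and don't come from any library.

```ocaml
(* Algebraic data type: a closed set of cases known to the compiler. *)
type shape =
  | Circle of float          (* radius *)
  | Rect of float * float    (* width, height *)

(* Pattern match over every case; leaving one out gives a compiler warning. *)
let area = function
  | Circle r -> Float.pi *. r *. r
  | Rect (w, h) -> w *. h

(* Polymorphic variants: no type declaration needed, the type is inferred
   from the tags used in the match. *)
let colour_name = function
  | `Red -> "red"
  | `Green -> "green"
  | `Rgb (r, g, b) -> Printf.sprintf "rgb(%d,%d,%d)" r g b
```

The polymorphic variant version is the "more dynamic" one: any function can use a tag like `` `Red `` without a shared declaration, and the compiler still checks that the tags line up.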
Pure functional programming (no mutation) as the default way to write code, with a lightweight and readable syntax. This is a plus because it's the most bug-free way to write code.
Classical imperative programming (loops, I/O, variable mutation) when you need it or when you find it useful for a given task (simplest example: inserting debug prints halfway through a function) without resorting to mind-bending constructs that hide what's really going on behind a facade of purity (Monads and the horrid monad stacks of Haskell, I'm looking at you.)
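A small sketch of the debug-print case, assuming nothing beyond the standard library (`sum_of_squares` is a made-up example function):

```ocaml
let sum_of_squares xs =
  let squares = List.map (fun x -> x * x) xs in
  (* Debug print dropped into the middle of the function; it writes to
     stderr and does not change the function's type (int list -> int). *)
  Printf.eprintf "squares: %s\n%!"
    (String.concat ", " (List.map string_of_int squares));
  List.fold_left ( + ) 0 squares
```

The print is an ordinary side effect sequenced with `;`, so callers of `sum_of_squares` don't have to change.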
Compiles to machine code so no useless speed limit (as found in interpreted languages) or heavy startup time and memory footprint (as in JIT / Java)
Predictable complexity (CPU) and memory usage. This is true for most languages, except for Haskell.
Sane exception support. This is also true for most languages, except for Go.
IMHO the poor multicore parallelism issue is way overblown. Yes, most CPUs nowadays have many cores. But if you are writing server/backend code, being limited to a single-threaded model is usually not a problem, because your server will be handling tens or hundreds of simultaneous requests anyway, and handling one at a time on a single core is better for overall throughput. OTOH, if you are writing heavy numerical single-application code, you are better off handling the parallelism by hand, that is, choosing what gets parallelized and how, which library to use, and so on. We don't have hundred-core CPUs yet, so it's a bit early to worry about it. Besides, they are working on it.
PS: I recommend the book https://realworldocaml.org/ (and the Core standard library) for anybody starting with the language.
Not OP but also OCaml user here, I consider Haskell's laziness to be the wrong default and think it makes it harder to reason about programs, and I prefer the 'mostly pure' approach of ML languages to strict purity.
For me OCaml or the ML family hit the sweet spot of language design.
I will agree that the parallelism issue can be a problem.
Please tell me about the parallelism problem? I love that Haskell can auto-parallelize and is very safe. But I'm ready to kick it to the curb over stupid laziness-induced memory explosions. I'd be sad, though, to go back to having to worry about threads and safety.
I'd love to give OCaml a go. Tell me why I'll end up disgruntled.
There is no parallelism; the runtime doesn't offer any. The best you can do is fork(). But there has been an effort ongoing for a few years now, multicore-ocaml, which will be available mainstream when it's Ready (no ETA).
Slightly off topic, sorry, but how is Algorithmic Differentiation usually implemented? Do you just apply the chain rule to an AST, or is there something more complex you have to do?
Nevertheless, for simple computations this specific algebra indeed boils down to "applying the chain rule to the AST" (as well as the diff rules for all primitive functions). In forward mode, the only difference to symbolic differentiation is that common terms are reused (items in the AST, treating the AST as DAG), so the expression terms do not explode in size, and the computation is very efficient.
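To make the forward mode concrete, here is a minimal sketch using dual numbers. This is not Owl's API; the type `dual`, the operators `+:` and `*:`, and the function `sin'` are names invented for this example.

```ocaml
(* Dual number: a value together with its derivative w.r.t. the input. *)
type dual = { v : float; d : float }

let var x = { v = x; d = 1.0 }      (* the independent variable: dx/dx = 1 *)
let const c = { v = c; d = 0.0 }    (* constants have derivative 0 *)

(* Diff rules for the primitives. *)
let ( +: ) a b = { v = a.v +. b.v; d = a.d +. b.d }
let ( *: ) a b = { v = a.v *. b.v; d = a.d *. b.v +. a.v *. b.d }  (* product rule *)
let sin' a = { v = sin a.v; d = a.d *. cos a.v }                   (* chain rule *)

(* f(x) = x * sin x + 3, written with the overloaded primitives. *)
let f x = (x *: sin' x) +: const 3.0

(* Evaluate f on a dual number and read off the derivative part. *)
let derivative f x = (f (var x)).d
```

`derivative f 2.0` evaluates `sin 2 + 2 cos 2` numerically. Each intermediate value is computed once and carried along with its derivative, so no symbolic expression is ever built.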
In reverse mode (and all other, more complicated modes), the chain rule still plays a central role, but is applied in a different way. Here, saying the "chain rule is applied to the AST" might still be technically correct, but would be very misleading, as the whole control flow is different, not to mention memory usage.
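For contrast, a minimal tape-based sketch of the reverse mode. Again, this is an illustration under my own assumptions and not how any particular library does it; `node`, `tape`, `leaf`, `add`, `mul`, `backward` and `grad_example` are made-up names.

```ocaml
(* A node holds its value and an adjoint accumulated during the reverse sweep. *)
type node = { v : float; mutable adj : float }

(* The tape records, for each operation, how to push its adjoint back to its
   inputs. It is a stack, so the head is the most recent operation. *)
let tape : (unit -> unit) list ref = ref []

let leaf x = { v = x; adj = 0.0 }

let add a b =
  let r = { v = a.v +. b.v; adj = 0.0 } in
  tape := (fun () ->
    a.adj <- a.adj +. r.adj;
    b.adj <- b.adj +. r.adj) :: !tape;
  r

let mul a b =
  let r = { v = a.v *. b.v; adj = 0.0 } in
  tape := (fun () ->
    a.adj <- a.adj +. b.v *. r.adj;
    b.adj <- b.adj +. a.v *. r.adj) :: !tape;
  r

(* Reverse sweep: seed the output with 1 and replay the tape, last op first. *)
let backward out =
  out.adj <- 1.0;
  List.iter (fun step -> step ()) !tape;
  tape := []

(* Gradient of f(x, y) = x*y + x*x in a single reverse sweep. *)
let grad_example x y =
  let x = leaf x and y = leaf y in
  let out = add (mul x y) (mul x x) in
  backward out;
  (x.adj, y.adj)
```

The control flow is visibly different from the forward mode: the function is first run forwards while recording, then the derivatives flow backwards. The tape is also where the extra memory usage comes from.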
I would add that (forward) Algorithmic Differentiation is also very well-suited for differentiating algorithms (lol, you wouldn't say!) and not just pure mathematical formulas.
For example, suppose you have an algorithm with loops, conditionals, recursion, sub-functions calls, even using global variables and side effects: if you use the overloaded operators for all numerical paths in your algorithm, AD can differentiate it in a natural way, just by executing it on the dual number system (AFAIK)
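A tiny sketch of that idea, assuming a hand-rolled dual number type (`dual`, `*:` and `power` are made-up names): an ordinary imperative loop with a mutable accumulator, differentiated just by running it on dual numbers.

```ocaml
type dual = { v : float; d : float }

(* Product rule on dual numbers. *)
let ( *: ) a b = { v = a.v *. b.v; d = a.d *. b.v +. a.v *. b.d }

(* x^n computed the imperative way: a loop and a mutable accumulator. *)
let power x n =
  let acc = ref { v = 1.0; d = 0.0 } in
  for _i = 1 to n do
    acc := !acc *: x
  done;
  !acc

(* Seed d = 1.0 to differentiate with respect to x. *)
let r = power { v = 2.0; d = 1.0 } 5
```

Here `r.v` is 2^5 = 32 and `r.d` is 5 * 2^4 = 80, without the loop knowing anything about derivatives.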
This is true in principle, but only if the resulting function is still differentiable. Each loop or conditional can make your function non-differentiable, perhaps in subtle ways. Of course, this can also happen in "pure formulas" if you use abs(), sgn() and friends, and becomes even more nasty with floor(), ceil() and friends. But it occurs more commonly in code due to if(), for() and while().
Having said that, most of the time your resulting function is at least piecewise differentiable, which is annoying and needs to be taken care of, but is not a show stopper.
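A small sketch of the kink problem, assuming a hand-rolled dual number type (`dual`, `abs_dual` and `slope` are made-up names). The conditional decides which branch's derivative you get, so AD happily returns a number even at the point where abs() is not differentiable.

```ocaml
type dual = { v : float; d : float }

(* abs() on dual numbers: the derivative follows whichever branch is taken. *)
let abs_dual a =
  if a.v >= 0.0 then a else { v = -. a.v; d = -. a.d }

(* Derivative of |x| at x, as reported by forward-mode AD. *)
let slope x = (abs_dual { v = x; d = 1.0 }).d
```

`slope (-1.0)` is -1 and `slope 1.0` is 1, as expected. `slope 0.0` returns 1 only because of how the comparison was written; with `>` instead of `>=` it would return -1. That is the piecewise behaviour that needs to be taken care of.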
Note that piecewise linearization is still an active research topic, e.g.
Yes, "algorithmic differentiation" is the modern term for what was formerly called "automatic differentiation".
> Is this native support (as opposed to there being a library for it)?
I didn't look into the details, but it seems to be essentially the "operator overloading" approach, not the "source transformation" approach. However, given the optimizing compiler and OCaml's very good module type system, the result might be the same, at least for the forward mode.
when did that happen? i have never heard AD referred to as algorithmic differentiation. seems weird to me as automatic is already a good name. algorithmic, to me, conjures up ideas more related to numerical differentiation. there are algorithms in the implementations but the concept really is automatic.
This happened a long time ago, at least 9 years ago.
Source: The mentors of my math diploma thesis were Prof. Andreas Griewank and Prof. Andrea Walther, who both happen to be the authors of the kind-of standard book on AD: "Evaluating Derivatives" http://epubs.siam.org/doi/book/10.1137/1.9780898717761
The second edition of that book is from 2008 and uses (and prefers) the term "algorithmic differentiation".
> algorithmic, to me, conjures up ideas more related to numerical differentiation. there are algorithms in the implementations but the concept really is automatic.
I beg to differ.
I like the term "algorithmic differentiation" because it describes what is differentiated (algorithms rather than plain formulas). This is better than saying how it is differentiated, as symbolic differentiation is also "automatic" in the sense that any computer algebra system performs that task automatically.
The novelty of AD is not that it is automatic, but that it operates on code rather than formulas. The techniques of AD are still valuable even if you transform your code by hand (purely mechanically, using the rules of AD). This is especially true for the reverse mode, which is a great recipe for writing your gradient calculation code in an optimal way. This is easier and less error-prone than, say, "naive" differentiation followed by optimizing your code by hand.
I find this design clean and appealing. Compared to the current fashion of all these Python libraries, it's a breath of fresh air. But I wonder if it can work. Python has going for it: kitchen sink of tools, easy to learn, tooling that took 20 years to develop, you already know it.
Question: What's wrong with Matlab? Is it that it's closed and costs money? Other than that it seems clean, purpose-built, complete and not really expensive. Why all the open-source attempts to replace it?
I really thought Julia was going to do that, but it's been years and they never got the tooling. They never solved the ridiculous, self-inflicted problem of having to recompile the entire system every time you run it. Your code will execute in 300ms, but only after 10 minutes of compiling. (Or did they make progress? I haven't checked in in a while.)
Julia's improved a lot on the issue you're mentioning, so you might want to give it a try again. I don't notice it anymore, although I used to. I do think Julia's performance, while pretty impressive, isn't quite what it's been billed as. How it compares to other systems in that regard, I don't know.
As for Matlab, I think the big turnoff for me is its closed-source nature, which is maybe a more serious problem than you're giving it credit for. I also think that things like Python, that are more general-purpose, integrate much better than Matlab into other systems (e.g., web systems--you might not think it would matter if your estimation program doesn't integrate into a web app easily, but I didn't either and now I'm very sensitive to it). This is partially due to the open-source nature of things like Python, which lends itself to competition and experimentation, but also it being a more general-purpose language (this latter point is another sticking point with Julia, although wouldn't apply to OCaml).
This OCaml library is really beautiful, actually. I looked into OCaml several years ago for this sort of thing, and if it had been around then, I might have invested more in it at the time, and kept using it. I'll have to revisit it.
> I think the big turnoff for me is its closed-source nature
Well, the issue of open-vs-closed is less clear with things like MATLAB. Sure you can't download it for free-as-in-beer, but if you own a copy, you are free to inspect its source code, as MATLAB is mainly written in MATLAB (same for Mathematica and other similar packages).
A subset of non-performance-critical functions, and things like argument validation layers, are written in m code that you can read. Most of the actual work happens in obfuscated p code or calls into compiled libraries. Many of those libraries are available independently of Matlab, but you can't see how they're calling them, or important details like how the Matlab interpreter or JIT compiler work.
I looked at OCaml a few years back for scientific computing, but was turned off by the global lock GC for multithreading (similar to python). Anyone know if this has been changed yet?
> I think multicore capabilities will be here soon.
Just a word of caution to set expectations. The "will be here soon" status has been there for a while, to the extent that it now evokes memories of GNU Hurd and Duke Nukem Forever. It will be ready whenever it is ready; it's in good hands, but don't hold your breath, and try to work around it for now.
Multicore is picking up pace, and there will be a paper on the formalised memory model at this year's OCaml Workshop in Oxford in September.
There are a series of milestones to hit: the runtime GC, the memory model, the low-level programming model using one-shot continuations, and how it affects libraries running over it (e.g. algebraic effects). Each of these has associated papers and talks (see the ocamllabs.io news section), so it's not quite fair to compare it to Duke Nukem Forever :-)
> so it's not quite fair to compare it to Duke Nukem Forever :-)
Apologies if it came out harsh. Oodles of respect for all the work you guys are doing. It's a lot of work and that's why I said things are in good hands.
I second this. The multicore support is not far away. Right now we have a memory model for multicore OCaml and even a multicore OCaml ARM64 backend.
Also problematic for numeric code in OCaml is the lack of native interoperability with C data types and native arrays; instead you have to use time-consuming abstractions such as Bigarray or marshalling via the FFI. Unfortunately, solutions like Ctypes only seem to compound the problem: they don't address the underlying issue in the runtime, but find clever ways around the limitation.
Does it matter a lot for this sort of application (which I assume is interactive data analysis)? I think if you're worried about performance then you want a back-end to map your computation out to a big cluster, no? Like, write OCaml, target hadoop would be the logical thing if you need HPC. Of course someone will have to write that compiler (not it). :P
A lot of scientific computing is done in Python (NumPy) and even in Lua (using Torch, the great tensor library). Much of the computation is done on the GPU anyway.
We were just beginning to implement some linear algebra routines in OCaml for our library that parses text into structured data. This is very good news. We will contribute back if we have any enhancements.
Have you tried the recent Windows Subsystem for Linux (Bash on Ubuntu on Windows)? I have had a lot of success running various development tools in it. You can use pre-compiled Ubuntu binaries or compile your own stack, and it all runs at native speed, since there is no virtualization or emulation of any kind.
Or the Web Ontology Language (OWL) [1] whose inconsistent acronym is not in fact a tribute to the spelling-challenged Owl character in Winnie-the-Pooh [2]