Hacker News

edit: Jack Poulson's Elemental and Andreas Waechter's IPOPT are also written in C++; both are very important libraries within scientific computing and optimization, to add a few more examples of the usefulness of C++.

I'm trying to make the point that C++ occupies a weird place in the abstraction-vs-performance tradeoff.

For high abstraction and friendliness, C++ doesn't do nearly as nice of a job as MATLAB or Python in exposing high-level scientific computing operations.

For performance, C++ does have some really nice high-performance libraries (Eigen comes to mind), but for the majority of use cases, you either need to write inline assembly to get the best possible speed, or you're already reusing a fast numerical kernel from another library, and you may be better off in a higher-level language.

This is a very subjective discussion, and there are many interesting, fast, C++ libraries that do important work for science. But your claim that C++ is the best language for such work would be highly contested by many high performance computing specialists. As it is, I believe there's a pretty rough split between C, C++, and Fortran supporters, with Python continuing to gain traction.

Here's another counter-argument. MPI is used in 99% of all scientific high performance computing codes. The latest version of MPI, MPI-3, drops explicit support for C++ bindings. If C++ were so dominant, why would explicit bindings be dropped?




>For high abstraction and friendliness, C++ doesn't do nearly as nice of a job as MATLAB or Python in exposing high-level scientific computing operations.

I think you responded to it yourself in the next paragraph.

While not quite the same as using MATLAB or NumPy, with Eigen, Blitz++, and Armadillo you can come really close syntactically. You do have to suffer through the compilation process, but it comes with computational benefits. NumPy and its ilk are adversely affected by the limitations of their vectorization paradigm, some more, some less: this style creates needless temporary copies and unnecessary, overly pessimistic loops, and these cost performance. Eigen/Blitz++/Armadillo-style libraries, which build on lazy evaluation with expression templates, do not suffer from this problem. A common Pythonic way to recover from these deficiencies of NumPy is to use Cython (or numexpr, although it has a very narrow scope). However, to see speedups in these tight numerical loops with Cython, one really has to do manual indexing and write at a low level; not so for Eigen/Armadillo/Blitz++.

This, although very limited in scope, is a concrete example that shows that you can write at a higher level in C++ without incurring performance hits.


I think we're approaching this from two different directions: you from the library user's, and I from the library writer's perspective.

Sure, if there's a good library available that handles all the bottleneck computation, then there's no need to write anything in C++ – just use Python or whatever. But if you actually have to implement a kernel, you can't do it in Python or MATLAB.

Regarding assembly: I still believe that assembly is unsuitable except for the simplest of algorithms. Sure, you can optimize the crap out of DGEMM or DAXPY, with SIMD, cache-optimization, and prefetching – but in the end, it's still a very simple algorithm, with a very simple data layout and predictable access patterns. But as soon as it gets a little more complex (e.g., graph algorithms, or a SAT solver), you can forget about assembly. Heck, I doubt you could even get a measurable performance benefit.

> But your claim that C++ is the best language for such work would be highly contested by many high performance computing specialists.

My personal, subjective, and completely unscientific opinion on this is that these people are domain experts who know how to design fast algorithms and data structures, but not necessarily how to make the best use of a complex language like C++. Or they're extending legacy code. In short: for non-technical reasons (analogy: Haskell and Scala are better languages than Java, and yet a lot more Java code is written).

> The latest version of MPI, MPI-3, drops explicit support for C++ bindings. If C++ was so dominant, why would explicit bindings be dropped?

Maybe because they didn't offer much over the C bindings, and even some C++ projects (e.g. Boost.MPI) just used the C bindings instead?



