Speed in benchmarks [0] looks impressive, but they seem to have been run on an old version of Arraymancer (0.2.90). Were there any speed improvements/regressions in the newer version(s)?
Is there any "production code" written in Arraymancer?
The code didn't change, so there shouldn't be any difference on Nim stable. Unfortunately, I caught a regression in the Nim #devel branch itself which has a big impact if tensors are created in a tight loop.
All in all, I take performance very seriously and regularly check the generated assembly and the memory overhead of the library. It should be at least as fast as any C or C++ library: even where those are backed by an optimizing compiler (TensorFlow, Tensor Comprehensions/Halide, or MXNet/TVM), my intermediate language is C and my optimizing compiler is GCC/LLVM. Critical parts should even reach Fortran speed thanks to heavy use of the __restrict__ and assume_aligned compiler builtins.
I also take great care with how each algorithm is implemented, for both numerical stability and speed: for example, a numerically stable 1-pass parallel softmax_cross_entropy through a Frobenius inner product, which I didn't see in any library I checked (Caffe, TensorFlow, Torch).
Also, while deep learning is a focus, I've added general linear algebra and ML features like a least-squares solver and PCA.
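For a feel of what a least-squares solver does, here is a minimal C sketch for the simplest case, fitting a line y = a + b*x via the closed-form normal equations (my own illustration; a general solver, such as the LAPACK gelsd driver that numeric libraries typically wrap, also handles multiple columns and rank-deficient systems):

```c
/* Ordinary least squares for y = a + b*x over n points, using the
 * closed-form normal-equation solution for slope and intercept. */
static void lstsq_line(const double *x, const double *y, int n,
                       double *a, double *b) {
  double sx = 0, sy = 0, sxx = 0, sxy = 0;
  for (int i = 0; i < n; ++i) {
    sx += x[i];
    sy += y[i];
    sxx += x[i] * x[i];
    sxy += x[i] * y[i];
  }
  double denom = n * sxx - sx * sx; /* assumes x values are not all equal */
  *b = (n * sxy - sx * sy) / denom; /* slope */
  *a = (sy - *b * sx) / n;          /* intercept */
}
```

For exact data such as x = {0, 1, 2, 3}, y = {1, 3, 5, 7}, this recovers intercept 1 and slope 2.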