How does the performance compare to, say, Eigen? Obviously you go N-d, while Eigen does 1-d and 2-d only. But would it make sense to benchmark it against other libraries like Eigen?
At the moment, we don't have SIMD acceleration, but we are working on it. See the `xsimd` companion project, which is the foundation of the SIMD acceleration work.