NGraph: A New Open Source Compiler for Deep Learning Systems (intel.com)
125 points by jennifermyers on March 20, 2018 | 23 comments



Why is computing so specialized towards Deep Learning these days, when it absolutely doesn't need to be? Optimized "kernels" are useful for other kinds of applications as well, such as scientific computing (linear algebra), and image and signal processing.

I can see it coming already: a fast matrix multiplication library, running on a deep learning solution, running on a graphics card. It's kind of ridiculous, isn't it?


PyTorch, for example, is often called a deep learning framework. But at its core it is a hardware-accelerated tensor and linear-algebra library with support for automatic differentiation. So it is well suited to implementing numeric algorithms, classic machine learning algorithms, and differentiable programs in general. DL is really a subset of differentiable programming, but it happens to be a very interesting one.
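The core idea, automatic differentiation, needs no framework at all. Here is a toy forward-mode sketch using dual numbers in plain Python; it illustrates the concept, not PyTorch's reverse-mode implementation:

```python
# Toy forward-mode automatic differentiation with dual numbers.
# Each Dual carries a value and its derivative; arithmetic on Duals
# propagates derivatives by the chain rule.
class Dual:
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate f and df/dx at x in a single pass."""
    return f(Dual(x, 1.0)).der

# d/dx (3x^2 + 2x) at x = 4 is 6*4 + 2 = 26
print(derivative(lambda x: 3 * x * x + 2 * x, 4.0))  # 26.0
```

Frameworks like PyTorch use reverse mode (backpropagation) instead, which is far more efficient for the many-inputs-to-one-scalar-loss shape of deep learning, but the "differentiable program" idea is the same.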


It is already here, actually.

I do a lot of numerics related to simulating quantum computing hardware (which is just linear algebra of fairly large matrices). I have used Theano to write some of the simulations, and there are plenty of projects on github that use TensorFlow as a backend for quantum-mechanical simulations.
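As a framework-free sketch of the kind of linear algebra such simulations involve (here plain NumPy, applying a Hadamard gate to a single qubit; the real workloads just use much larger matrices):

```python
import numpy as np

# A one-qubit state is a length-2 complex vector; gates are 2x2 unitaries.
H = np.array([[1, 1],
              [1, -1]], dtype=complex) / np.sqrt(2)  # Hadamard gate

psi = np.array([1, 0], dtype=complex)   # the |0> state
psi = H @ psi                           # applying a gate is a matmul
probs = np.abs(psi) ** 2                # measurement probabilities

print(probs)  # [0.5 0.5]
```

Swap NumPy for Theano or TensorFlow and the same matmuls run on a GPU, which is the whole appeal.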

Either way, I do not care much whether they call it a "deep learning" framework. It is still directly usable in all other types of linear algebra applications.


That's nice, until someone decides to prune certain functionality from future versions of the library because they are not needed for Deep Learning. Or until someone decides: hey, we can replace this floating point computation by a fixed point computation, because it doesn't matter for any kind of Deep Learning anyway.


The maintainers of major DL frameworks like TF and PyTorch intentionally target an ever-wider set of use-cases beyond deep learning, and would not chop out non-DL functionality.

There are many cool non-DL projects using these libraries for fast computation, such as Pyro (PyTorch) and Greta (TensorFlow), and the community is starting to shift away from DL-specific vocabulary towards terms like "differentiable programming". Deep learning is just one use of this broader paradigm.


That's really not how it works at all.

Frameworks are layered, and the accelerated tensor libraries are useful independently of the neural network machinery built on top of them.

It's basically a numpy/scikit-learn-like relationship, except both layers come from the same project.


(1) Current Deep Learning tools are still pretty bad; Tensorflow was a huge advance but people hate it.

(2) Differentiable computation has many other applications.


Computations are computations; you certainly don't need to use these compilers for deep learning, but you might want to contribute some new graph-level optimizations if they are appropriate to what you are doing and haven't already shown up in deep learning.
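A classic example of a graph-level optimization in this sense is constant folding. A toy sketch over a hypothetical expression graph (not nGraph's actual pass or IR):

```python
# Toy expression graph: a node is a ("const", v), ("var", name), or an
# op tuple over child nodes. Constant folding collapses any subtree
# whose inputs are all constants, so it never runs at execution time.
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def fold(node):
    if node[0] in ("const", "var"):
        return node
    op, *args = node
    args = [fold(a) for a in args]
    if all(a[0] == "const" for a in args):
        return ("const", OPS[op](*(a[1] for a in args)))
    return (op, *args)

# (x * (2 + 3)) folds to (x * 5) without ever evaluating x
graph = ("mul", ("var", "x"), ("add", ("const", 2), ("const", 3)))
print(fold(graph))  # ('mul', ('var', 'x'), ('const', 5))
```

Nothing about this kind of pass is specific to deep learning; the same rewrite helps any retained-graph workload.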


Plenty of people are already doing this.

TensorFlow can be used to run simulations and other scientific data processing. That's because TensorFlow is just a retained computation graph that can be flexibly configured.

We have been doing this since before deep learning was the tech du jour and will be doing it long after.
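The "retained computation graph" idea, build the graph once and run it later with different inputs, fits in a few lines. A toy deferred-execution model, not TensorFlow's actual API:

```python
# Toy retained graph: arithmetic on Nodes records ops instead of
# executing them; run() evaluates the stored graph later, possibly
# many times with different placeholder values.
class Node:
    def __init__(self, fn, inputs=()):
        self.fn, self.inputs = fn, inputs

    def __add__(self, other):
        return Node(lambda a, b: a + b, (self, other))

    def __mul__(self, other):
        return Node(lambda a, b: a * b, (self, other))

def placeholder(name):
    return Node(name)          # leaf node: fn holds the feed-dict key

def run(node, feed):
    if not node.inputs:        # leaf: look up the fed value
        return feed[node.fn]
    return node.fn(*(run(i, feed) for i in node.inputs))

# Build the graph once...
x, y = placeholder("x"), placeholder("y")
z = x * y + x

# ...then execute it repeatedly, e.g. once per simulation time step.
print(run(z, {"x": 2, "y": 3}))   # 8
print(run(z, {"x": 5, "y": 10}))  # 55
```

Having the whole graph up front is exactly what lets a compiler like nGraph apply cross-op optimizations before anything runs.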


Note: no support for ATI graphics cards.


Not yet, but we would welcome contributions as an open source project!


It doesn't even support Intel's own GPUs (although I don't know how useful that would be).


Not yet, but our goal is to enable that in the future.


CUDA only on the GPU? That doesn't sound good.


For now we are targeting CUDA and CUDNN primarily on the GPU backend but we’d love contributions for other GPU transformers! We are highly interested in enabling optimized support for the hardware used in the AI space and are optimistic about support for present and future hardware targets. Please feel free to let us know in our GitHub issues if there is specific GPU or other hardware support you hope to see.


Any plans on message passing / clustering with this, so that multiple hardware nodes could be connected to the same computational graph over the network?

Assuming also that this has support for AVX-512 acceleration?


We have an allreduce op in nGraph to support data-parallel training using OpenMPI, and are investigating the best approaches to integrate with frameworks. We also plan to add more collective communication ops to support model parallelism in the future.
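For readers unfamiliar with the op: allreduce combines a tensor across workers (typically summing gradients) and leaves every worker holding the identical result. A sequential sketch of the semantics only, not nGraph's MPI-backed implementation:

```python
# Semantics of an allreduce over N workers: every worker contributes
# its local gradient vector, and every worker receives the same
# elementwise sum back.
def allreduce(worker_grads):
    summed = [sum(vals) for vals in zip(*worker_grads)]
    return [list(summed) for _ in worker_grads]  # one copy per worker

# Three workers, each with local gradients for the same two parameters.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
result = allreduce(grads)
print(result[0])  # [9.0, 12.0] -- identical on every worker
```

In real data-parallel training each worker then divides by the worker count (or scales the learning rate) so the update matches single-node training on the combined batch.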


Yes, for AVX-512 acceleration.


I see Baidu's PaddlePaddle in the logos but no mention of it in the article.


Good catch! The PaddlePaddle logo is faded which means we don't yet have support for it but would love to add it in the near future.


Interesting, that's good to have more options.

Performance charts in the article are for training of CNNs on CPU. Are there non-educational use cases for that? How does CPU CNN inference speed compare?


Fast training already forces us to optimize the inference code paths. Beyond that, we are currently adding optimizations that target inference-only use cases (some PRs are already up). We will post data as soon as that is done. Stay tuned!


Also, does this have support for generating monolithic (preferably statically compiled) binaries for the compiled model?



