Why is computing so specialized towards Deep Learning these days, when it absolutely doesn't need to be? Optimized "kernels" are useful for other kinds of applications as well, such as scientific computing (linear algebra), and image and signal processing.
I can see it coming already: a fast matrix multiplication library, running on a deep learning solution, running on a graphics card. It's kind of ridiculous, isn't it?
PyTorch, for example, is often called a deep learning framework, but at its core it is a hardware-accelerated linear algebra and tensor library with support for automatic differentiation.
So it is well suited for implementing numerical algorithms, classic machine learning algorithms, and differentiable programs.
DL is really a subset of differentiable programming, but it happens to be a very interesting one.
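To make that concrete, here is a minimal sketch (my own illustration, not anything from the PyTorch docs) of using it as a plain linear algebra and autodiff library, with no neural network anywhere:

    import torch

    # Dense linear algebra on whatever device is available -- no neural net involved.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    A = torch.randn(512, 512, device=device, dtype=torch.float64)
    b = torch.randn(512, device=device, dtype=torch.float64)
    x = torch.linalg.solve(A, b)          # solve Ax = b

    # Autodiff of an arbitrary scalar function: f(x) = sum(sin(x)^2).
    x = torch.randn(100, device=device, dtype=torch.float64, requires_grad=True)
    f = torch.sin(x).pow(2).sum()
    f.backward()
    print(x.grad)                         # equals sin(2x), the analytic gradient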
I do a lot of numerics related to simulating quantum computing hardware (which is just linear algebra on fairly large matrices). I have used Theano to write some of the simulations, and there are plenty of projects on GitHub that use TensorFlow as a backend for quantum-mechanical simulations.
Either way, I do not care much whether they call it a "deep learning" framework. It is still directly usable in all other types of linear algebra applications.
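For a flavour of what that kind of simulation looks like (a sketch of my own, written with NumPy/SciPy for brevity; the same matrix operations map directly onto Theano or TensorFlow tensors), unitary time evolution of a state vector is just repeated matrix-vector multiplication:

    import numpy as np
    from scipy.linalg import expm

    # Sketch: evolve a quantum state under a fixed unitary, |psi_{t+1}> = U |psi_t>.
    rng = np.random.default_rng(0)
    dim = 2 ** 10                                  # 10 qubits -> 1024-dimensional state
    H = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    H = (H + H.conj().T) / 2                       # random Hermitian "Hamiltonian"
    U = expm(-1j * 0.01 * H)                       # one time step, dt = 0.01

    psi = np.zeros(dim, dtype=complex)
    psi[0] = 1.0                                   # start in |0...0>
    for _ in range(100):
        psi = U @ psi

    print(np.linalg.norm(psi))                     # stays ~1.0: the evolution is unitary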
That's nice, until someone decides to prune certain functionality from future versions of the library because it isn't needed for Deep Learning. Or until someone decides: hey, we can replace this floating-point computation with a fixed-point one, because it doesn't matter for any kind of Deep Learning anyway.
The maintainers of major DL frameworks like TF and PyTorch intentionally target an ever-wider set of use-cases beyond deep learning, and would not chop out non-DL functionality.
There are many cool non-DL projects using these libraries for fast computation, like Pyro (PyTorch) and Greta (TensorFlow), and the community is starting to shift away from DL-specific vocabulary toward terms like "differentiable programming". Deep learning is just one use of this broader paradigm.
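As a taste of that (a minimal Pyro sketch of my own, not taken from the project's docs), here is Bayesian inference where PyTorch supplies the tensors and gradients and nothing resembling a deep network appears:

    import torch
    import pyro
    import pyro.distributions as dist
    from pyro.infer import MCMC, NUTS

    # Minimal Pyro model: infer the mean of some observations.
    def model(data):
        mu = pyro.sample("mu", dist.Normal(0.0, 10.0))
        with pyro.plate("data", len(data)):
            pyro.sample("obs", dist.Normal(mu, 1.0), obs=data)

    data = torch.randn(50) + 3.0                       # observations centred near 3
    mcmc = MCMC(NUTS(model), num_samples=300, warmup_steps=100)
    mcmc.run(data)
    print(mcmc.get_samples()["mu"].mean())             # posterior mean, close to 3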
Computations are computations; you certainly don't need to use these frameworks for deep learning, but you might want to introduce some new graph-level optimizations if they are appropriate to what you are doing and haven't already shown up in deep learning work.
TensorFlow can be used to run simulations and other kinds of scientific data processing; that's because TensorFlow is, at its core, just a retained computation graph that can be flexibly configured.
We have been doing this since before deep learning was the tech du jour, and we will be doing it long after.
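As a sketch of what that means in practice (a toy example of my own, not anyone's production code), here is a 1-D diffusion step expressed through tf.function, which TensorFlow retains and optimizes as a graph like anything else:

    import tensorflow as tf

    # A 1-D diffusion step as a retained TensorFlow graph -- nothing DL-specific,
    # just stencil arithmetic on tensors with periodic boundaries.
    @tf.function
    def diffuse(u, alpha=0.1):
        left = tf.roll(u, shift=1, axis=0)
        right = tf.roll(u, shift=-1, axis=0)
        return u + alpha * (left + right - 2.0 * u)

    u = tf.random.uniform([1000], dtype=tf.float64)
    for _ in range(500):
        u = diffuse(u)
    print(tf.reduce_mean(u))  # the mean is conserved by the periodic stencil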
For now we are targeting CUDA and cuDNN primarily on the GPU backend, but we'd love contributions for other GPU transformers! We are highly interested in enabling optimized support for the hardware used in the AI space, and we are optimistic about support for present and future hardware targets. Please feel free to let us know in our GitHub issues if there is specific GPU or other hardware support you hope to see.
Any plans on message passing / clustering with this, so that multiple hardware nodes could be connected to the same computational graph over the network?
I assume this also has support for AVX-512 acceleration?
We have an allreduce op in nGraph to support data-parallel training using OpenMPI, and we are investigating the best approaches to integrate with frameworks. We also plan to add more collective communication ops to support model parallelism in the future.
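For anyone unfamiliar with the pattern, the idea behind the allreduce op is that each node computes gradients on its own shard of data and the results are then summed and averaged across nodes. A rough sketch of that pattern with mpi4py (just an illustration of the concept, not nGraph's own API):

    import numpy as np
    from mpi4py import MPI

    # Data-parallel allreduce pattern: every rank computes gradients on its own shard,
    # then the sum is averaged so each rank ends up with identical averaged gradients.
    comm = MPI.COMM_WORLD

    local_grads = np.random.randn(1024)                 # stand-in for this rank's gradients
    summed = np.empty_like(local_grads)
    comm.Allreduce(local_grads, summed, op=MPI.SUM)     # collective sum across all ranks
    averaged = summed / comm.Get_size()

Run it with something like mpirun -n 4 python the_script.py to see every rank arrive at the same averaged array.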
Performance charts in the article are for training of CNNs on CPU. Are there non-educational use cases for that? How does CPU CNN inference speed compare?
We do have to optimize the inference code paths in order to make training fast. We are currently adding optimizations that target inference-only use cases (some PRs are already up), and we will post data as soon as that is done. Stay tuned!