Keras is pretty much the best way to do almost anything these days. If you are just starting out learning, use ConvNetJS, but after that switch to Keras.
TFLearn is really nice if you are already using Scikit.
There are a lot of frameworks on there: TensorFlow, Caffe, CNTK (that's a lot of stars for something no one outside MS uses!), Theano, Torch, etc. But I think the sleeper there is MXNet. I haven't used it, but I hear good things about it, and DMLC have a good track record of producing some pretty nice software (XGBoost!).
Also, DeepDetect! I keep trying to find that and never can remember the name.
Brilliant, I've been looking for projects like this. I'm currently working through a couple of RBM C# projects but will add this to my list of reference code.
Top tip: if you use the matrix and vector classes in math.net, you can optionally configure them to use optimised versions of operations like matrix multiplication that map through to one of the native providers, such as the Intel Math Kernel Library or OpenBLAS, and I think there's a CUDA provider too.
I tried the Intel MKL one and the dense matrix multiplication was about 60x faster than a plain C# version.
Thanks for the tip. I'll see where I can apply it.
Most of the time is usually spent in the convolution layers. Convolution is not a matrix multiplication in the current implementation. I guess it could be expressed as a matrix multiplication by using a Toeplitz matrix, or as an elementwise multiplication in the frequency domain.
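Roughly, the Toeplitz / "im2col" trick looks like this (my own numpy sketch, not the code in the project; names and shapes are just for illustration): you unroll every input patch into a row of a matrix, and the whole convolution becomes one dense matrix multiplication.

```python
import numpy as np

def im2col(x, kh, kw):
    """Unroll every kh x kw patch of a 2D input into one row of a matrix."""
    H, W = x.shape
    out_h, out_w = H - kh + 1, W - kw + 1
    cols = np.empty((out_h * out_w, kh * kw))
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = x[i:i + kh, j:j + kw].ravel()
    return cols

def conv2d_as_matmul(x, kernels):
    """kernels: (num_filters, kh, kw). 'Valid' convolution via a single matmul."""
    nf, kh, kw = kernels.shape
    cols = im2col(x, kh, kw)              # (out_h*out_w, kh*kw)
    W = kernels.reshape(nf, kh * kw).T    # (kh*kw, num_filters)
    out = cols @ W                        # the actual matrix multiplication
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    return out.T.reshape(nf, out_h, out_w)

# quick sanity check against a naive nested-loop version
x = np.random.randn(8, 8)
k = np.random.randn(2, 3, 3)
ref = np.zeros((2, 6, 6))
for f in range(2):
    for i in range(6):
        for j in range(6):
            ref[f, i, j] = np.sum(x[i:i + 3, j:j + 3] * k[f])
assert np.allclose(conv2d_as_matmul(x, k), ref)
```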
I've implemented a parallel CPU version and had a go at a GPU implementation, but I'm not at all satisfied with the GPU version :)
> Convolution is not a matrix multiplication in the current implementation
I figure there's a code re-organisation task since propagating node activations through a layer of weights is essentially a matrix multiplication (fully connected => fully dense matrix).
The optimised routines make use of vectorised CPU instructions and FMA (fused multiply-add) instructions, all of which are a perfect fit for [dense] matrix multiplication. They're not so great for sparse matrices, but they still help; unless the matrix is very sparse, it's usually faster to use a dense matrix format with zeros for the missing weights.
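To make the dense-with-zeros point concrete, here's a small sketch of my own (sizes and sparsity level are arbitrary): the forward pass through the layer stays a single dense matmul on the BLAS fast path, and you can time it against a scipy.sparse version of the same weights.

```python
import numpy as np
from scipy import sparse
import timeit

rng = np.random.default_rng(0)

# weight matrix for a not-quite-fully-connected layer:
# ~70% of the connections are absent and simply stored as zeros
W = rng.standard_normal((1024, 512))
W[rng.random(W.shape) < 0.7] = 0.0
W_csr = sparse.csr_matrix(W)                     # same weights in sparse format

activations = rng.standard_normal((256, 1024))   # a batch of input activations

dense_out = activations @ W                      # one vectorised / FMA-friendly BLAS matmul
sparse_out = (W_csr.T @ activations.T).T         # equivalent product via the sparse format
assert np.allclose(dense_out, sparse_out)

print("dense :", timeit.timeit(lambda: activations @ W, number=100))
print("sparse:", timeit.timeit(lambda: W_csr.T @ activations.T, number=100))
```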
It's a well designed API for using deep neural networks rather than an API for doing optimized mathematical operations.
Compare how you build some vaguely comparable models in Keras[1] and raw TensorFlow[2]. Keras uses TensorFlow (or Theano) underneath, so there is no performance penalty.
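To give a rough flavour of the Keras side (this is my own minimal sketch, loosely MNIST-shaped, not the actual example at [1]):

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout

# a small fully connected classifier: 784 inputs -> 10 classes
model = Sequential([
    Dense(512, activation='relu', input_shape=(784,)),
    Dropout(0.2),
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# model.fit(x_train, y_train, batch_size=128, validation_data=(x_test, y_test))
```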
It's like in Python machine learning, where most people use Scikit instead of implementing things in numpy.
Theano itself is more like a language, not a deep learning framework. There is no NeuralNetworkClassifier class, for example. That said, you could write a neural network library / framework using Theano, and it would have all the benefits of Theano (code compiled for the GPU, various common neural net ops available, etc.), which is what it looks like the Keras folks have done. I took a stab at this a while ago (1), but I didn't keep it up. I haven't used Keras much, but it looks like it fills a real gap, which I'm glad for.
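To illustrate the "more like a language" point, a tiny Theano sketch (variable names are mine): you describe a symbolic expression and compile it into a function, which Theano can target at the GPU, but nothing here knows it's part of a neural network.

```python
import numpy as np
import theano
import theano.tensor as T

x = T.matrix('x')                        # symbolic batch of inputs
W = theano.shared(np.random.randn(784, 10).astype('float32'), name='W')
b = theano.shared(np.zeros(10, dtype='float32'), name='b')

p_y = T.nnet.softmax(T.dot(x, W) + b)    # symbolic expression, nothing computed yet
predict = theano.function([x], T.argmax(p_y, axis=1))  # compiled (possibly for GPU)

print(predict(np.random.randn(5, 784).astype('float32')))
```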
A very extensible API, an accessible and widely used programming language (Python), the ability to use either Theano or TensorFlow as a backend, and the ease of implementing non-linear neural networks (where data is split and merged at will) all contribute to this. Using Keras means you will almost never need to implement a custom layer or function, while sacrificing very little performance-wise.
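A sketch of the split-and-merge point, using the functional API (the layer and variable names are just illustrative, and `concatenate` is the merge layer in more recent Keras versions):

```python
from keras.layers import Input, Dense, concatenate
from keras.models import Model

inputs = Input(shape=(100,))
branch_a = Dense(64, activation='relu')(inputs)   # the data is split...
branch_b = Dense(64, activation='tanh')(inputs)
merged = concatenate([branch_a, branch_b])        # ...and merged again at will
outputs = Dense(1, activation='sigmoid')(merged)

model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='binary_crossentropy')
```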
CNTK is actually pretty damn good. It just lacks a good scripting interface. They're adding one though.
The network description language (now 'BrainScript') is a far nicer way to specify networks than the approaches used by any other framework, especially for recurrent networks. In CNTK you can just say `X = FutureValue(Y)` or `X = PastValue(Y)`. It's so convoluted in TensorFlow that I never actually worked out how to do it.
It also has their fancy 1-bit SGD stuff, but I doubt many people use that, and it has a more restrictive license anyway.
Keras support for recurrent models leaves a bit to be desired at this point, so it's great if it has what you want, but otherwise you have to start peeking under the hood, which may be harder than just learning the underlying framework.
This is, frankly, a naive way to rank deep learning projects, because GitHub stars are cheap. Francois Chollet, the creator of Keras, puts out a monthly ranking that takes other factors into account, such as forks, contributors, and issues, all stronger signs of community and users. Here's his July update:
Most of these frameworks are Python-oriented: Keras, Theano, Caffe, TF, neon, MXNet, etc. The space is saturated. If you look at deep-learning projects by language, then Torch stands out -- it has a Lua API. And Deeplearning4j is the most widely used in Java and Scala. You don't have to crowbar it into a Spark integration, like you do with TensorFlow. http://deeplearning4j.org/
MXNet is not talked about a lot, but it's growing fast. It was heavily used by GraphLab/Turi, recently bought by Apple, so the question is what will happen with it now.
This is true. Stars in this context are also more indicative of projects that are attractive to a mainstream audience. Examples like Deep Dream and Neural Style, where laymen visit the GitHub page and star the project because it's "cool", feature prominently, while projects like Torch and Theano have had massively more impact on the deep learning world of today.
> Stars in this context are also more indicative of projects that are attractive to a more mainstream audience.
"Popular Deep Learning GitHub Projects" would have been a more accurate title. It is mentioned in the text but the title mentions "Top" which gives a different impression than the word "Popular".
It's impressive how fast TensorFlow has become this popular (judging not only by the number of stars it has, but also by the number of other projects in that list related to TF).
Torch is split across many different repositories. Many of the relevant issues occur in `torch/nn` or `torch/cunn`, for example. Only considering `torch/torch7` underestimates the popularity by quite a lot.
I wonder how many people still implement their own networks as opposed to using these ready-made frameworks. And do you stick to a single framework, or use some sort of mixture of tools?
Implementing a basic neural network is a lot of work, to be honest. I tried it in C, and in fact even made it parallelizable. It was hell. Lots of hair-pulling. I thank the gods for Keras and keep my head down.
Unsurprising. Normally, you at least have something like BLAS to do linear algebra work.
It takes maybe 15 minutes in Python / numpy for something basic like a 2-layer net, but backpropagation is a little annoying to get right. Hence TensorFlow (or Theano or Caffe or whatever).
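For reference, a bare-bones version of that (my own sketch: sigmoid activations, squared-error loss, toy XOR data) fits in a few dozen lines; the fiddly part is keeping the backprop shapes and derivatives straight.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy data: learn XOR
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# 2-layer net: 2 -> 8 -> 1
W1, b1 = rng.standard_normal((2, 8)), np.zeros(8)
W2, b2 = rng.standard_normal((8, 1)), np.zeros(1)
lr = 0.5

for step in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # backward pass (squared-error loss, chain rule applied layer by layer)
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())   # should end up close to [0, 1, 1, 0]
```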