I get the motivation behind fast.ai's lessons, but I question whether it's the right approach. Their goal is to make deep learning accessible to programmers by reducing the mathematical background required. An analogous situation is the engineer who uses Autodesk to design car parts: He/she may not need to know the detailed implementation of the 3D graphics to use the software effectively.
The difference is that Autodesk relies on a mature, deterministic technology (3D graphics rendering). Deep learning is a stochastic process that depends on the data and the model. The training code, and especially the framework hooks, is the least important part. The example they give is three lines of code to train a cat vs. dog classifier. I've tried this classifier on a different binary image classification task: livers with and without tumors. It didn't work very well. There are lots of reasons: little variability between images, grey-scale images, different resolutions, etc. You can tweak the network, throw in more middle layers, try different kinds of layers, whatever, to get better results. All of that is guesswork if you don't understand what the CNN is doing at each stage. At this point in time you do need a formal education in linear algebra, calculus and statistics to investigate why a model does or does not work. It's not enough to know how to use the libraries.
On the flipside, you also need to know how to manipulate data and parse it into the correct format. This generally requires a year or two of programming practice in a good scripting language like Python. I will echo their thoughts that Ian Goodfellow's Deep Learning Book is remarkably lacking in this area. As a simple example, you cannot even use AlexNet without pre-processing your images to 227x227 (or 224x224 for GoogLeNet). That's 10,000 images resized, labeled and loaded into the model before training can take place.
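That preprocessing step is mundane but unavoidable. As a toy illustration (my own sketch, not anyone's course code), here is a nearest-neighbour resize in plain numpy that forces images to 224x224; a real pipeline would use PIL or torchvision transforms, but the point is that this step exists before any training can happen:

```python
import numpy as np

def resize_nearest(img, out_h=224, out_w=224):
    """Nearest-neighbour resize of a (H, W[, C]) image array."""
    h, w = img.shape[0], img.shape[1]
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source col for each output col
    return img[rows][:, cols]

# e.g. a fake 180x240 grey-scale scan, resized to fit a 224x224 input layer
scan = np.random.rand(180, 240)
resized = resize_nearest(scan)
```

Multiply this by thousands of images, plus labeling and batching, and it dwarfs the "three lines" of actual training code.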
tl;dr IMHO in terms of being a competent user of deep learning: mathematics >= programming >>> knowing how to use a framework
They address exactly how you say deep learning should be taught in the course overview video. They aren't against learning all the maths; they just don't think you should do it first.
It's a top-down vs bottom-up teaching approach. The advantages are better motivation, better context and more immediate usefulness.
> All of that is guesswork if you don't understand what the CNN is doing at each stage.
Pff, it's still mostly guesswork even if you do understand what it is doing.
This is complete bullshit. You really don't need any math to make these systems work. Being able to preprocess and scrape data and conduct rudimentary error analysis is far more important. And the math you generally need to know is trivial.
This is just gatekeeping bullshit. Most of the time the right answer is "collect more/better data".
I really really agree with you. I also think that the fast.ai courses are amazing.
It is a question of motivation. If you're reasonably proficient at programming and want to hit the ground running on a specific application (especially in a domain that has well-established methods) fast.ai is probably what you're looking for.
If you're new to programming and mathematics, or want to work on the state of the art of these methods, you need to first truly understand more fundamental ideas. For those who fall into this camp, I am collecting resources (work in progress) and trying to organise them into a learning pathway: https://github.com/hnarayanan/deep-learning
You or anyone else who's interested is free to offer suggestions on learning material.
Thank you! I already had Prof. Yaser Abu-Mostafa's course in there too (also amazing), but I will check out Prof. Blitzstein's probability course as well.
I think so. Jpeg is an enabling technology just like classification using neural nets. The interface is relatively simple in both cases, the applications are plentiful. Some metadata needs to be supplied to make sure the results are optimal. Both are quite complex under the hood and would require ample study in order to re-create them or in order to fully grok how they operate in every detail. But for high level applications - even if such knowledge would give you an edge - that in-depth knowledge is not a 100% requirement.
The big difference is that jpeg creation with default parameters works fine for probably > 99.9% of use cases. A jpeg encoding the picture of a car will be just as fine as that of a cat. This is not at all the case with the current state of ML, as the originator of this thread has also pointed out with their example.
But there is actually one more problematic area for jpeg: encoding of graphs, drawings etc. with a limited color palette and straight lines. Here, jpeg artifacts become more visible, and to reduce them, you can either turn up the quality, or use a better approach like svg or png. For this, at least a bit of more technical knowledge is required. How many non-tech-savvy people even know about svg or png?
But an even more appropriate comparison with ML would be to ask how to improve jpeg to better deal with straight lines. For this, you clearly need to understand the maths.
The comment you originally replied to points out all the ways in which the deep learning "interface" is not relatively simple, at least not if your problem has any sort of deviation from the most simple use cases. For a user, making a jpeg is a one-time one-command affair. If you think training a neural net can be reduced to this level of abstraction (with the knowledge we have today), you have either never used them in practice, or you've been very lucky with the complexity of the problems you've encountered so far.
> All of that is guesswork if you don't understand what the CNN is doing at each stage.
Well, what the CNN is doing at each stage is very simple to understand. There is a forward pass, which is a matrix multiply (and an addition if there is a bias), and then the matrix weights are learned in the backward pass, which is just basic differentiation and chain rule application. Now I am not trivializing differentiation (when you try to take the differential of a vector, you're tearing your hair out), but it's fundamentally a simple concept to understand.
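To make that concrete, here is a minimal sketch of one such layer: a matrix multiply plus bias, with the analytic gradient from the chain rule checked against a numerical derivative. The variable names and shapes are my own toy choices:

```python
import numpy as np

# One linear layer y = xW + b with an MSE loss; analytic gradient from
# the chain rule, verified by numerical differentiation.
rng = np.random.default_rng(1)
x = rng.normal(size=(4, 3))   # a batch of 4 inputs
W = rng.normal(size=(3, 2))
b = rng.normal(size=(2,))
t = rng.normal(size=(4, 2))   # targets

def loss(W):
    y = x @ W + b
    return np.mean((y - t) ** 2)

# analytic gradient: dL/dy flows back through the matmul as x^T @ dL/dy
y = x @ W + b
dy = 2 * (y - t) / y.size
dW = x.T @ dy

# numerical check on a single weight entry
eps = 1e-6
Wp = W.copy(); Wp[0, 0] += eps
num = (loss(Wp) - loss(W)) / eps
assert abs(num - dW[0, 0]) < 1e-4
```

The full CNN is this same pattern repeated, with convolutions in place of some of the matmuls.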
Even with this understanding, designing deep neural nets and tuning hyper-parameters is mostly guesswork. Yes, the frameworks have little or nothing to do with this.
What I've found is that TensorFlow is difficult for programmers to wrap their heads around because it's more like a DSL. You declare a computational graph and then run it multiple times. So when you are declaring a computational graph, you have no way of debugging that graph unless you run it. Also, the conversion from numpy arrays to Tensors and back is an expensive operation.
PyTorch simplifies this to a great extent. You just create graphs and run them the way you run a loop and declare variables inside it. This is great for imperative programming. However, think about it: your graph is recreated every time. Now if it were just a variable re-initialization it wouldn't be a big deal, but we are dealing with Tensors, so you give up efficiency for flexibility.
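The distinction can be sketched with a toy deferred-execution graph in plain Python (no TensorFlow or PyTorch involved; the class and names are made up): the static style builds an opaque graph once and feeds it repeatedly, while the dynamic style just computes eagerly inside the loop.

```python
# Toy "define-then-run" graph, mimicking the TF1-style static approach.
class Node:
    def __init__(self, op, inputs=()):
        self.op, self.inputs = op, inputs

def run(node, feed):
    """Evaluate the graph given values for the placeholders."""
    if node.op == "placeholder":
        return feed[node]
    vals = [run(i, feed) for i in node.inputs]
    return vals[0] + vals[1] if node.op == "add" else vals[0] * vals[1]

x, y = Node("placeholder"), Node("placeholder")
graph = Node("add", [Node("mul", [x, x]), y])  # built once, opaque until run

static_results = [run(graph, {x: v, y: 1}) for v in range(3)]

# The "dynamic" style: no separate graph object, just compute eagerly,
# rebuilding the computation on every iteration.
dynamic_results = [v * v + 1 for v in range(3)]
```

Debugging the static version means inspecting `graph`, a data structure; debugging the dynamic version is just ordinary Python debugging — which is exactly the trade-off described above.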
Again, all of this is immaterial for learning how to build deep neural nets. I would say, just stick to whatever framework you can wrap your head around. I am learning that my ability to tweak numpy arrays, visualize them in pyplot, load data from csvs using Pandas and the like will take me a lot further in learning deep learning.
It's not like you have to give up a lot - the graphs are simple data structures and creating them is not the expensive part of the training. The computation has to be re-done at every step in a static framework too, and this is the part that matters.
I like Andrew Ng's new Coursera deep learning course. You start by writing your own NN using just numpy and then slowly improve it over several sessions. By the time you get Tensorflow you've written several NN's from scratch and have a good understanding of how a NN works under the hood.
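In that spirit, here is a minimal numpy-only network (my own toy example, not the course's assignment) that learns XOR with a hand-written backward pass — roughly what you build in the first sessions before a framework takes over the bookkeeping:

```python
import numpy as np

# A tiny two-layer network trained with nothing but numpy. The data,
# architecture and hyper-parameters are arbitrary toy choices.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

lr, losses = 0.5, []
for step in range(3000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    losses.append(np.mean((p - y) ** 2))
    # backward pass: the chain rule written out by hand
    dp = 2 * (p - y) / len(X)
    dz2 = dp * p * (1 - p)
    dh = dz2 @ W2.T
    dz1 = dh * h * (1 - h)
    # gradient descent step
    W2 -= lr * (h.T @ dz2); b2 -= lr * dz2.sum(0)
    W1 -= lr * (X.T @ dz1); b1 -= lr * dz1.sum(0)
```

Once you've written the backward pass yourself a few times, a framework's `loss.backward()` stops being magic.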
Do you really, though, since your decisions are still going to be fairly simple, like "more hidden layers", rather than solutions to differential equations? Couldn't you become competent just from having broad experience training many networks for diverse purposes?
I'm currently working in parallel through fast.ai (practical programming) and Michael Nielsen's Neural Networks and Deep Learning ebook (basic mathematics behind NNs). I find that they complement each other well.
You missed the entire point of fast.ai. They believe that it's better to be able to do basic, practical stuff before diving deeper. Most people will lose motivation if they have to learn an insane amount of stuff before even getting started with the cool stuff.
I love seeing this, and not just because I love PyTorch[1].
It's also because I believe it's in everyone's best interest to have more than one widely used framework, rather than a single framework controlled by a single company (TensorFlow).
Also, I think fast.ai's approach to teaching deep learning is the right one for the vast majority of developers: start with practical, immediately useful know-how instead of theoretical underpinnings. People who want to delve deeper, say, so they can develop innovative architectures, can always do so at their own pace after taking fast.ai's course. There are a ton of other online resources for learning subjects like linear algebra, multivariate calculus, statistics, probabilistic graphical models, etc.
It's not controlled by Facebook in any way. It's true that a large part of the core team works there, but development is public and guided by community needs first.
I literally started this course yesterday. One of the annoying things is the software setup. Yes, they provide an AWS image, but it's fairly expensive and I already have a powerful GPU on my desktop. Unfortunately the Windows setup instructions are long, complicated, and use out-of-date software. You have to install a lot of different pieces of software, including Anaconda, which apparently is yet another package manager (seriously?).
It's not quite as bad as the Javascript npm/gulp/bower/whatever insanity but it's not too far off. Get it together ML people!
I spent hours trying to get set up before finding Crestle mentioned in the forums. It's preloaded with everything you need and is billed by the second, with 25 hours free.
I am currently learning Tensorflow and Keras. I found the learning curve quite steep with Tensorflow and the whole static computation graph thing puzzling at first (I thought it was dynamic...). Now I am a lot more comfortable with both.
Really welcome new libraries and frameworks to make deep learning more accessible. My only fear is that the field becomes cluttered with a myriad of frameworks like the JS world. It would add a lot of confusion and apprehension for people entering the field IMHO since they would not know what to use and where to start.
The fact that I haven't heard about PyTorch wrappers before always made me feel like it had nailed the balance between expressiveness and customizability.
He also states that PyTorch is hard [1], which does not seem to be HN's overall opinion. So I guess they found Keras was limited in customization and PyTorch required some boilerplate for loading/processing data and training loops, and this new framework tries to fill the gaps?
When can we expect this class? Also, Jeremy Howard recently commented that they were redoing the first Practical Deep Learning for Coders class, is this going to be the same course or will they be separate?
The way I understood his tweet is that they are basically rewriting the first course with PyTorch first, so I'm guessing the content will be identical, just the notebooks now using PyTorch in the background.
On Keras:
> On the other hand, it tends to make it harder to customize models, especially during training. More importantly, the static computation graph on the backend, along with Keras’ need for an extra compile() phase, means that it’s hard to customize a model’s behaviour once it’s built.
What does that mean, customize models during training?
Also, how are dynamic-graph architectures performing vs. models where the architecture doesn't change? Are they winning competitions?
I benchmarked Keras+TF vs PyTorch CNNs back in May 2017:
1) Compilation speed for a jumbo CNN architecture: Tensorflow took 13+ minutes to start training every time the network architecture was modified, while PyTorch started training in just over 1 minute.
2) Memory footprint: I was able to fit 30% larger batch size for PyTorch over Tensorflow on Titan X cards. Exact same jumbo CNN architecture.
Both frameworks have had major releases since May, so these metrics may well have changed by now. However, I ended up adopting PyTorch for my project.
I like PyTorch better as well. Note however, that one of its dependencies, gloo, comes with the infamous PATENT addendum. You won't be using it unless you do distributed training tho.
gloo is only one of the three currently supported backends. One can easily switch to MPI, and pick an implementation that comes with a license you want.