Hacker News

I get the motivation behind fast.ai's lessons, but I question whether it's the right approach. Their goal is to make deep learning accessible to programmers by reducing the mathematical background required. An analogous situation is the engineer who uses Autodesk to design car parts: they may not need to know the detailed implementation of the 3D graphics to use the software effectively.

The difference is that Autodesk relies on a mature, deterministic technology (3D graphics rendering). Deep learning is a stochastic process that depends on the data and the model. The training code, and especially the framework hooks, is the least important part. The example they give is three lines of code to train a cat vs. dog classifier. I've tried this classifier on a different binary image classification task: livers with and without tumors. It didn't work very well. There are lots of reasons: little variability between images, grey-scale images, different resolutions, etc. You can tweak the network, throw in more middle layers, try different kinds of layers, whatever, to get better results. All of that is guesswork if you don't understand what the CNN is doing at each stage. At this point in time you do need a formal education in linear algebra, calculus and statistics to investigate why a model does/does not work. It's not enough to know how to use the libraries.

On the flip side, you also need to know how to manipulate data and parse it into the correct format. This generally requires a year or two of programming practice in a good scripting language like Python. I will echo their thoughts that Ian Goodfellow's Deep Learning Book is remarkably lacking in this area. As a simple example, you cannot even use AlexNet without pre-processing your images to 227x227 (or 224x224 for GoogLeNet). That's 10,000 images resized, labeled and loaded into the model before training can take place.
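To make the preprocessing point concrete, here is a toy nearest-neighbour resize on a list-of-lists "image" of pixel values. This is only a sketch of the idea; a real pipeline would batch-resize with PIL or OpenCV before feeding the network.

```python
# Toy nearest-neighbour resize: for each output pixel, pick the nearest
# input pixel. Real pipelines use PIL/OpenCV, but the step is the same:
# every image must be resized to the network's fixed input size.

def resize_nearest(img, out_h, out_w):
    in_h, in_w = len(img), len(img[0])
    return [[img[i * in_h // out_h][j * in_w // out_w]
             for j in range(out_w)]
            for i in range(out_h)]

# A 4x4 "image" whose pixel values encode their row/column.
img = [[r * 10 + c for c in range(4)] for r in range(4)]
small = resize_nearest(img, 2, 2)
print(small)  # [[0, 2], [20, 22]]
```

The same function applied with `out_h = out_w = 224` is, conceptually, the resizing every one of those 10,000 images has to go through.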

tl;dr IMHO in terms of being a competent user of deep learning: mathematics >= programming >>> knowing how to use a framework




They address exactly how you say deep learning should be taught in the course overview video. They aren't against learning all the maths, they just don't think you should do it first.

It's a top-down vs bottom-up teaching approach. The advantages are better motivation, better context and more immediate usefulness.

> All of that is guesswork if you don't understand what the CNN is doing at each stage.

Pff, it's still mostly guesswork even if you do understand what it is doing.


This is complete bullshit. You really don't need any math to make these systems work. Being able to preprocess and scrape data and conduct rudimentary error analysis is far more important. And the math you generally need to know is trivial.

This is just gatekeeping bullshit. Most of the time the right answer is "collect more/better data".


I really really agree with you. I also think that the fast.ai courses are amazing.

It is a question of motivation. If you're reasonably proficient at programming and want to hit the ground running on a specific application (especially in a domain that has well-established methods) fast.ai is probably what you're looking for.

If you're new to programming and mathematics, or want to work on the state of the art of these methods, you need to first truly understand more fundamental ideas. For those who fall into this camp, I am collecting resources (work in progress) and trying to organise them into a learning pathway: https://github.com/hnarayanan/deep-learning

You or anyone else who's interested is free to offer suggestions on learning material.


You already have Prof. Gilbert Strang's course in there (made me fall in love with linear algebra). I would like to suggest Prof. Joe Blitzstein's course (https://projects.iq.harvard.edu/stat110/home) on probability and Prof. Yaser Abu-Mostafa's course (https://work.caltech.edu/telecourse) on Machine Learning.


Thank you! I already had Prof. Yaser Abu-Mostafa's course in there too (also amazing), but I will check out Prof. Blitzstein's probability course as well.


How many people using 'jpeg' productively to create everyday applications do you think understand the DCT?


I wasn't able to do a PhD with Stéphane Mallat, so now I can never use JPEG 2000 :(


You could petition Ingrid Daubechies as a means of last resort.


Is that an appropriate comparison though?


I think so. Jpeg is an enabling technology just like classification using neural nets. The interface is relatively simple in both cases, the applications are plentiful. Some metadata needs to be supplied to make sure the results are optimal. Both are quite complex under the hood and would require ample study to re-create them or to fully grok how they operate in every detail. But for high level applications - even if such knowledge would give you an edge - that in-depth knowledge is not a 100% requirement.


The big difference is that jpeg creation with default parameters works fine for probably > 99.9% of use cases. A jpeg encoding the picture of a car will be just as fine as that of a cat. This is not at all the case with the current state of ML, as the originator of this thread has also pointed out with their example.

But there is actually one more problematic area for jpeg: encoding of graphs, drawings etc. with a limited color palette and straight lines. Here, jpeg artifacts become more visible, and to reduce them, you can either turn up the quality, or use a better approach like SVG or PNG. For this, at least a bit more technical knowledge is required. How many non-tech-savvy people even know about SVG or PNG?

But an even more appropriate comparison with ML would be to ask how to improve jpeg to better deal with straight lines. For this, you clearly need to understand the maths.
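You can see why edges are hard for JPEG with a toy 1-D DCT-II (the transform at JPEG's core). This is an illustrative sketch, not JPEG's actual 2-D, quantized pipeline: a flat block compacts into a single coefficient, while a hard edge spreads energy across many high frequencies, which is exactly what gets mangled at low quality.

```python
import math

# Toy 1-D DCT-II: X[k] = sum_n x[n] * cos(pi/N * (n + 0.5) * k).
# Flat signals compact into the DC (k=0) term; sharp edges need many
# high-frequency terms, which quantization then distorts into artifacts.

def dct2(x):
    N = len(x)
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k)
                for n in range(N))
            for k in range(N)]

flat = [1.0] * 8               # perfectly smooth 8-sample block
edge = [0.0] * 4 + [1.0] * 4   # a hard edge, as in line art

flat_c = dct2(flat)
edge_c = dct2(edge)

def ac_energy(c):
    # energy outside the DC coefficient
    return sum(v * v for v in c[1:])

print(ac_energy(flat_c) < 1e-12)  # True: one coefficient suffices
print(ac_energy(edge_c) > 1.0)    # True: the edge needs many frequencies
```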


The comment you originally replied to points out all the ways in which the deep learning "interface" is not relatively simple, at least not if your problem has any sort of deviation from the most simple use cases. For a user, making a jpeg is a one-time one-command affair. If you think training a neural net can be reduced to this level of abstraction (with the knowledge we have today), you have either never used them in practice, or you've been very lucky with the complexity of the problems you've encountered so far.


> All of that is guesswork if you don't understand what the CNN is doing at each stage.

Well, what the CNN is doing at each stage is very simple to understand. There is a forward pass which is a matrix multiply (and an addition if there is a bias), and then the matrix weights are learned in the backward pass, which is just basic differentiation and chain rule application. Now I am not trivializing differentiation (when you try to take the differential of a vector, you are tearing your hair out), but it's fundamentally a simple concept to understand.
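The forward-pass/chain-rule idea fits in a few lines. Here's a minimal sketch for a single 1-D "neuron" with a squared-error loss, where the hand-derived chain-rule gradient is checked against a numerical estimate (the standard sanity check; frameworks automate exactly this differentiation):

```python
# One neuron: forward pass y = w*x + b, loss L = (y - t)^2.
# Backward pass by the chain rule, verified numerically.

def forward(w, b, x):
    return w * x + b                      # the 1-D "matrix multiply" + bias

def loss(w, b, x, t):
    y = forward(w, b, x)
    return (y - t) ** 2

def grad_w(w, b, x, t):
    # chain rule: dL/dw = dL/dy * dy/dw = 2*(y - t) * x
    return 2 * (forward(w, b, x) - t) * x

w, b, x, t = 0.5, 0.1, 2.0, 1.0
eps = 1e-6
numeric = (loss(w + eps, b, x, t) - loss(w - eps, b, x, t)) / (2 * eps)
analytic = grad_w(w, b, x, t)
print(abs(numeric - analytic) < 1e-5)  # True: the chain rule checks out
```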

Even with this understanding, designing deep neural nets and tuning hyper-parameters is mostly guesswork. Yes, the frameworks have little or nothing to do with this.

What I've found is that TensorFlow is difficult for programmers to wrap their heads around because it's more like a DSL. You declare a computational graph and then run it multiple times. So when you are declaring a computational graph, you have no way of debugging that graph unless you run it. Also, the conversion from numpy arrays to Tensors and back is an expensive operation. PyTorch simplifies this to a great extent. You just create graphs and run them the way you'd run a loop and declare variables inside it. This is great for imperative programming. However, think about it: your graph is recreated every time. If it were just a variable re-initialization it wouldn't be a big deal, but we are dealing with Tensors, so you give up efficiency for flexibility.
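The define-then-run vs eager distinction can be sketched in plain Python. None of these names are real TensorFlow or PyTorch APIs; it's just a toy contrast between declaring a deferred graph and executing code immediately:

```python
# "Define then run": build a graph of deferred computations, then
# execute it repeatedly with different inputs (TensorFlow 1.x style).

class Node:
    """A deferred computation: nothing runs until run() is called."""
    def __init__(self, fn, *parents):
        self.fn, self.parents = fn, parents
    def run(self, feed):
        return self.fn(*(p.run(feed) for p in self.parents))

class Placeholder(Node):
    """A graph input, filled in at run time via the feed dict."""
    def __init__(self, name):
        self.name = name
    def run(self, feed):
        return feed[self.name]

# Declare the graph once...
x = Placeholder("x")
y = Node(lambda v: v * 2, x)
z = Node(lambda v: v + 1, y)

# ...then run it many times. Until run(), z is just a data structure,
# which is why you can't step through it with an ordinary debugger.
print(z.run({"x": 3}))   # 7
print(z.run({"x": 10}))  # 21

# Eager style (PyTorch-like): the same computation runs immediately.
def eager(v):
    return v * 2 + 1

print(eager(3))  # 7
```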

Again, all of this is immaterial for learning how to build deep neural nets. I would say, just stick to whatever framework you can wrap your head around. I am learning that my ability to tweak numpy arrays, visualize them in pyplot, load data from csvs using Pandas and the like will take me a lot further in learning deep learning.


It's not like you have to give up a lot - the graphs are simple data structures and creating them is not the expensive part of the training. The computation has to be re-done at every step in a static framework too, and this is the part that matters.


I like Andrew Ng's new Coursera deep learning course. You start by writing your own NN using just numpy and then slowly improve it over several sessions. By the time you get Tensorflow you've written several NN's from scratch and have a good understanding of how a NN works under the hood.
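In that spirit, here's a from-scratch sketch of the kind of thing those early assignments have you build (the course uses numpy; this is pure Python to keep it self-contained): a single logistic neuron trained by gradient descent on the AND function, with the backward pass written out by hand.

```python
import math

# A single logistic neuron trained on AND with hand-written gradients.
# This is the "NN from scratch" exercise in miniature: forward pass,
# cross-entropy loss, chain-rule backward pass, gradient descent.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w1 = w2 = b = 0.0
lr = 0.5

def total_loss():
    # cross-entropy over the four training examples
    s = 0.0
    for (x1, x2), t in data:
        y = sigmoid(w1 * x1 + w2 * x2 + b)
        s += -(t * math.log(y) + (1 - t) * math.log(1 - y))
    return s

before = total_loss()
for _ in range(1000):
    for (x1, x2), t in data:
        y = sigmoid(w1 * x1 + w2 * x2 + b)   # forward pass
        err = y - t                          # dL/dz for sigmoid + CE
        w1 -= lr * err * x1                  # backward pass, chain rule
        w2 -= lr * err * x2
        b  -= lr * err
after = total_loss()

print(after < before)  # True: the loss went down
```

Once you've written this by hand, what a framework's `fit()` call is doing stops being magic.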


Do you really, though? Your decisions are still going to be fairly simple, like "more hidden layers", rather than solutions to differential equations. Couldn't you become competent just from having broad experience training many networks for diverse purposes?


There's nothing wrong with that, but the mathematical background really isn't that complicated. It's just linear algebra and calculus.


Well, there's also nonconvex optimization, which is not exactly easy. On the other hand, "just use Adam" mostly works...


I'm currently working in parallel through fast.ai (practical programming) and Michael Nielsen's Neural Networks and Deep Learning ebook (basic mathematics behind NNs). I find that they complement each other well.

The Nielsen ebook: http://neuralnetworksanddeeplearning.com


You missed the entire point of fast.ai. They believe that it's better to be able to do basic, practical stuff before diving deeper. Most people will lose motivation if they have to learn an insane amount of stuff before even getting started with the cool stuff.



