I spent some time learning the high level concepts first—which I found to be a very useful initial orientation—but recently I've wanted to solidify my foundations and learn the math properly.
I've found the math primer (two sections: "Linear Algebra" and "Probability and Information Theory") in this free book to be excellent so far: http://www.deeplearningbook.org/ It's a little under 50 pages for both sections.
I've seen the basics of linear algebra covered in many different places, and I think this is the most insightful yet concise intro I've come across. I haven't started the probability section yet, so I can't comment on it.
I have been doing the same thing. I've also augmented each section with video lectures from Khan Academy or other sources. For instance, their videos on the Jacobian were excellent for getting an intuitive understanding of it [1].
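As a side note, I also found it useful to poke at the definitions numerically after watching the videos. A quick numpy sketch (the toy function here is my own, not from any of those lectures) that approximates a Jacobian with central differences:

```python
import numpy as np

def f(v):
    """Toy function from R^2 to R^2 (made up for illustration)."""
    x, y = v
    return np.array([x**2 * y, 5 * x + np.sin(y)])

def numerical_jacobian(func, v, eps=1e-6):
    """Approximate the Jacobian of func at v with central differences."""
    v = np.asarray(v, dtype=float)
    out = func(v)
    J = np.zeros((out.size, v.size))
    for j in range(v.size):
        step = np.zeros_like(v)
        step[j] = eps
        J[:, j] = (func(v + step) - func(v - step)) / (2 * eps)
    return J

print(numerical_jacobian(f, [1.0, 2.0]))
# Analytic Jacobian at (1, 2): [[2xy, x^2], [5, cos(y)]] = [[4, 1], [5, cos(2)]]
```

Comparing the numerical result against the Jacobian you work out by hand is a nice way to check your understanding.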
I also search for problems on the topic to help solidify my knowledge. You can almost always find a class that has posted problems for a section with answers.
Within a few hours of starting the course you'll have submitted an entry to the Kaggle Dogs vs Cats competition that scores in the top 50% of entries and achieves 97% accuracy. It's designed for coders who don't have a PhD in math. It's a very top-down approach, where you only get into mathematical details once you understand the high-level models being used.
I find the lectures from Jeremy Howard [1] very helpful. He points out what seems to work and what doesn't and explains things very well. I'm up to lecture 4 and enjoyed them all.
I also like Andrej Karpathy's thorough explanation of backprop in his Lecture 4 cs231n video. The links to the videos have been removed from the course page [2] for some reason; just Google "cs231n video" and you'll find the YouTube links. The course's page on CNNs is pretty good too.
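If you want the one-sentence version of that backprop lecture before watching it: it's just the chain rule applied node by node on a tiny expression graph. Here's a sketch in the spirit of the circuit examples from those slides (my own transcription, so treat it as approximate):

```python
# Forward and backward pass through f = (x + y) * z.
x, y, z = -2.0, 5.0, -4.0

# forward pass
q = x + y            # q = 3
f = q * z            # f = -12

# backward pass (chain rule), starting from df/df = 1
df_dq = z            # d(q*z)/dq = z = -4
df_dz = q            # d(q*z)/dz = q = 3
df_dx = df_dq * 1.0  # dq/dx = 1, so df/dx = -4
df_dy = df_dq * 1.0  # dq/dy = 1, so df/dy = -4

print(df_dx, df_dy, df_dz)  # -4.0 -4.0 3.0
```

A real network is the same idea, just with many more nodes and with matrices instead of scalars.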
Andrew Ng's Machine Learning on Coursera is good for building some solid foundations in ML in general. Keep in mind it's not limited to neural networks. The math is kinda light; you can either seek to understand it or trust that it works and focus on the ML principles.
I have a background in mathematics and physics. In this course, I try to give a physical explanation whenever possible to give people good intuitions about what is happening. Level of math required: know how a matrix multiply works (and it gets re-explained)
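To make that concrete: one layer of the network is just a matrix multiply plus a bias, pushed through a nonlinearity. A numpy sketch (the 784-to-10 shapes match the kind of MNIST softmax layer the course starts with; the values here are random placeholders):

```python
import numpy as np

X = np.random.rand(4, 784)           # a batch of 4 flattened 28x28 images
W = np.random.randn(784, 10) * 0.01  # weights mapping 784 inputs to 10 classes
b = np.zeros(10)                     # one bias per class

logits = X @ W + b                   # the matrix multiply in question
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax
print(probs.shape)                   # (4, 10): one probability per class, per image
```

If you can follow that, you have all the math you need to start.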
They say 3 months? I would have thought each of the 4 modules was 1-2 full days if you want to blast through it. Maybe 8 weeks of Sundays if you do it that way. But you can definitely spend a lot of time on them and on the TensorFlow docs.
Courses like LAFF or Andrew Ng's Machine Learning are true semester courses, but I'm not actually sure this one is.
I come from a programming background with very little knowledge of mathematics, and Keras helped me create some quite efficient CNNs without all the code TensorFlow requires.
This simplicity imposes limits on advanced users, but the tool is fantastic for getting to grips with deep learning.
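To give an idea of how little code that means in practice, here's a sketch of a small CNN in Keras (the layer sizes are arbitrary, and the import paths assume standalone Keras rather than tf.keras):

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# A small CNN for 28x28 grayscale images and 10 classes.
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# model.fit(x_train, y_train, epochs=5, batch_size=64)  # assuming one-hot labels
```

Writing the equivalent directly in TensorFlow means managing the variables, the graph and the training loop yourself.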
"To help more developers embrace deep-learning techniques, without the need to earn a Ph.D". Oh, good, I can do this.
"These fundamental concepts are taken for granted by many, if not most, authors of online educational resources about deep learning". Yup, true with this one as well.
Of course, 9 lines is a little dense, even with numpy. In practice, I got more understanding out of the slightly longer version that clocks in at 74 lines including comments and empty lines. This is an enormously simple neural network: a single layer with just 3 neurons. My son described its intelligence as being less than a cockroach after it'd been stepped on.
It works, though. It's able to accurately guess the correct response for the trivial pattern it's given, and you can follow the logic through so you understand each simple step in the process. In a follow-up blog post, there's a slightly smarter neural network with a second layer and a mighty 9 neurons.
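For anyone who doesn't want to click through, the single-layer version is roughly this kind of thing (a sketch along the same lines, not the post's exact code):

```python
import numpy as np

# Training data: the output happens to be just the first input column.
X = np.array([[0, 0, 1],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

np.random.seed(1)
weights = 2 * np.random.random((3, 1)) - 1   # one weight per input

for _ in range(10000):
    output = sigmoid(X @ weights)            # forward pass
    error = y - output                       # how wrong are we?
    # backprop for a single layer: error scaled by the sigmoid's slope,
    # then accumulated onto the weights
    weights += X.T @ (error * output * (1 - output))

print(output.round(3))   # close to [0, 1, 1, 0]
```

Every piece of that is inspectable, which is exactly what makes it a good first network.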
These examples are very approachable; it's about as simple a neural network as you can get. If you're new to machine learning, understanding how it works helps illuminate the more sophisticated networks described in Martin Görner's presentation.
Perhaps somebody here can help me with a side project that I'm working on. I'm trying to figure out the topology of a neural network that is capable of detecting the location and orientation of a given object. Say, a wrench. I don't want to use heatmaps (e.g. [1]) because they give just the location of the object and not the orientation. So the problem is basically how to choose the output quantities and how to encode them. The x and y coordinates of the head of the wrench could be quantities, but how to encode them? Should I use multiple output neurons per coordinate? And encoding the orientation is a similar problem. Would it even make sense to decompose the output in this way? Thanks in advance!
PS: More generally, is there a guide that explains how to robustly encode real numbers as output of neurons? I've tried to search for it, but couldn't find it.
There are algorithms for stuff just like that in OpenCV. Maybe you could find some inspiration or clarification by reading through the source code for those algorithms?
But I was hoping for a more scientific answer. Like how do researchers approach this problem typically? And is there a strong consensus in this area among researchers?
Hm, there are problems in the field of computer vision that might help structure your cost/training algorithm; maybe pose estimation, the perspective-n-point problem, and point-set registration.
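On the encoding question itself: one approach I've seen (I can't claim it's the consensus) is to regress x and y directly, normalized to [0, 1], and to represent the orientation as (sin θ, cos θ) so the network never has to jump across the 0°/360° wrap-around. A Keras sketch of that output head (the architecture is an arbitrary placeholder, not a recommendation):

```python
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense

inp = Input(shape=(128, 128, 1))
h = Conv2D(16, (3, 3), activation='relu')(inp)
h = MaxPooling2D((2, 2))(h)
h = Conv2D(32, (3, 3), activation='relu')(h)
h = MaxPooling2D((2, 2))(h)
h = Flatten()(h)
h = Dense(64, activation='relu')(h)

xy = Dense(2, activation='sigmoid', name='position')(h)      # x, y normalized to [0, 1]
angle = Dense(2, activation='tanh', name='orientation')(h)   # sin(theta), cos(theta)

model = Model(inputs=inp, outputs=[xy, angle])
model.compile(optimizer='adam', loss='mse')
# Targets: position -> [x/width, y/height], orientation -> [sin(theta), cos(theta)].
# At inference time, recover the angle with atan2(sin_pred, cos_pred).
```

Single output neurons per real number (rather than a bank of neurons per coordinate) work fine as long as you normalize the targets; the sin/cos pair is just there to keep the angle continuous.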
Nope, look at the videos. I try to give as much background information as possible within 3h. The goal, on the contrary, is to help you understand the basics so that you can build on a solid foundation rather than slappin' layers together! Not that slappin' layers'n'shit together is against my religion or anything though - sounds like fun actually :-)
This is probably the most effective 3 hours I have spent trying to get my head around TensorFlow (and NNs in general somewhat). Hell, I even get what CNNs and RNNs -are- now.
Fancy math is useful for explaining why it works.
But this sort of content is good for explaining to engineers -how- it works. Which is ultimately how I need to understand things before the why is interesting to me.
I found Siraj Raval has a great YouTube channel [1] about these topics, also in a "without a Ph.D" style; he explains dense topics in a fun way! (maybe not for everyone) He also has practical videos [2] on building things from scratch (in Python) to better understand the basic concepts.
I watched a version of this course a few weeks ago and it has cleared up a lot of things for me. Martin doesn't waste time on basic concepts and covers a lot of ground in 3 hours. It's probably the tutorial I learned the most from so far.
Python is one of the simplest languages to learn. I do this course as a hands-on lab with people who discover Python programming at the same time as they discover neural networks. I ask them to read this "Python 3 in 15 min" primer beforehand and they are good to go: https://learnxinyminutes.com/docs/python3/
There's a TensorFlow R port, but it requires setting up a working version of TensorFlow on Python. So once you have that set up, all the documentation, error messages etc. are for Python - so you might as well just use Python.
This is of course a matter of opinion, but objectively speaking, the most complex piece of math is a matrix multiply, which I re-explain anyway. This is an end-of-high-school level of mathematics.