This is a very, very convoluted way of understanding backprop.
If you want to understand it, I highly recommend writing a neural network starting from a basic perceptron (no hidden layer) or even implementing basic linear regression with stochastic gradient descent. Then work your way up to a network with one hidden layer.
Try to focus on the maths and avoid looking at code examples.
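That said, just to show how small the starting point is once you've worked the derivatives out on paper, here's a rough sketch (plain Python, my own naming) of linear regression trained with SGD. The two update lines are exactly the gradients you'd derive by hand:

    import random

    # Model: y_hat = w*x + b, per-sample loss L = (y_hat - y)^2.
    # Hand-derived gradients: dL/dw = 2*(y_hat - y)*x, dL/db = 2*(y_hat - y).
    def train(data, lr=0.01, epochs=100):
        w, b = 0.0, 0.0
        for _ in range(epochs):
            random.shuffle(data)          # "stochastic": visit samples in random order
            for x, y in data:
                err = (w * x + b) - y     # forward pass
                w -= lr * 2 * err * x     # step against the gradient
                b -= lr * 2 * err
        return w, b

    # Toy check: should recover roughly w = 3, b = 1.
    print(train([(x, 3 * x + 1) for x in range(-5, 6)]))

Swapping in the perceptron update, or adding a hidden layer, is then a fairly small change on top of this.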
This is a great book that breaks down the process step by step. I had my own neural nets working before I touched a framework, and it demystified what would have otherwise seemed like black magic.
The book I used to learn the basics many years ago was "An Introduction to Neural Networks", by Kevin Gurney. It's a bit outdated now, but I still consider it valuable.
I absolutely second the suggestion of working through the math of a shallow neural network to start.
I would also suggest that once you think you have a sufficient grasp of the steps, you try to implement them at a low level without frameworks, just math libraries. Personally, I found Octave (highly similar to MATLAB) to be a nice granularity tier for learning the process programmatically.
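To give a feel for that granularity, here's roughly what the framework-free version looks like with nothing but a math library (NumPy here, my own names; the Octave version reads almost line for line the same):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))                    # 200 samples, 2 features
    y = (X[:, :1] * X[:, 1:] > 0).astype(float)      # toy target: same sign -> 1

    W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
    W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)
    lr = 0.5

    for step in range(2000):
        # forward pass
        h = sigmoid(X @ W1 + b1)
        y_hat = sigmoid(h @ W2 + b2)
        # backward pass: chain rule, layer by layer (loss = 0.5*(y_hat - y)^2)
        d_out = (y_hat - y) * y_hat * (1 - y_hat)    # dL/d(output pre-activation)
        dW2, db2 = h.T @ d_out, d_out.sum(axis=0)
        d_hid = (d_out @ W2.T) * h * (1 - h)         # gradient pushed back through layer 2
        dW1, db1 = X.T @ d_hid, d_hid.sum(axis=0)
        # gradient-descent update (full batch here, for brevity)
        W2 -= lr * dW2 / len(X); b2 -= lr * db2 / len(X)
        W1 -= lr * dW1 / len(X); b1 -= lr * db1 / len(X)

Once every shape and every line of the backward pass makes sense at this level, moving to a framework mostly means letting it write the backward half for you.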
This. And https://ml-cheatsheet.readthedocs.io/en/latest/backpropagati... . I am currently doing MIT Micromasters in Data Science, specifically the Machine Learning & Deep learning course. Last week's assignment was to build a neural network from scratch and these two resources have been a lifesaver.
This book explains and executes every single line of code interactively, from low-level operations to high-level networks that do everything automatically. The code is built on the state-of-the-art performance operations of oneDNN (Intel, CPU) and cuDNN (CUDA, GPU). Very concise, readable, and understandable by humans.
Try reading the last two paragraphs of the article first; they won't make much sense unless (I suspect) you already understood the point the article was intended to convey, but you'll save some time being confused:
> By now it’s probably worth dropping the allegory: the “workers” in our story are models, which could be individual layers of a neural network, or even whole models. And the process we’ve been discussing is of course the backpropagation of gradients, which are used to iteratively update the weights of a model.
> The allegory also introduced Thinc's particular implementation strategy for backpropagation, which uses function composition. This approach lets you express neural network operations as higher-order functions. On the one hand, there are times when managing the backward pass explicitly is tricky, and it's another place your code can go wrong. But the trade-off is that there's much less API surface to work with, and you can spend more time thinking about the computations that should be executed, instead of the framework that's executing them. For more about how Thinc is put together, read on to its Concept and Design.
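For anyone curious what that function-composition strategy looks like in practice, here's a toy sketch of the pattern (my own code, not Thinc's actual API): each layer returns its output together with a callback that carries the gradient back, and composing layers just composes those callbacks in reverse.

    import numpy as np

    def linear(n_in, n_out, lr=0.1):
        rng = np.random.default_rng(0)
        W = rng.normal(scale=0.1, size=(n_in, n_out))
        b = np.zeros(n_out)
        def forward(X):
            Y = X @ W + b
            def backprop(dY):
                dX = dY @ W.T                 # gradient w.r.t. this layer's input
                W[...] -= lr * (X.T @ dY)     # apply the weight update in place
                b[...] -= lr * dY.sum(axis=0)
                return dX
            return Y, backprop
        return forward

    def relu():
        def forward(X):
            def backprop(dY):
                return dY * (X > 0)
            return np.maximum(X, 0), backprop
        return forward

    def chain(*layers):
        def forward(X):
            callbacks = []
            for layer in layers:
                X, cb = layer(X)
                callbacks.append(cb)
            def backprop(dY):
                for cb in reversed(callbacks):  # backward pass runs in reverse order
                    dY = cb(dY)
                return dY
            return X, backprop
        return forward

    # A two-layer model is literally a composition of functions.
    model = chain(linear(2, 8), relu(), linear(8, 1))
    X = np.random.default_rng(1).normal(size=(4, 2))
    Y, backprop = model(X)
    target = np.ones_like(Y)
    backprop(Y - target)    # push dL/dY for the loss 0.5*(Y - target)^2 back through the chain

(In a real library you'd likely separate the weight update out into an optimizer step; it's folded into the callback here only to keep the sketch short.)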
I like math, but ML is applied science; at some point it's an algorithm, so I don't really understand why math is required.
I'd rather read pseudo code than math equations.
I would just assemble all the bricks myself, read the code, and ask questions later.
I have problems with the scholastic method:
* fill students with theory
* tell them to execute the exercise
* hope they understood the material
* evaluate with an exam
* use grades and eliminate the ones who failed
This doesn't work for all students, or it works only for very, very "structured" students who manage to learn this way. A lot of people learn by doing, and reading/writing math is not going to help them. That's why a lot of developers want to read the code. Math is a language, but it's not always very well defined, and teachers often have their own notation and usage, which can be frustrating.
Code is well defined, and always checked and verified by a parser. Code is more reliable than mathematical notation.
It's perfectly understandable for developers to refuse to use math to do machine learning, when the machine learning they do is applied. Most of those developers want a crash course and to be operational as fast as possible. The theory is not always required, unless you want to do research and go further.
I have nothing against mathematics (I love maths), but for ML you need statistics and other simple things; you don't need to go very far into the maths. Leave optimization to people who want to optimize. Do things step by step. The first step is: let people understand and use machine learning.
Lots of people make use of relational databases without formally understanding set theory/normal forms. They just learn by example and hacking it. What you're seeing here with deep learning is the same situation.
Sure. And I didn't at all mean that people who aren't good at math should stop doing deep learning.
I just meant that it's weird to me that people seemingly tiptoe around using simple math to explain what is, at heart, simple math, and instead invent a whole new terminology and a dictionary of imprecise analogies to get by.
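Concretely, the "simple math" I have in mind is just the chain rule. In my own notation, for a two-layer composition y = f(g(x; W1); W2) with loss L:

    \frac{\partial L}{\partial W_2} = \frac{\partial L}{\partial y}\,\frac{\partial y}{\partial W_2},
    \qquad
    \frac{\partial L}{\partial W_1} = \frac{\partial L}{\partial y}\,\frac{\partial y}{\partial g}\,\frac{\partial g}{\partial W_1}

Backpropagation is just evaluating these factors starting from the loss and working backwards, reusing the shared terms along the way.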
Most people afraid of "math" only have exposure to the second kind. I figure it has more to do with finding those kinds of mechanical operations boring (they are), and hence assuming all of maths must be equally boring. If you're a software developer who enjoys abstractions, programming language theory, and solving problems, there is no way you would find math of the first sort boring. Alas, in high school pretty much all the math is of the boring mechanical kind. At least mine was, anyway. Even the first exposure to non-rigorous calculus is of the shitty plug-and-chug variety.
It's hard to go through life without developing a modest understanding of arithmetic and reading. You wouldn't even be able to check your receipts if school started with Peano's axioms. Consequently, numbers come first, as they have for millennia now. Which makes sense, because maths developed out of intuitive arithmetic.
As a matter of fact, my parents complain they never understood maths, because they were taught algebraic rules by rote: a * b = b * a. They can recite that; they just don't have a clue what to do with it.
Teachers just aren't good, and certainly not at maths. It's a struggle to teach billions the basic skills. There is no solution.
I also don't know any mathematician or physicist bad at arithmetic. Quite a few of them seem to delight in numbers and arithmetic puzzles.
So I really don't see how to avoid numbers and boring school work, nor how the former could have a positive effect.
My guess is because they were taught how to plug numbers into an equation and get an answer and didn't get taught math skills like algebraic manipulation.
This isn't a slight against people who aren't good at maths. I think the way the education system works is pretty broken.