I had a really good time adapting Karpathy's blog post to Python myself, but it didn't give me sufficient understanding, so I continued with [1], then [2], and finally deciphering [3].
Thanks partycoder! Points taken; will make some changes.
Regarding numerical gradients: I named them that way to differentiate them from analytical gradients, which leverage formulas from calculus. The "numerical" ones are computed as (f(x+h)-f(x))/h every time.
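For concreteness, here's a minimal sketch of that forward-difference estimate (the function name and the test function are just illustrative), compared against the analytical derivative:

    import math

    def numerical_gradient(f, x, h=1e-5):
        # Forward-difference estimate of df/dx at x: (f(x+h) - f(x)) / h
        return (f(x + h) - f(x)) / h

    # Analytical derivative of sin is cos, so the two should roughly agree
    print(numerical_gradient(math.sin, 1.0))  # ~0.54030
    print(math.cos(1.0))                      # 0.54030...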
Sadly, this CamelCase style is too entrenched in academia and is very often present in books that use Python but are written by academics, not professional Python developers.
Yeah! Another way of seeing it is that the derivative is a small (infinitesimal) perturbation around a region of interest:
Any input that isn't maximal is some finite distance away from the maximum, so any small enough perturbation of it won't change the maximum (thus it has zero derivative). If we perturb the entry that is maximal, though, the maximum changes proportionally to it (with proportionality constant 1), so the derivative is one for the maximal entry [0] and zero for all the others.
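You can sanity-check that claim with the same forward-difference trick as above (a small NumPy sketch; the function names are my own):

    import numpy as np

    def max_grad(x):
        # Claimed derivative of max(x): 1 at the argmax, 0 elsewhere
        g = np.zeros_like(x, dtype=float)
        g[np.argmax(x)] = 1.0
        return g

    def numerical_max_grad(x, h=1e-6):
        # Perturb each entry in turn and see how much max(x) moves
        g = np.zeros_like(x, dtype=float)
        for i in range(len(x)):
            xp = x.copy()
            xp[i] += h
            g[i] = (xp.max() - x.max()) / h
        return g

    x = np.array([0.3, 2.0, -1.5])
    print(max_grad(x))            # [0. 1. 0.]
    print(numerical_max_grad(x))  # ~[0. 1. 0.]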
---
[0] If there is more than one maximal entry, then any convex combination of weights over the maximal entries is a valid "derivative-like" operator (i.e. a subgradient).
[1] https://mattmazur.com/2015/03/17/a-step-by-step-backpropagat...
[2] http://peterroelants.github.io/posts/neural_network_implemen...
[3] https://iamtrask.github.io/2015/07/12/basic-python-network/