A friendly Introduction to Backpropagation in Python (sushant-choudhary.github.io)
170 points by sushantc on Nov 26, 2017 | 14 comments



I had a really good time adapting Karpathy's blog post to Python myself, but it didn't give me sufficient understanding, so I continued with [1], then [2], and finally deciphered [3].

[1] https://mattmazur.com/2015/03/17/a-step-by-step-backpropagat...

[2] http://peterroelants.github.io/posts/neural_network_implemen...

[3] https://iamtrask.github.io/2015/07/12/basic-python-network/


I'd like your opinions on this post that I wrote: https://blog.chewxy.com/2016/12/06/a-direct-way-of-understan...


Thanks apetrov - checking out these posts!


Some suggestions:

1) I would make a clearer distinction between the function declarations and the program output, e.g. by formatting the output differently, such as in gray.

2) Capitalization: it would help to capitalize "InvalidWRTargError" as "InvalidWrtArgError". This is a guideline in most coding standards. https://en.wikipedia.org/wiki/Camel_case#In_abbreviations

3) Better naming:

- "getNumericalForwardGradient": Are there non-numerical gradients?

- "applyGradientOnce": A function is applied once per invocation by convention.

Then it would be good if you formatted the code according to PEP 8, as it is the standard in Python.


Thanks partycoder! Points taken; will make some changes.

Regarding numerical gradients: I named it that to differentiate it from analytical gradients, which leverage formulas from calculus. The "numerical" ones are computed as (f(x+h) - f(x)) / h every time.
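For concreteness, here is a minimal sketch (my own illustration, not the article's code) of that finite-difference idea; the helper name numerical_gradient is made up:

    # Numerical gradient via the forward-difference formula
    # (f(x+h) - f(x)) / h with a small step h.
    def numerical_gradient(f, x, h=1e-6):
        return (f(x + h) - f(x)) / h

    # Example: d/dx of x**2 at x = 3 is 6; the estimate is 6 + h.
    print(numerical_gradient(lambda x: x * x, 3.0))  # ~6.000001

An analytical gradient would instead apply the closed-form derivative (here, 2*x) directly.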


Python community conventions are not camel case for functions...

forwardAddGate

would be

forward_add_gate

and return is not a function call...

return(max(x,y)) or return(x+y)

would be

return max(x, y) or return x + y

spaces around operators...

x + y not x+y

spaces around function args...

def foo(a, b) not def foo(a,b)

and when calling...

foo(1, 2) not foo(1,2)

https://www.python.org/dev/peps/pep-0008/
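As a hypothetical sketch, here are the gates above rewritten in that style (snake_case names, bare return statements, spaces around operators and after commas); forward_max_gate is my guess at the second gate's name, based on the return(max(x,y)) example:

    def forward_add_gate(x, y):
        return x + y

    def forward_max_gate(x, y):
        return max(x, y)

    print(forward_add_gate(1, 2))  # 3
    print(forward_max_gate(1, 2))  # 2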

Just things to think about when publishing Python code for the greater community.


There's a package that verifies PEP8 for you. https://pypi.python.org/pypi/pep8


I see the difference now, thanks for the clarification.



Sadly, this camelCase style is deeply entrenched in academia and very often shows up in books that use Python but are written by academics rather than professional Python developers.


Something I was wondering about lately: how can we back-propagate through a max-pooling layer in a neural network?

https://datascience.stackexchange.com/questions/11699/backpr...


Yeah! Another way of seeing it is that the derivative measures the response to a small (infinitesimal) perturbation around a region of interest:

Any input that isn't maximal sits some finite distance below the maximum, so any small enough perturbation of it won't change the maximum (thus it has zero derivative). If we perturb the maximal entry, though, the maximum changes proportionally to it (with proportionality constant 1), so the derivative is one for the maximal entry [0] and zero for all the others.

---

[0] If there is more than one maximal entry, then any convex combination over the maximal entries is a valid "derivative-like" operator (i.e., a subgradient).
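A minimal NumPy sketch of that rule (my own illustration, not from the article): route the upstream gradient to the argmax and send zero everywhere else.

    import numpy as np

    def max_forward(x):
        idx = np.argmax(x)  # remember which entry was maximal
        return x[idx], idx

    def max_backward(upstream_grad, x, idx):
        grad = np.zeros_like(x, dtype=float)
        grad[idx] = upstream_grad  # proportionality constant 1
        return grad

    x = np.array([1.0, 5.0, 3.0])
    out, idx = max_forward(x)
    print(max_backward(1.0, x, idx))  # [0. 1. 0.]

With ties, np.argmax picks the first maximal entry, which is one valid subgradient choice per [0].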


Intuitive explanation of backpropagation from first principles, with a simple Python implementation.


Superlatives in this phrase: intuitive, simple.

These should ideally be determined by the reader, not the author.



