At the risk of being pedantic... if you're not familiar with data/math work in Python, the `np` name refers to "numpy", an extension to Python that provides array and matrix math... so the OP really needs 12 lines, the first being `import numpy as np`.
No, definitely not pedantic. In fact, I have only just now realized that "numpy" means "NumPy," instead of being a nonsense word that rhymes with "lumpy." Sigh.
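For reference, here is a minimal sketch of the kind of network being discussed (written in the spirit of the OP's two-layer numpy example, not a verbatim copy; the 10000-iteration count and the implicit step size of 1 are assumptions):

import numpy as np

X = np.array([[0,0,1],[0,1,1],[1,0,1],[1,1,1]])  # input dataset: 4 samples, 3 features
y = np.array([[0,0,1,1]]).T                      # target outputs
syn0 = 2 * np.random.random((3,1)) - 1           # weights initialized with mean 0

for _ in range(10000):
    l1 = 1 / (1 + np.exp(-np.dot(X, syn0)))      # sigmoid forward pass
    l1_delta = (y - l1) * l1 * (1 - l1)          # error times sigmoid slope
    syn0 += np.dot(X.T, l1_delta)                # weight update

print(l1)  # outputs should approach y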
Just for comparison, this is almost the same thing using a library [1]:
import numpy as np

# same toy dataset as the numpy-only version: 4 samples, 3 input features
X = np.array([[0,0,1],[0,1,1],[1,0,1],[1,1,1]])
y = np.array([[0,0,1,1]]).T

# note: old Keras 0.x-style API, i.e. Dense(input_dim, output_dim, ...) and nb_epoch
from keras.models import Sequential
from keras.layers.core import Dense

# a single sigmoid unit: 3 inputs -> 1 output
model = Sequential([Dense(3, 1, init='uniform', activation='sigmoid')])
model.compile(loss='mean_absolute_error', optimizer='sgd')
model.fit(X, y, nb_epoch=10000)  # 10000 training passes
model.predict(X)                 # predictions should approach y
Thanks for this. It is really well explained. However, it did feel like it went downhill at the part where you explain how the updating happens; it was less of a walkthrough than the previous sections.
Also, lines 22 and 23 in the three-layer network are not intuitive for a beginner. Before, syn0 was (3,1) and now it's (3,4), and the new syn1 is (4,1) (see the shape sketch below). Your previous explanation for this initialization was much better.
But again, awesome work, just being nitpicky. Really appreciate it!
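Since the shape question comes up above, here is a small sketch of why those initializations look the way they do, assuming the three-layer example uses 3 input features, a 4-unit hidden layer, and 1 output unit (the layout the (3,4) and (4,1) shapes imply):

import numpy as np

n_in, n_hidden, n_out = 3, 4, 1

# syn0 connects the 3 inputs to the 4 hidden units, hence shape (3, 4);
# syn1 connects the 4 hidden units to the single output, hence shape (4, 1).
syn0 = 2 * np.random.random((n_in, n_hidden)) - 1
syn1 = 2 * np.random.random((n_hidden, n_out)) - 1

X = np.array([[0,0,1],[0,1,1],[1,0,1],[1,1,1]])
l1 = 1 / (1 + np.exp(-X.dot(syn0)))   # hidden activations, shape (4 samples, 4 units)
l2 = 1 / (1 + np.exp(-l1.dot(syn1)))  # output, shape (4 samples, 1 unit)
print(l1.shape, l2.shape)             # (4, 4) (4, 1)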
Great work! I have been reading a practical guide to neural networks [1], and I am amazed that it is so comprehensible. I had assumed ML would be a lot more complicated. I won't call it an easy craft, but (maybe) it isn't as hard as it looks.
I wonder if it'd work better with a leaky ReLU. Somebody wanna try? It's a _very_ simple mechanism. [1] Basically, instead of a sigmoid, take the pre-activation a = z·w for inputs z and weights w, and output f(a) = a if a > 0 else 0.01·a. Sigmoid activations have several problems; most notably, they suffer from vanishing gradients in bigger networks.
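A sketch of that experiment, swapping the sigmoid for a leaky ReLU in a tiny numpy network like the OP's two-layer one (the 0.01 negative slope is from the comment above; the learning rate, seed, and iteration count are assumptions):

import numpy as np

def leaky_relu(a):
    return np.where(a > 0, a, 0.01 * a)

def leaky_relu_deriv(a):
    return np.where(a > 0, 1.0, 0.01)

X = np.array([[0,0,1],[0,1,1],[1,0,1],[1,1,1]], dtype=float)
y = np.array([[0,0,1,1]], dtype=float).T

np.random.seed(1)
syn0 = 2 * np.random.random((3,1)) - 1   # same shape as the two-layer sigmoid version

lr = 0.1
for _ in range(10000):
    a = X.dot(syn0)                            # pre-activation z·w
    l1 = leaky_relu(a)                         # leaky ReLU instead of sigmoid
    l1_delta = (y - l1) * leaky_relu_deriv(a)  # error times activation slope
    syn0 += lr * X.T.dot(l1_delta)             # squared-error gradient step

print(leaky_relu(X.dot(syn0)))                 # should end up close to y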
Why do you say it is an excellent course? I started it, and so far I've found the professor uninspiring and the lectures lacking in preparation and detail.
Interesting fact about the sigmoid function: If you scale it so its slope at the origin is the same as that of the cumulative error function, it differs from that function by at most about 0.017.
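A quick numeric check of that claim, assuming "cumulative error function" means the standard normal CDF Phi(x) = (1 + erf(x/sqrt(2)))/2 (using (1 + erf(x))/2 instead only rescales the x-axis, so the bound comes out the same):

import numpy as np
from math import erf, sqrt, pi

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def gauss_cdf(x):
    return np.array([(1.0 + erf(v / sqrt(2.0))) / 2.0 for v in x])

# Phi'(0) = 1/sqrt(2*pi) and sigmoid'(0) = 1/4, so matching slopes at the origin
# means scaling the sigmoid's input by k = 4/sqrt(2*pi), about 1.596.
k = 4.0 / sqrt(2.0 * pi)
x = np.linspace(-8.0, 8.0, 160001)
gap = np.abs(sigmoid(k * x) - gauss_cdf(x))
print("max difference:", gap.max())   # comes out around 0.0177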
Well, I personally presumed that neural networks would require two or three orders of magnitude more code.
When it's revealed that the code is actually much more manageable, individuals (such as myself) take the plunge into the subject, knowing that it is something we can at least get through.