A Neural Network in 11 Lines of Python (2015) (iamtrask.github.io)
234 points by williamtrask on Oct 18, 2017 | 26 comments



3blue1brown is currently doing a terrific series on how neural networks work, which nicely complements this blog post. https://www.youtube.com/watch?v=aircAruvnKk&list=PLZHQObOWTQ...


I'm wondering: given a random truth table with N binary variables and 1 binary output, what (in the worst case) is the smallest network, in terms of number of parameters, that can learn it?
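
To make it concrete, here is the kind of experiment I have in mind (a minimal numpy sketch; N, the hidden width H, and the update scale are arbitrary knobs to sweep, not a claimed answer):

    # Enumerate all 2**N inputs of a random truth table, fit a tiny
    # 2-layer sigmoid net to it, and count the parameters.
    import numpy as np

    np.random.seed(0)
    N, H = 4, 8                                    # inputs, hidden units (arbitrary)
    X = np.array([[(i >> b) & 1 for b in range(N)] for i in range(2 ** N)], dtype=float)
    y = np.random.randint(0, 2, size=(2 ** N, 1)).astype(float)   # random truth table

    W1 = np.random.randn(N, H) * 0.5
    W2 = np.random.randn(H, 1) * 0.5
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

    for _ in range(10000):                         # plain full-batch gradient descent
        h = sigmoid(X @ W1)
        out = sigmoid(h @ W2)
        out_delta = (y - out) * out * (1 - out)
        h_delta = (out_delta @ W2.T) * h * (1 - h)
        W2 += h.T @ out_delta
        W1 += X.T @ h_delta

    print("parameters:", W1.size + W2.size)
    print("accuracy:", ((out > 0.5) == y).mean())

The question is then how small H (and hence the parameter count) can go before the accuracy drops, in the worst case over random tables.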


For a 100% success rate, it would have to be of the size of the minimal BDD. If you can allow for some errors, this becomes an interesting problem.


Yes, but that's probably the theoretical minimum.

What I'm asking about is the size required for a neural net to actually learn it through training, which may be different.


Can you define BDD, please?



This is technically true, but I wonder how close to 100% you could get with a minimal size. I would expect you could get above 98% with a network that's a few megabytes in size.


The author is currently writing this book:

https://www.manning.com/books/grokking-deep-learning


Having gone through this tutorial (which is great!) and several others, I'm curious what is a good second step for the casual neural network learner?

It sounds like there is a growing bag of tricks neural network researchers are discovering to make training practical and stable for large data sets.

One example would be using ReLU activation: whenever I play with it in a simple tutorial like this one, training seems to explode and fail much more frequently, so I'm guessing I'm either missing another step people use, or there are extra constraints on the initial conditions?
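
For reference, this is the kind of setup I mean, with the two tweaks that seem to tame it for me: a smaller, He-style init scale and a much lower learning rate than the sigmoid version tolerates (the exact alpha is just something that happened to work, and whether these are "the" missing step is exactly what I'm unsure about):

    # Same toy problem as the post, but with a ReLU hidden layer.
    import numpy as np

    np.random.seed(1)
    X = np.array([[0,0,1],[0,1,1],[1,0,1],[1,1,1]], dtype=float)
    y = np.array([[0],[1],[1],[0]], dtype=float)

    alpha = 0.05                                   # arbitrary; the sigmoid version uses ~1
    W0 = np.random.randn(3, 8) * np.sqrt(2.0 / 3)  # He-style init scale: sqrt(2 / fan_in)
    W1 = np.random.randn(8, 1) * np.sqrt(2.0 / 8)

    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

    for _ in range(20000):
        l1 = np.maximum(X @ W0, 0)                 # ReLU hidden layer
        l2 = sigmoid(l1 @ W1)                      # sigmoid output keeps predictions in (0, 1)
        l2_delta = (y - l2) * l2 * (1 - l2)
        l1_delta = (l2_delta @ W1.T) * (l1 > 0)    # ReLU gradient: 1 where active, else 0
        W1 += alpha * l1.T @ l2_delta
        W0 += alpha * X.T @ l1_delta

    print(l2.round(2))                             # should approach [[0], [1], [1], [0]]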

Using a Gaussian activation in my tutorials has tended to be more stable and to converge much faster, but I assume there is a huge downside lurking somewhere to using a non-monotonic activation function?

What are the tricks of the trade that a weekend warrior should investigate?


I co-authored a blog post with my lab that has practical advice for debugging DNNs - some of these tips might be helpful to you? https://pcc.cs.byu.edu/2017/10/02/practical-advice-for-build...


Stanford CS231n: Convolutional Neural Networks for Visual Recognition [1]

The assignments are excellent and will have you implement a deep-ish network practically from scratch, including backprop, optimizers, tuning, etc.

[1]: http://cs231n.stanford.edu/index.html


You may want to check out Andrew Ng's Deep Learning Specialization over on Coursera. [1] One of the courses is specifically about hyperparameter tuning and another about structuring your project. There is a lot of practical information scattered across all the courses.

Yes, I'm taking the specialization and having a blast with it. :-)

[1] https://www.coursera.org/learn/neural-networks-deep-learning


Based on the limited amount of information, I'm assuming that by "training explodes" you mean that your gradient descent never reaches a local minimum. Try lowering your learning rate? You may be "stepping over" the minimum.
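
A tiny illustration of the "stepping over" effect on the simplest possible loss, f(w) = w**2 (the two learning rates are arbitrary picks on either side of the stable threshold):

    # Gradient descent on f(w) = w**2, whose gradient is 2*w.
    # Each step is w -= lr * 2*w, i.e. w gets multiplied by (1 - 2*lr):
    # for lr < 1 the iterates shrink toward the minimum at 0,
    # for lr > 1 every step overshoots by more than it corrects and w blows up.
    def descend(lr, w=1.0, steps=10):
        for _ in range(steps):
            w -= lr * 2 * w
        return w

    print(descend(lr=0.1))   # ~0.107: converging toward the minimum
    print(descend(lr=1.1))   # ~6.19: overshooting further with every step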


This is my APL and J implementation: https://github.com/ghosthamlet/ann.apl



If this blog post interests you, you may be interested in his forthcoming book: https://www.manning.com/books/grokking-deep-learning


Nice discussion :)



Quite a while ago; I'm not sure that's a problem. It's a pretty great write-up and worthy of posting again. For me, this was the first time seeing it.


You are both right: it is worthwhile to know that something has been posted before, so you can review the old comments if interested, and good content is always worth repeating.

In any case, reposting is allowed, for several good reasons that have been discussed in the past.


Indeed, I had not considered that. I think I was used to people saying that links shouldn't be posted twice. The extra context is in fact nice!


It's nice to have the context of past discussions.


Ah, very good point. I hadn't considered that. :)


After a year or so it's ok: https://news.ycombinator.com/newsfaq.html. Or if the story hasn't had major attention yet.


Perhaps a 2015 label would be appropriate?


D'oh, yes, added. Thanks!




