A Non-Mathematical Introduction to Using Neural Networks (heatonresearch.com)
54 points by Anon84 on June 9, 2010 | 10 comments



Hmm... how should I put it?

I don't see why anyone non-mathematical would want to use neural networks in the first place. There are lots of machine learning tools designed to work out of the box (decision trees, SVMs); neural networks give you greater flexibility, but they make it radically easier to shoot yourself in the foot if you don't understand the whole thing. And "the whole thing" involves a gobload of math, including multivariable calculus, bits of probability theory, and numerical optimization.

That said, it's probably OK if you just want to get your feet wet and see what it's all about. There are even decent libraries for Python: http://deeplearning.net/software/theano/ http://code.google.com/p/pynnet/

Matching tutorial: http://deeplearning.net/tutorial/


This tutorial is particularly troubling, as the two examples are poorly chosen at best and dangerous at worst. The very basics of neural networks are not very tough: it's just a black box that takes a set of real-valued inputs and learns to map them to a set of real-valued outputs. Most people will never get a neural network to function well in a data mining context due to over-fitting. My issues with the two specific examples are:

1) Image recognition is done in many different ways, but if you're using neural networks, why not take advantage of the geometry of the layout while simultaneously removing the need to learn about things like hidden node layers? Use HyperNEAT: http://eplex.cs.ucf.edu/hyperNEATpage/HyperNEAT.html

2) Financial data mining is potentially very dangerous. It's easy for people to create models that look very strong in back-testing but are actually just fitting to the relatively limited history of samples. Some people may be foolish enough to think they should take that signal and trade it with their own money-- yikes!

A much better introduction to ANNs would be in a reinforcement learning context, where you aren't worried about over-fitting as much. Mat Buckland's book does a pretty good job of covering these topics: http://www.amazon.com/Techniques-Programming-Premier-Press-D...


I don't think you need to know all the math behind NNs in order to learn how to use one successfully. I certainly don't, and I've been using the Encog library (by Heaton) for a while with great success. And you definitely do not need to be a mathematician to use one.

Yes, there are some general principles and concepts you need to know/understand, but nothing particularly deep.

Edit: Actually, I'd highly recommend Encog: it's very easy to use, it's extremely well documented, and for those of us who do not have a formal education in NNs it's a great place to get started. http://www.heatonresearch.com/encog


I'm not sure you've actually addressed his concern. The ability to use one (in an engineering sense) and get seemingly good results is separate from the issue of fully utilizing them. You are likely better off using a much more hands-off tool that requires little knowledge of the underlying mechanism (to his point, decision trees and SVMs).


The mathematical explanation is surprisingly simple. A two-layer neural network consists of two matrices A and B and an activation function f. The network computes the following function, where in and out are vectors:

    out = f(A*f(B*in))
where f is applied to each element of a vector. You can see how this generalizes to more layers.
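
For concreteness, here's a minimal numpy sketch of that formula (tanh here is just one common choice of f, not the only one):

    import numpy as np

    def f(z):                 # activation, applied element-wise
        return np.tanh(z)

    def forward(A, B, x):     # computes out = f(A*f(B*in))
        return f(A @ f(B @ x))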

Neural network learning algorithms are given (in, out) pairs and try to find A and B that minimize the mean squared error, for example by using stochastic gradient descent.
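
And a rough sketch of a single stochastic gradient descent step for the squared error, building on the forward pass above (the gradients fall out of the chain rule plus tanh'(z) = 1 - tanh(z)^2):

    def sgd_step(A, B, x, y, lr):
        # forward pass, keeping the hidden activations around
        h   = f(B @ x)
        out = f(A @ h)
        # backward pass for the squared error 0.5*||out - y||^2
        d_out = (out - y) * (1 - out**2)
        d_h   = (A.T @ d_out) * (1 - h**2)
        # gradient step on both weight matrices
        A -= lr * np.outer(d_out, h)
        B -= lr * np.outer(d_h, x)
        return A, B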


This is a good summary. Now, here's what you need to know to have it work:

* initialize weights to small random values (otherwise it will quickly get stuck in an undesirable local optimum)

* you need to decrease the learning rate of the stochastic gradient descent according to a 1/(n+n0) or similar schedule; otherwise there's a risk of the whole thing swinging back and forth

* it often helps to normalize the input values so that they have zero mean and a standard deviation of 1; at the very least, input values shouldn't differ too much in scale

* if you're training something that should give yes/no answers, you don't really want to minimize the mean squared error, but rather a logistic loss

Some of these parts are usually built into your NN toolkit, so you don't have to worry about them. Others (e.g., choosing a good learning rate) are not, in which case you're just screwed if you don't understand what's happening and why.
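
To make the first three tips concrete, a sketch of how they might look in code, reusing sgd_step from the parent comment (n_in, n_hidden, n_out, X, and samples are made-up names; the 0.1 init scale and n0 = 100 are arbitrary constants you'd tune):

    rng = np.random.default_rng(0)

    # tip 1: small random initial weights
    A = rng.normal(0.0, 0.1, size=(n_out, n_hidden))
    B = rng.normal(0.0, 0.1, size=(n_hidden, n_in))

    # tip 3: normalize inputs to zero mean, unit standard deviation
    X = (X - X.mean(axis=0)) / X.std(axis=0)

    # tip 2: 1/(n+n0) learning rate schedule during SGD
    n0 = 100.0
    for n, (x, y) in enumerate(samples):
        lr = 1.0 / (n + n0)
        A, B = sgd_step(A, B, x, y, lr)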

But you're right, mathematically ANNs are quite simple (and, as it turns out, don't really behave much like actual neurons).


Could you provide a link to a mathematically sound introduction to neural networks?

I was always turned off because it was always explained in some magical manner...


If f is the logistic function, then neural networks basically correspond to logistic regression.

http://en.wikipedia.org/wiki/Logistic_regression

which has been around since the 1940s... (multilayer neural networks correspond to hierarchical logistic regression; just plug them together). This should be in any reasonable stats book.
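
To make the correspondence concrete (w, b, and x are just illustrative names):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # a single unit with logistic activation...
    def unit(w, b, x):
        return sigmoid(w @ x + b)

    # ...is exactly the logistic regression model for P(y=1 | x)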

If you mean mathematically sound in terms of learning, etc., then Chris Bishop's "Neural Networks for Pattern Recognition" is pretty good, full of sage advice and justification.


This is why neural networks were very popular for a while: they were sold to the public in a completely non-mathematical way.


The fact that they are called "neural networks" is testament to this :)



