I'm wondering: given a random truth table with N binary variables and 1 binary output, what is (in the worst case) the smallest network that can learn it, in terms of the number of parameters?
This is technically true, but I wonder how close you could get to 100% with minimal size. I would expect you could get above 98% with a network that's a few megabytes in size.
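A quick way to poke at this empirically at toy scale (plain numpy; every size and hyperparameter below is an arbitrary pick of mine, not something from the tutorial): enumerate the whole table for a small N, fit a one-hidden-layer net, and see what accuracy a given parameter count buys.

```python
# Rough experiment, not a proof: memorize a random truth table over N binary
# inputs with a one-hidden-layer MLP and report accuracy vs. parameter count.
import numpy as np

rng = np.random.default_rng(0)

N = 10        # number of binary inputs -> 2**N rows in the table
HIDDEN = 64   # hidden width; parameter count = HIDDEN*(N+2) + 1

# Enumerate all 2**N input rows and assign a random binary output to each.
X = np.array([[(i >> b) & 1 for b in range(N)] for i in range(2 ** N)], dtype=float)
y = rng.integers(0, 2, size=(2 ** N, 1)).astype(float)

# One hidden tanh layer, sigmoid output, plain full-batch gradient descent.
W1 = rng.normal(0, 1 / np.sqrt(N), (N, HIDDEN))
b1 = np.zeros((1, HIDDEN))
W2 = rng.normal(0, 1 / np.sqrt(HIDDEN), (HIDDEN, 1))
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(5000):
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # Backprop of binary cross-entropy through the sigmoid: dL/dz = p - y.
    dz2 = (p - y) / len(X)
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0, keepdims=True)
    dz1 = (dz2 @ W2.T) * (1 - h ** 2)   # tanh derivative is 1 - h^2
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0, keepdims=True)

    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

acc = ((p > 0.5) == (y > 0.5)).mean()
params = W1.size + b1.size + W2.size + b2.size
print(f"{params} parameters -> {acc:.1%} of the {2**N}-row table memorized")
```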
Having gone through this tutorial (which is great!) and several others, I'm curious: what is a good second step for the casual neural network learner?
It sounds like there is a growing bag of tricks neural network researchers are discovering to make training practical and stable for large data sets.
One example would be using ReLU activations: whenever I play with them in a simple tutorial like this one, training seems to explode and fail much more frequently, so I'm guessing either I'm missing another step people use, or there are some extra constraints on the initial conditions?
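For example, I've seen He initialization mentioned as exactly that kind of constraint on the initial weights for ReLU layers; a rough numpy sketch of what I mean, with made-up layer sizes:

```python
# He initialization: draw weights with variance 2 / fan_in so the activation
# scale stays roughly constant from layer to layer under ReLU.
import numpy as np

rng = np.random.default_rng(0)

def he_init(fan_in, fan_out):
    # Variance 2/fan_in (vs. 1/fan_in for tanh/sigmoid) compensates for
    # ReLU zeroing out roughly half the pre-activations.
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), (fan_in, fan_out))

def relu(z):
    return np.maximum(0.0, z)

x = rng.normal(size=(32, 784))              # fake input batch
h1 = relu(x @ he_init(784, 256))
h2 = relu(h1 @ he_init(256, 256))
print(h1.std(), h2.std())                   # both stay O(1) instead of shrinking or blowing up
```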
Using a Gaussian activation in these tutorials has tended to be more stable and to converge much faster, but I assume there is a huge downside lurking somewhere in using a function that isn't monotonically increasing?
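Concretely, by "Gaussian" I mean something along these lines, with the derivative I use for backprop:

```python
# f(x) = exp(-x^2) as the activation, plus its derivative.
import numpy as np

def gaussian(x):
    return np.exp(-x ** 2)

def gaussian_prime(x):
    # d/dx exp(-x^2) = -2x * exp(-x^2): changes sign at 0 and decays to
    # nearly zero for large |x|, so saturated units barely get a gradient.
    return -2.0 * x * np.exp(-x ** 2)

xs = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(gaussian(xs))
print(gaussian_prime(xs))
```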
What are the tricks of the trade that a weekend warrior should investigate?
You may want to check out Andrew Ng's Deep Learning Specialization over on Coursera. [1] One of the courses is specifically about hyperparameter tuning and another about structuring your project. There is a lot of practical information scattered across all the courses.
Yes, I'm taking the specialization and having a blast with it. :-)
Based on the limited amount of information, I'm assuming that by "training explodes" you mean that your gradient descent never reaches a local minimum. Try lowering your learning rate? You may be "stepping over" the minimum.
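You can see the "stepping over" effect on a one-parameter toy problem:

```python
# Gradient descent on f(w) = w**2. For this function the update is
# w <- w * (1 - 2*lr), so it converges for lr < 1.0 and diverges above it.
def descend(lr, steps=10, w=1.0):
    for _ in range(steps):
        w = w - lr * 2 * w      # gradient of w**2 is 2w
    return w

print(descend(lr=0.1))   # shrinks toward 0
print(descend(lr=1.1))   # |w| grows every step: the loss "explodes"
```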
You are both right. It is worthwhile knowing that something has been posted before, so that anyone interested can review the old comments, and good content is always worth repeating.
In any case, reposting is allowed, for several good reasons, which have been discussed in the past.