Ian Goodfellow is the inventor of GANs, which have been called the most interesting idea in ML of the last decade. I met him on reddit a few years ago. I was supposed to get private ML tutoring from him, right around the time Andrew Ng opened the first Coursera course. I didn't get lessons because I gave up and eventually took the MOOC. But it's amazing to know we share the same forums and sometimes exchange a comment or two.
The great idea behind GANs is that they replace one of the hardest parts of a neural net to get right - the loss function - with another neural net, making the loss function itself learnable. This opens the door to a kind of unsupervised learning that was impossible to make work before. GANs also matter because they are structurally very close to reinforcement learning (actor + critic in RL, generator + discriminator in GANs), and RL is supposed to be the path to AGI.
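To make that concrete, here's a minimal sketch of the idea (my own toy PyTorch example, not from the paper; the tiny architectures and the N(2, 0.5) "real" data are made up): the generator never sees a hand-written loss on the data, only the discriminator's verdict on its samples.

```python
# Toy GAN sketch (assumed example, not from the paper): the discriminator D acts as
# a learned loss for the generator G, so no hand-crafted loss on the data is needed.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))  # noise -> fake sample
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))  # sample -> "realness" logit

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(5000):
    real = torch.randn(64, 1) * 0.5 + 2.0      # made-up "real" data: N(2, 0.5)
    fake = G(torch.randn(64, 8))

    # discriminator step: learn to tell real from fake
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # generator step: its entire "loss" is the discriminator's opinion of its samples
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

with torch.no_grad():
    samples = G(torch.randn(1000, 8))
    # if training went well, these should drift toward mean ~2.0 and std ~0.5
    print(samples.mean().item(), samples.std().item())
```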
The most famous problems with GANs are training instability and mode collapse - the latter is like a student cramming for one specific exam rather than learning the material, optimising for the test instead of the real thing.
> The most famous problems with GANs are training instability and mode collapse - the latter is like a student cramming for one specific exam rather than learning the material, optimising for the test instead of the real thing.
I must confess I haven't worked with GANs yet, but isn't that the whole point of GANs? The student is optimising for the test while the teacher is learning how to make the tests as similar to reality as possible?
If I understand correctly, the main challenge is finding a way to let the teacher and the student (well, the generator and the adversary) learn at a similar rate, so that one doesn't stop learning because its competitor is too far ahead. Is that correct?
think about it this way: you (the generative model) are trying to predict a unit gaussian, which is just a fancy way to say bell curve. you get +1 if you predict a number that's likely under this distribution (eg 0.1 or -0.5, both within one standard deviation of the mean of 0); you get -1 if you predict a number that's "far" from this distribution (something like 40, which has an infinitesimally low probability of being drawn from a unit gaussian).
mode collapse, then, is when you predict 0 all the time. yes, you are technically correct, but you've failed to learn the true distribution.
obviously i've simplified this quite a bit and anthropomorphized the model, but i hope you get the gist. otherwise, the [original paper](https://arxiv.org/abs/1406.2661) is refreshingly easy to read.
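if you want to poke at it, here's a tiny numpy sketch of that point (toy numbers of my own, not from the paper): every individual output of the collapsed generator scores well under a unit gaussian, but the outputs as a whole have the wrong spread.

```python
# toy illustration of mode collapse (made-up numbers, not from the paper)
import numpy as np

rng = np.random.default_rng(0)
true_samples = rng.standard_normal(10_000)   # what a unit gaussian actually looks like
collapsed_samples = np.zeros(10_000)         # "generator" that predicts 0 every time

def log_likelihood(x):
    """per-sample log density under a unit gaussian"""
    return -0.5 * (x ** 2 + np.log(2 * np.pi))

# every collapsed sample is individually very plausible...
print("avg log-likelihood, true     :", log_likelihood(true_samples).mean())       # ~ -1.42
print("avg log-likelihood, collapsed:", log_likelihood(collapsed_samples).mean())  # ~ -0.92
# ...but the collapsed outputs don't form a unit gaussian at all
print("std, true     :", true_samples.std())       # ~ 1.0
print("std, collapsed:", collapsed_samples.std())  # 0.0, the spread was never learned
```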
> I didn't get lessons because I gave up and eventually took the MOOC
Udacity still hasn't gotten him on board. I took the DLF ND because of the tutoring they promised, did GANs as my first project so I'd be in the queue, and still graduated later with no mentoring sessions. So you didn't miss anything by dropping out. How were Ng's new lessons? Worth taking if I've already done DLF + fast.ai?
BTW, GANs' main use might be enabling almost fully unsupervised learning by extending small datasets with believable data.
> BTW, GANs' main use might be enabling almost fully unsupervised learning by extending small datasets with believable data.
I've wondered if dreams are basically this. Your brain uses its world-model-prediction subsystem to generate plausible inputs against which to train its action-generation-policy subsystem. Then, in real life, the action-generation-policy subsystem can react much more appropriately and quickly to real events.
Also, toddlers' stream-of-consciousness babbling when they first start talking: they narrate everything, and more than once I've wondered if it's essentially them generating their own verbal training data. When they start talking to themselves, their pronunciation, grammar etc. start improving much more rapidly.
Could you expand on that? The more I read from folks like LeCun & Chollet, the more they seem to disagree strongly. Just this week Yann posted about unsupervised modeling (with or without DL) being the next path forward, and described RL as essentially a roundabout way of doing supervised learning.
RL/DRL assumes the world is Markovian, i.e. the next state depends only on the current one and the rest of the past doesn't matter, which is way too simple. It requires a huge number of tries/episodes and a properly tuned exploration-exploitation ratio. It is loosely based on biological reinforcement learning, so there might be a basis in reality, as there is with convolutional neural networks and the visual field maps in the visual cortex (even if it's a very rough approximation). DRL is the technique that lets you model decisions; so for prediction you have CNN/RNN/FCN, for generation GANs, and for decisions DRL; together they are the closest thing to AGI we have right now.
Likely as value function approximators for one piece of the whole algorithm (as is the case with DQN/DDQN). However, the main algorithm likely uses a variation of the Bellman equation, which assumes the Markov property and gives strong guarantees about convergence.
If you're using DQN or pretty much anything in DRL, you don't have any guarantees about convergence in the first place, and using an RNN does give you the history summary you need (at least up to the minimum error achievable with that fixed-length summary), not that that is any more likely to converge than the overall DRL algorithm is.
I meant that under the Markov assumption, the value iteration used to solve the Bellman equation is guaranteed to converge. So it makes math people happy, even if that property holds neither in the real world nor in the problem they're trying to solve, and the "deep" in DRL is just heuristics, though it surprisingly works in many cases.
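For concreteness, here's roughly what that guarantee looks like on a made-up 3-state MDP (the transition and reward numbers are mine, purely for illustration): because transitions depend only on the current state, the Bellman backup is a contraction and value iteration settles to a fixed point.

```python
# value iteration on a tiny made-up MDP (toy transition/reward numbers, just for illustration)
import numpy as np

gamma = 0.9
# P[a, s, s'] = probability of landing in state s' after taking action a in state s
P = np.array([
    [[0.8, 0.2, 0.0], [0.0, 0.8, 0.2], [0.1, 0.0, 0.9]],  # action 0
    [[0.2, 0.8, 0.0], [0.2, 0.0, 0.8], [0.0, 0.1, 0.9]],  # action 1
])
R = np.array([[0.0, 0.0, 1.0],   # R[a, s] = immediate reward for action a in state s
              [0.1, 0.0, 0.5]])

V = np.zeros(3)
for sweep in range(1000):
    Q = R + gamma * (P @ V)      # Bellman backup: needs only the current state, not the history
    V_new = Q.max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-6:
        print(f"converged after {sweep} sweeps, V = {V_new.round(3)}")
        break
    V = V_new
```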
that is true: popular rl techniques (eg policy gradients) are very similar to "vanilla" supervised learning techniques and architectures, but they are unsupervised in the sense that they require zero human input.
alphago zero is the canonical example of tabula rasa machine learning.
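here's a rough sketch of what i mean (my own toy example, nothing to do with alphago's actual training code): a REINFORCE update is basically a cross-entropy loss on the actions the agent itself took, weighted by the returns, with no human labels anywhere.

```python
# toy REINFORCE step (my own sketch): return-weighted cross-entropy on the agent's own actions
import torch
import torch.nn as nn
import torch.nn.functional as F

policy = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))  # state -> action logits
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def reinforce_step(states, actions, returns):
    """states: (T, 4) floats, actions: (T,) ints, returns: (T,) discounted returns."""
    log_probs = F.log_softmax(policy(states), dim=-1)
    chosen = log_probs[torch.arange(len(actions)), actions]
    loss = -(returns * chosen).mean()   # looks just like a weighted NLL / cross-entropy loss
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# fake rollout data, only to show the shapes; a real loop would collect these from an environment
states = torch.randn(10, 4)
actions = torch.randint(0, 2, (10,))
returns = torch.randn(10)
print(reinforce_step(states, actions, returns))
```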