Game Theory Reveals the Future of Deep Learning (medium.com/intuitionmachine)
200 points by curiousgal on Jan 15, 2017 | 67 comments



Adversarial networks can be viewed as different agents competing against each other. But beyond that, I don't see how game theory is relevant or what insights it provides. This article is super vague and confusing to me.

>Where do these regularizations come from and how can we select a good regularization? How do we handle impartial information? This is where a game theoretic viewpoint becomes important. Generalization is sometimes referred to as ‘structural risk minimization’.

Ok... and what useful insight do the game theorists have here? He doesn't elaborate at all. Regularization is already a very well-studied concept, and I don't see how it is connected to game theory; regularization is all about priors and Occam's razor.


I believe that the key point here is that as adversarial networks compete against each other repeatedly they eventually will converge on a Nash Equilibrium.

Nash Equilibrium from Wikipedia: "Amy is making the best decision she can, taking into account Phil's decision while Phil's decision remains unchanged, and Phil is making the best decision he can, taking into account Amy's decision while Amy's decision remains unchanged."

At the point where this happens, a stable solution exists, which is almost like saying that we have made the most rational decision given the information we had. Nash's novel contribution was to say that if the participants are rational they will converge on an equilibrium and not get lost in endless recursion. Endless recursion being something like the participants endlessly second-guessing each other's moves... if Amy knows Phil knows Amy knows Phil knows Amy knows Phil... Obviously it would be problematic for DL to plunge into infinite recursion.

I like to think of the Nash Equilibrium as a balanced teeter-totter. When the teeter-totter becomes unbalanced, rational systems will relearn to balance the odds again based on the properties of risk (what Nash referred to as utility, I believe). Once they do so, they once again converge on an equilibrium.
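
For a concrete textbook illustration (my own example, not from the article) of "no player can gain by changing their decision alone," take the prisoner's dilemma with Amy and Phil as the players (payoffs are years of prison, written as negatives):

                     Phil: cooperate   Phil: defect
  Amy: cooperate        (-1, -1)         (-3,  0)
  Amy: defect           ( 0, -3)         (-2, -2)

(Defect, Defect) is the unique Nash equilibrium: holding Phil's choice fixed, Amy cannot do better by switching, and vice versa, even though (Cooperate, Cooperate) would leave both better off.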


The problem with the statement "as adversarial networks compete against each other repeatedly they eventually will converge on a Nash Equilibrium" is that they often won't, and a big part of training GANs is ensuring that they converge.

Intuitively, what can happen is that one of the problems is simpler to learn than the other, and when one of the "players" becomes overwhelmingly good, the other part of the network stops receiving useful feedback and is unable to find a direction for improvement, so learning stops. A "sufficiently smart" method would be able to reach a Nash equilibrium even in this case, but current GAN methods cannot, so you need to take steps to ensure that this doesn't happen - e.g. extra normalization or less training for the "simpler to learn" component.

It's not enough to train the "adversarial" networks to compete during each inference; on a meta-level you must ensure a form of cooperation so that they are effective teachers for each other during the learning process. For a real-life analogy, there's a difference between effective behavior during combat and effective behavior during sparring.
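
Concretely, the "train the stronger player less" remedy often looks something like the following. This is only a minimal sketch assuming PyTorch; G, D, opt_g, opt_d, data_loader and latent_dim are hypothetical stand-ins for your generator, discriminator, their optimizers, and your data, and D is assumed to output a probability per example:

  import torch
  import torch.nn.functional as F

  D_STEPS_PER_G_STEP = 1   # lower the effective ratio if D overwhelms G

  for real in data_loader:                      # hypothetical data loader
      # discriminator step(s): push D(real) toward 1 and D(fake) toward 0
      for _ in range(D_STEPS_PER_G_STEP):
          z = torch.randn(real.size(0), latent_dim)
          fake = G(z).detach()                  # don't backprop into G here
          ones = torch.ones(real.size(0), 1)
          zeros = torch.zeros(real.size(0), 1)
          d_loss = F.binary_cross_entropy(D(real), ones) + \
                   F.binary_cross_entropy(D(fake), zeros)
          opt_d.zero_grad(); d_loss.backward(); opt_d.step()

      # generator step: try to make D label fresh fakes as real
      z = torch.randn(real.size(0), latent_dim)
      g_loss = F.binary_cross_entropy(D(G(z)), torch.ones(real.size(0), 1))
      opt_g.zero_grad(); g_loss.backward(); opt_g.step()

If the discriminator is the "simpler to learn" component, you cap or reduce its updates relative to the generator's so it keeps providing useful gradients instead of saturating.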


Yeah, makes sense. It seems like when a network stops providing useful adversarial feedback, that in and of itself is a learning experience for the smarter network. While it didn't reach an equilibrium, it now knows it's at least smarter than the other guy. I feel like it should be able to use that experience of winning to beat the next guy.

It seems like for AI a perfect equilibrium wouldn't be easy to reach; it's often hard for the human brain to reach one, after all. It'd be the fact that an equilibrium exists, though, and that the network is trained to find it, that generates the knowledge along the way. Kind of a "journey, not the destination" learning method. I'm just theorizing, though.


Doesn't the caveat at the end of your citation from Wikipedia prevent endless recursion? Both players assume that the other player will not change their decision.


The article is weird but also interesting to me at the same time.

My background is in more traditional statistical modeling. The adversarial networks paradigm is really fascinating to me because it's a new way to think about model-fit problems. It may turn out to be equivalent to some traditional model-fit approaches in statistics, but those are usually not framed as competitive, which seems to open up a new way of thinking about the problem.

At some point in the author's discussion of game theory, I began thinking that if DL does go that route, it will eventually rediscover Bayesian statistics, which can be derived from decision-theoretic principles. It seems like many things in DL have parallels in traditional statistical theory, but are expressed in a mechanistic way.

I guess my broader point is that I had the opposite reaction to a lot of people: rather than wondering how game theory might be relevant, to me it seemed inevitable, given the important role it plays in basic derivations of major inferential paradigms. Those derivations usually leave the mechanisms of decision theory as a black box, and in DL the contents of that box are the focus.


> I don't see how game theory is relevant or what insights it provides.

The wiki entry on Nash equilibrium [1] and this recent preprint on improving GAN training [2] may help you understand the link between GANs and game theory. To me, this link seems plausible but very vague so far. Training GANs has not directly benefited from the mathematical foundations of game theory. Correct me if I am wrong.

[1]: https://en.wikipedia.org/wiki/Nash_equilibrium

[2]: https://arxiv.org/abs/1606.03498


If nothing else, I think the idea was initially important because it provided the insight to pit two networks against each other in a competition as an optimization strategy.

Now, since the competing networks can be formally characterized as a minimax two-player game, other theorems relating to these games may provide further insight into how to proceed.
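
For concreteness, the value function from the original GAN paper (https://arxiv.org/abs/1406.2661) is exactly such a two-player minimax game between the generator G and the discriminator D:

  min_G max_D V(D, G) = E_{x ~ p_data}[ log D(x) ] + E_{z ~ p_z}[ log(1 - D(G(z))) ]

D is rewarded for labeling real samples as real and generated samples as fake, while G is rewarded for making D misclassify its samples; a Nash equilibrium of this game is the point where neither network can improve by changing its own parameters alone.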


My background in game theory is from:

(1) careful study, teaching, and application of the detailed mathematics of linear programming and, in particular, its approach to proving the basic von Neumann saddle point theorem of game theory,

(2) applications of game theory to US strategic systems analysis, e.g., the Blue side positioning submarines and targeting multi-warhead nuclear missiles and the Red side allocating submarine search resources so that if Red shoots first Blue gets maximum expected target value destroyed,

(3) study of game theory from

Guillermo Owen, Game Theory, W. B. Saunders.

(4) study of game theory, including Nash's result with a nice proof by Lemke and Sion's result, in

T. Parthasarathy and T. E. S. Raghavan, Some Topics in Two-Person Games, ISBN 0-444-00059-3, American Elsevier.

With this background, I read the OP and found no clear connections with anything in game theory.


So is [1], for example, misleading?

[1] https://arxiv.org/abs/1406.2661


That paper doesn't mention game theory or the Nash equilibrium at all.

Perhaps you mean [2], which does. However, the article seems to misinterpret this: the fact that that system approximates the Nash equilibrium should be expected. Any system (including humans) that performs well in those types of games will, almost by definition.

[2] https://arxiv.org/pdf/1603.01121.pdf


It doesn't mention Nash equilibrium, but the whole approach they are discussing comes from the idea of pitting two networks against each other in a "minimax two-player game".

"We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models ... This framework corresponds to a minimax two-player game."

Edit: Regarding the paper you linked to. The fact that a Nash equilibrium is expected is an important part of why it's interesting to structure things this way: it's an optimization process guaranteed to stabilize (whereas other approaches can problematically diverge).

"When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged"


I think we agree?

The article claims (based on the paper I linked, and others) that:

What we see in these 3 players are 3 different ways game theory plays in Deep Learning. (1) As a means of describing and analyzing new DL architectures. (2) As a way to construct a learning strategy and (3) A way to predict behavior of human participants. The last application can make your skin crawl!

Of these claims, I think all are wrong.

One can certainly make the case that it is interesting to measure the performance of a system compared to the Nash equilibrium (when possible), but the author seems to think that the designers of the system are somehow using game theory to design the system.


They are in one of the cases pointed out, though not in the other two. The novel idea is to have two networks compete against each other in a two-player minimax game, developing progressively better strategies until something like a Nash equilibrium is reached--at which point you have one well-trained classifying network and one well-trained generating network.


I can't comment on "misleading" without reading, likely studying, the article you mentioned.

There's a lot to game theory, e.g., one I omitted, differential game theory as in a missile chasing an airplane.

But the likely most relevant, first cut, simple view is that each of the two players, Red and Blue, has a list of moves they can make. Then there is one (von Neumann case) or two (Nash case) payoff matrices. Then each of Red and Blue picks a move or a probability distribution of moves.

The broad idea is to have and use a saddle point, maybe available only by making moves probabilistically, in which case each of Red and Blue finds a probability distribution over the moves they can make and then picks a move according to that distribution, independently of everything else relevant.

So, just as in the classic children's game paper-scissors-rock, each player picks a move in secret, and then both players show their moves at the same time. Each player seeks to maximize their long-run expected gain. As can be shown easily from linear programming and its duality theory, for paper-scissors-rock the saddle point probability distributions are that each player picks each of the three moves paper, scissors, rock with probability 1/3. Of course, Red and Blue pick their moves independently and in secret. The matrix for paper-scissors-rock is 3 x 3, but the theory and the linear programming solution apply the same for positive integers m and n, with m moves for Red, n moves for Blue, and an m x n matrix. We can take all the components of the matrix to be >= 0 (adding a constant to every entry does not change the optimal strategies) and let them represent, say, a gain for Red and a loss for Blue.
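
For what it's worth, here is a minimal sketch (my own, assuming NumPy and SciPy) of computing Red's saddle point strategy for paper-scissors-rock via linear programming; this formulation treats the game value v as a free variable, so the matrix entries don't need to be shifted to be nonnegative:

  import numpy as np
  from scipy.optimize import linprog

  # Payoff to Red (rows: Red plays rock, paper, scissors; columns: Blue's moves).
  A = np.array([[ 0, -1,  1],
                [ 1,  0, -1],
                [-1,  1,  0]])

  # Variables: x1, x2, x3 (Red's mixed strategy) and v (game value); maximize v.
  c = np.array([0, 0, 0, -1])                  # linprog minimizes, so minimize -v
  A_ub = np.hstack([-A.T, np.ones((3, 1))])    # for every Blue move j: v <= sum_i x_i A[i, j]
  b_ub = np.zeros(3)
  A_eq = np.array([[1, 1, 1, 0]])              # probabilities sum to 1
  b_eq = np.array([1])
  bounds = [(0, 1), (0, 1), (0, 1), (None, None)]

  res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
  print(res.x[:3])   # roughly [1/3, 1/3, 1/3], with game value 0

By duality, Blue's optimal distribution solves the mirror-image problem, and the common value is the saddle point.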

The classic von Neumann result shows how to get a saddle point from a payoff matrix with one row for each move for Red and one column for each move for Blue. Again, we can get this result from linear programming duality theory.

The Nash result is a little more general.

Sion's result is really just a case of max-min in mathematical programming, i.e., optimization.

Yes, convexity can play an important role, but we would need to be clear on just what is convex. Convexity is central to Sion's result. Yes, the theory gets involved with some classic fixed point results.

So, for machine learning, for a start, is the game even roughly in the sense of classic von Neumann and paper-scissors-rock? If so, what are the moves and, then, the payoffs? Getting that far, are we looking for two probability distributions? Then with the distributions, how do we apply those to the machine learning calculations, iterations, optimizations, improvements, etc.?

Little questions like those.

So, to start to apply game theory, let's start with a problem and formulate it as a problem in some appropriate part of game theory.


The author is wildly extrapolating limited results ("Nash Equilibrium outcomes achieved by agents trained using deep reinforcement learning") to the entire field. Just because DRL agents approach a Nash equilibrium in a limited set of games does not mean that using Nash equilibria is somehow going to help design future algorithms. E.g., for several games such as Pong and Go, on which we know that DRL outperforms traditional methods, there is no tractable Nash equilibrium.

Not to mention the many other problems which come under the umbrella of deep learning, such as recognition, detection, and sequence-to-sequence prediction, where there is no "game theory" component.


My understanding is that the use of game theory in this context has nothing to do with the domain in which the network is being trained. It's not about creating game-playing AIs—it's about a 'game' between two neural networks, one of which competes by getting better at classifying, and the other by getting better at generating. That basic architecture is quite general—which I think is why you also see e.g. LeCun being excited about this direction.

To be fair, the article does a kind of weird thing in discussing three different ways game theory has shown up in the context of machine learning recently—but they're all separate ideas really.

Edit: a more precise description in the abstract here: https://arxiv.org/abs/1406.2661


> Not to mention the many other problems which come under the umbrella of deep learning, such as recognition, detection, and sequence-to-sequence prediction, where there is no "game theory" component.

Ah, but there is! GANs can be applied to all those problems.


The author's description of "approximation of Nash equilibrium" is a description of the use of contradiction in the time-evolution of the two internal components of GANs.


Game theory is a fun way to tell stories... but it is seldom predictive.

Conversely, machine learning seldom provides a story (at least in a human readable form) but it is predictive by nature.


This "post-hoc" nature of game theory bothers me a lot. People love the stories they can tell about things using game theory, but I have seen far fewer example of people understanding a new scenario using it.


Auctions are a good example of a domain where game theory (and reverse game theory, or mechanism design) often makes useful predictions and has practical influence. Google and Facebook use mechanisms that approximate the VCG allocation rule for some of their ad content and know that they should not be able to find massive improvements on this, due to theory. I agree that it is often less predictive and more like stories.
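
For intuition (my own toy illustration, not a claim about Google's or Facebook's actual systems), the VCG rule charges each winner the harm they impose on everyone else; in the single-item case it reduces to a second-price auction:

  def vcg_single_item(bids):
      # bids: dict mapping bidder -> bid value; returns (winner, price)
      ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
      winner = ranked[0][0]
      price = ranked[1][1] if len(ranked) > 1 else 0.0   # welfare the others lose
      return winner, price

  print(vcg_single_item({"a": 10, "b": 7, "c": 3}))   # ('a', 7)

The game-theoretic prediction is that truthful bidding is a dominant strategy under this rule, which is part of why such mechanisms are hard to improve on.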


Is this documented? I'd be curious to see some studies on it, if you know of any.



Game theory is part of the toolbox used to design, study and understand machine learning algorithms. It's currently the bleeding edge of machine learning.


Personally, I think information theory is going to end up being a lot more important to AI than game theory.


Well, here are a few predictions made by game theory.

1. There wasn't a nuclear war today

2. Every telco in the world gives large chunks of money to governments due to auctions

3. eBay sellers get higher prices than they should

Also, I can read association rules, decision trees and Lisp programs OK... and yet my machine learning programs (OK, OK, nowadays the ones I get out of various APIs and packages) have written them...


1 and 3 aren't predictions, but post-hoc explanations.

There is an argument that game theory helped lead to 1, I suppose. But, in general, I think that is giving them more credit than they would even ask for. Mutually assured destruction is far from their central theorem.

2 could just as easily be explained by standard supply/demand scenarios. There is a very limited supply of spectrum that is auctioned off to telcos, and they have a very vested interest in buying as much as they can, even if they are not going to utilize it.


What!!!

1) You can allocate credit how you like, but there's a dominant strategy for that too.

2) OK, for the UK case see

http://www.nuff.ox.ac.uk/users/klemperer/biggestpaper.pdf

And look them up.

Interestingly your position makes me wonder if you are actually an exec working in a senior finance position in a major telco?


I'm looking forward to reading that in more depth, but I am curious what you think my claim was.

I gladly admit I am using a layperson's view that the spectrum auctioned off to telcos is a) likely to just get bought up by entrenched actors, and b) unlikely to lead to any innovations in the marketplace.

Evidence that I am wrong on either of those points would be fun to read. My second claim, in particular, is one that I suspect is ill-founded.


EU economies auctioned off 3G spectrum at the same time. The ones with sloppily designed allocation rules raised far less money (controlling for GDP).


Yeah, that is what the linked paper was about. However, I'm not sure I care too much for the goal of raising more money. In some ways, I would view money raised as heat generated by a mechanism. That is, a necessary byproduct that is certain to correlate heavily with any outcome, but ultimately waste that should be minimized.

The cynic in me sees this as yet another way that economists have found to increase the wealth generation capabilities of valuable things, while at the same time losing sight of the utility that was there.


So the money either goes to government or the telcos' employees and shareholders. It's a valid goal for the government to attempt not to undersell the asset, but as you say it's primarily a transfer. If you want an example that changes more than market prices, have a look at Roth's work on matching for hospitals, schools and organs:

https://www.google.co.uk/url?sa=t&source=web&rct=j&url=http:...


Why the eBay one?


Winner's curse - everyone who wins an auction on eBay overpays.


eBay sellers overwhelmingly offer items with private value, as opposed to, say, Pork Bellies for resale, so the Winner's Curse does not apply.


Well, game theory can be used to create outcomes: https://en.m.wikipedia.org/wiki/Mechanism_design


Game theory has nothing to say about partially observable, stochastic, continuous environments and has no concept of the future.


> Game theory has nothing to say about partially observable, stochastic, continuous environments and has no concept of the future.

This is not true. What makes you think this way?

Game theory studies games with stochastic outcomes, adversaries, and partial observability. Bayesian Nash equilibrium, trembling hand equilibrium and correlated equilibrium are generalizations of Nash equilibrium into these domains. Game theory also extends to continuous environments and sequential games that have a concept of past and future.


What is the Nash equilibrium of Pong? Or Go? Or chess?

All the generalizations you cite are fancy economics methods which lose applicability at the first contact with complex real-world scenarios.

Game theory is like asking questions such as "Is chess a win for white?" [1] Sure, answering this question might be important theoretically, but it's of no use when designing an agent that plays and learns.

[1] https://en.wikipedia.org/wiki/Solving_chess


   lose applicability at the first contact
When thinking about complex domains, having a simplistic model is helpful for a variety of reasons: one is to establish a shared vocabulary to communicate with others; another is to study where/why/how the simplistic model fails to deal with the complex domain. This gives an indication of how to go about dealing with the complex domain, namely by selectively making the simplistic model more complicated until it is sufficiently rich.

One example of the power of simplicity is the habit of theoretical computer scientists of using simplistic devices such as Turing machines as models of computing and of conflating "effective computability" with the complexity class P and polynomial reducibility (in LOGSPACE). We know that this 'loses applicability at the first contact with reality', but it's currently our only approach to a meaningful theory of computational complexity, and it has already led to many deep insights, such as interactive proof systems and zero-knowledge proofs.


Hahahahaha reminds me of a saying about spherical cows.

If game theory were being used as an "explanation" of why deep learning outperforms other methods, then maybe it would have been fine. But we already have Statistical Learning Theory and PAC learning, which try to explain/prove bounds on machine learning models.

There is nothing wrong with having a theory but these days Deep Learning has become a bandwagon that any/everything gets related to. And I just don't see how Game Theory of all mathematical tools is applicable to Deep Learning.

Finally, the misguided theory-driven approach was exactly the reason why Deep Learning was so controversial initially. It turns out that making a theory like SLT and then deriving ML models like SVMs from it is a really bad idea. Deep Learning succeeded because the ML and vision communities adopted empiricism over the fanciest/longest proof with convex loss. So when someone goes around claiming game theory as the savior/future of Deep Learning, I find it perplexing.


Game theory gave us minimax, which is a major part of any top chess engine.

https://en.m.wikipedia.org/wiki/Minimax
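
For reference, a minimal sketch of the idea (with alpha-beta pruning, the standard refinement); legal_moves, apply, evaluate and is_terminal are hypothetical stand-ins for a real engine's move generator and evaluation function, not an actual engine API:

  def minimax(state, depth, alpha, beta, maximizing):
      # assumes hypothetical helpers: is_terminal, evaluate, legal_moves, apply
      if depth == 0 or is_terminal(state):
          return evaluate(state)
      if maximizing:
          best = float("-inf")
          for move in legal_moves(state):
              best = max(best, minimax(apply(state, move), depth - 1, alpha, beta, False))
              alpha = max(alpha, best)
              if alpha >= beta:       # the minimizing player will avoid this branch
                  break
          return best
      else:
          best = float("inf")
          for move in legal_moves(state):
              best = min(best, minimax(apply(state, move), depth - 1, alpha, beta, True))
              beta = min(beta, best)
              if alpha >= beta:
                  break
          return best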


The entire field of game theory grew out of von Neumann's work on analysis of how people play chess [1].

Quote: What should be noted here is that the format of his intended talks follows exactly the development of game theory up to 1928, beginning with Zermelo on chess, and culminating with von Neumann groping for a theory for three and more players: game theory was still the mathematics of parlor games

[1] http://elaine.ihs.ac.at/~blume/leonardjel.pdf


Sure, but my point is that the game theory approach does not help us in building better chess-playing agents. E.g., how color is used in photography might make a very interesting paper, but from the perspective of building an image recognition model it's irrelevant.


Yes, I agree.

Although "distance from Nash equilibrium" can be a useful metric to measure the performance of some (game playing) systems.


I believe an equilibrium would be achieved at a single point of state/time for any of those games you mentioned, the equilibrium being the best next move given what you know about your risk and your opponent.


> Game theory has nothing to say about partially observable, stochastic, continuous environments and has no concept of the future.

Untrue; game theory has plenty to say about such environments, though that goes well beyond the trivial level of exposure that most people who are aware of game theory have to it.


This post is in particular about utilizing certain concepts from game theory to build new architectures. For example, it talks about adversarial networks where the problem is reduced to finding Nash equilibria between competing models -- a concept that merges game theory with neural networks.

The post is not talking about making a new paradigm primarily based on game theory concepts.


Game theory does have things to say about partially observable, stochastic, continuous environments, and it does have a concept of the future, actually.


I'm guessing a lot of you have read Neuromancer; I think of generative-adversarial networks as being a lot like Neuromancer-Wintermute networks :)

Imagine Robert Axelrod's iterated artificial intelligence agents tournaments [0], reloaded; let's pit GANs with LSTM cells, deep-q-networks, style-transferring networks, random forest algos, MCMC algorithms, word2vec architectures, etc. against one another, if only to see possible new failure modes of tit-for-tat.

[0] https://en.wikipedia.org/wiki/Tit_for_tat#Background


The second half of your post reads like word salad made from headlines about ML. They're all words, but the fact that you thought they form a thought with semantic content makes me worried. We need fewer MBAs coked out of their mind meddling with this stuff.

The advancements that are interesting in ML mostly come from deep statistics and information theory and very sophisticated analysis, not combining the most popular black boxes that you read a medium post about.


I think you might be over-fitting a bit with your model there


One way to avoid overfitting is to trim spurious interactions W_ij -> 0. This constraint forces the network to economize on the neural circuitry, thereby driving the system to find an optimal network topology. Hypothetically this should favor certain circuit-design principles. I don't know how much work has been done on this in artificial neural networks, but quite a bit of work has gone into artificial gene regulatory networks, which use the same mathematical model of integrating weighted inputs and passing the result through some kind of sigmoidal/logistic function.
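
A minimal sketch of what "trim spurious interactions" can look like for an artificial network (my own illustration, assuming NumPy): penalize the magnitude of the weights W_ij during training and then zero out the ones that stay small, which economizes on circuitry:

  import numpy as np

  def l1_gradient_step(W, grad, lr=0.01, lam=1e-3):
      # one step on (task loss + lam * sum_ij |W_ij|); the L1 penalty shrinks weak links
      return W - lr * (grad + lam * np.sign(W))

  def prune(W, threshold=1e-2):
      # hard-zero the spurious interactions, fixing the network topology
      return np.where(np.abs(W) < threshold, 0.0, W)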


If you're curious what this is all about, I'd recommend jumping to this point in a video linked to in the article: https://youtu.be/IbjF5VjniVE?t=42m11s —Yann LeCun will begin saying, "the best idea ever ... is adversarial training."

The article is kind of confusing because it's discussing three different ways game theory has shown up in the context of machine learning recently—but they're all totally separate ideas.


A lot of advances in deep learning have come from providing more data and semi-structured information, but there are a finite number of fields to which this can apply. Using game-theoretic dynamics can move AI toward transfer of domain knowledge and can create more adaptable deep learning architectures. It could also be a pathway to "stronger" forms of AI.


> and (3) A way to predict behavior of human participants. The last application can make your skin crawl!

A lot of machine learning is about predicting human participants. Things like "winning" at ImageNet have devolved into predicting the exact labels humans will put on a subset of the ImageNet dataset (as the "easy" labels are solved and it now comes down to working out what most humans labelled the confusing images as).

If that makes the author's skin crawl, one would have to wonder what they think ML actually is.


Supervised learning is learning from pairs of input -> desired output.

Unsupervised learning is learning based on predicting future, present or past data from current data.

Both use immediate feedback.

Reinforcement learning is based on learning from dynamic datasets as opposed to static datasets.

Games are like dynamic datasets. Of course game theory has a lot to do with deep learning. If the signal you are learning is dynamic, it's related to game theory.


Reinforcement learning is not "learning from dynamic datasets as opposed to static datasets."

Reinforcement learning is learning from sequences of states/actions and a reward signal. The sequences could be totally deterministic and similar across different runs of the environment, and hence what I'm guessing you're calling "static" ("static" really has no meaning in the ML field). Or not. Go is totally deterministic, for example, but RL is still applied there.
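
A minimal sketch of that framing (tabular Q-learning, my own example, not from the thread): the data is a stream of (state, action, reward, next_state) transitions rather than labeled pairs, and the update bootstraps from the reward signal:

  import collections

  Q = collections.defaultdict(float)      # Q[(state, action)]
  alpha, gamma = 0.1, 0.99                # learning rate, discount factor

  def q_update(state, action, reward, next_state, actions):
      # actions: the moves available in next_state
      best_next = max(Q[(next_state, a)] for a in actions)
      Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

This works whether the environment is deterministic (Go) or stochastic; nothing about it requires the dataset to be "dynamic."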

Unsupervised learning is kind of vague, but it's supervised learning except the "desired output" (label) is generated from the input, or some desired statistic is inferred from the input. Inferring past/present or predicting future data is only a subset of the first case.


In what way does unsupervised learning "use immediate feedback"? It's unique in that it does not use reward or error feedback signals derived from labeled data!


> If the signal you are learning is dynamic, it's related to game theory.

I'm not really familiar with the literature in this area, so sorry if this sounds dumb. But what you're saying here seems to make sense and sounds almost profound - like there's a vast amount of research (game theory) that's immediately related in some way. Is there somewhere I can read more about the relationship?


The references made in the article are probably the best place to start.


I don't think I've ever seen unsupervised learning explained that way. The name comes from the fact that you are literally not supervising the algorithm, so it is typically not a good future-predicting tool.

Now, it can teach you something about your data, but it is then up to you to turn that into something that lets you make predictions about your data. (So, typically, cluster analysis.)

Right?


I think the argument for using physics models as constraints in a DL system was clearer: https://www.youtube.com/watch?v=5MdSE-N0bxs


Computing Equilibria in Multi-Player Games - http://theory.stanford.edu/~tim/papers/sym.pdf


The best explanations of this are Ian Goodfellow's talks (I think there are video transcripts online somewhere), including at NIPS this past year. He and a few others also have a new deep learning book that just came out this past month from MIT Press.


This adversarial approach sounds a lot like evolutionary computation.

The idea of making Artificial Neural Networks compete with each other based on a selection function has been around for many years. I remember learning about it during my undergrad studies 6 years ago.

This just sounds like a more refined version of that, using RNNs instead of ANNs.



