I believe we are witnessing the Cambrian explosion of intelligences. The techniques behind Libratus (abstraction algorithms and game theory [1]) appear to be qualitatively distinct from those behind AlphaGo (deep reinforcement learning and MCTS) and Deep Blue (search and heuristics).
An ecology of Artificial Intelligences, unbounded by our evolutionary history and neural architecture, could evolve to suit each particular task more effectively than our brains can.
Chess, Go, poker... they all feel like variations on the same theme. There is obviously real innovation happening, but I want to see something more challenging. Something with more dimensionality and integration of several types of input.
How about a machine that can beat someone at Smash Bros, a game with varied characters, complex comboing mechanics, and a nontrivial computer vision task?
Or--more difficult by a few orders of magnitude--what about a robot that can beat someone at tennis? Or a team of robots that can best a professional basketball team?
When do you suppose we'll begin to see these sorts of things? Within our lifetime, I hope?
Check out https://github.com/vladfi1/Phillip. It's a Super Smash Bros bot that can beat the top players in the world, driven by a TensorFlow model that you can train by playing against it for a few days.
>Something with more dimensionality and integration of several types of input.
like using a few degrees of freedom and little strength to chop onions and some vegetables, crack eggs, whisk them, pour a bit of oil into a pan and light the stove, pour in the omelette, flip it onto a plate, throw the eggshells in the trash unless it's too full (and take the trash out if it is), then wash the chopping board and pan with a sponge (adding a little dish soap) and not too much water, and rinse them thoroughly, again without too much water. None of this poses any challenge to most adult humans, who need just a bit of time to learn it (if you've never made an omelette and you're an average adult, by Wednesday you can make a perfect omelette every time). And that's even though humans objectively have very weak hands, see things very, very slowly compared with machines, and cannot do any single mechanical action as reliably and predictably as robots.
Computers might be able to find and count all the primes between one and a million before I can count the ones up to ten [1], but they can't even scrub my bathtub with a sponge given a whole afternoon to do it - not without a lot of specialized robotics, anyway.
We make up for hand weakness with hand dexterity, slow vision processing with incredibly rich feature sets and powerful cameras, and unreliable actions with planning and prediction on every task.
Yes, this is the kind of task computers should "compete with" - using weak and semireliable motor functions and cheap cameras (no infrared or laser vision etc) and making up for it with "smarts" - the way humans do.
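For contrast, the prime-counting half of that comparison really is trivial for a machine; a basic sieve handles it in milliseconds:

```python
def count_primes_below(n):
    """Sieve of Eratosthenes: count primes strictly less than n."""
    if n < 3:
        return 0
    sieve = bytearray([1]) * n
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            # Cross off every multiple of p starting at p*p
            sieve[p * p::p] = bytearray(len(sieve[p * p::p]))
    return sum(sieve)

print(count_primes_below(1_000_000))  # 78498
```

A few lines, a megabyte of memory, and done - while the bathtub stays dirty.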
This is not really a good example. That bot is massively cheating, and if you played against it a few times you would find very simple patterns that always beat it.
(I don't even think the base AI can actually see your location in the air very well, and it has no air DI itself so it really could not win for long.)
This one is a better example - https://github.com/vladfi1/phillip. It does not have such perfect reactions and is driven by a TensorFlow model that you can train in a few hours or days by playing against it. Although it plays weird, it plays a lot more like a human than the other bot. And yes, it is very, very good - it can pretty easily beat top-50 players.
Poker is somewhat different due to incomplete information and its highly stochastic nature. It's easy to infer incorrect action-payoff relationships from observing hundreds or even thousands of hands. Like the stock market, the cards can stay irrational longer than you can stay solvent.
There are no fundamental algorithmic differences between chess and poker. The incomplete-information aspect only increases the size of the game tree, as each node must now track all possible private states for each public state. Algorithms have been devised to speed convergence on such large trees (regret minimization, for example), but beyond that it's just a bigger game.
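Regret minimization is easy to see in miniature. Below is a toy sketch (not the CMU team's code) of regret matching, the building block behind counterfactual-regret methods, playing rock-paper-scissors against itself; the average strategy drifts toward the 1/3-each equilibrium:

```python
import random

# Payoff to the row player: 0 = rock, 1 = paper, 2 = scissors
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

def strategy(regrets):
    """Regret matching: play in proportion to positive accumulated regret."""
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1 / 3] * 3

def train(iterations=100_000, seed=0):
    rng = random.Random(seed)
    regrets = [[0.0] * 3, [0.0] * 3]
    strat_sum = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strats = [strategy(regrets[p]) for p in (0, 1)]
        acts = [rng.choices(range(3), weights=strats[p])[0] for p in (0, 1)]
        for p in (0, 1):
            me, opp = acts[p], acts[1 - p]
            for a in range(3):
                # Regret: what playing a instead would have gained
                regrets[p][a] += PAYOFF[a][opp] - PAYOFF[me][opp]
            for a in range(3):
                strat_sum[p][a] += strats[p][a]
    # The average strategy, not the current one, converges to equilibrium
    return [[s / iterations for s in strat_sum[p]] for p in (0, 1)]

avg = train()
print(avg[0])  # each probability approaches 1/3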
The fundamental difference is the way poker is played repeatedly, tracking winnings over a long period. Winning a single hand is meaningless.
Chess can be modeled with each game serving as a single observation. Poker must consider each player's entire lifetime as a single observation. Not simply a hand, but all hands that player has ever played, including what that player knew about all opponents ever faced. This quantitative increase in data creates a qualitative difference.
Poker is less like chess and more like repeated rock-paper-scissors.
This does not apply to solutions that are Nash equilibria. You are describing exploitative strategies, which the CMU team did not attempt to create. They created a game-theory-optimal strategy.
Optimal strategy might not be a Nash equilibrium. I'm not sure why you think game theory ignores that possibility. The Alberta team wrote some good papers about it.
You misunderstood "a solution" to mean the only optimal solution. Also, note that Nash equilibrium assumes the opponent does not change strategy. Once you relax that assumption, especially with the idea that you can induce change, another strategy becomes viable.
Check out the "Occurrence" section in the article you linked to.
A Nash equilibrium strategy does not assume that an opponent's strategy never changes. It has the property that if the opponent deviates from the equilibrium, the opponent cannot gain by doing so.
That's neither my memory of the phrase from school nor my reading of the Wikipedia article.
Nash equilibrium may not exist if one of the players follows, say, a Markov switching process. If that process causes the opponent to stop seeking equilibrium or to settle into a false equilibrium, then the switching process may have been a better strategy than seeking equilibrium.
> what about a robot that can beat someone at tennis?
I think robotics and AI are not the same and shouldn't be compared. IMO, the only piece missing for beating someone at tennis is a robot that can physically compete with an athlete.
Check out https://github.com/vladfi1/Phillip. It is a Super Smash Bros bot that can beat most anyone and is driven by a TensorFlow model that you can train yourself by playing against it for a few days.
I tend to disagree that this represents an explosion of approaches. For example, AlphaGo and Deep Blue are more similar than they may at first seem. AlphaGo used MCTS (a search algorithm), with value and policy networks as heuristics to evaluate board positions and search order. AlphaGo scored a couple of big wins by showing that you don't need exhaustive search (in fact, AlphaGo evaluates orders of magnitude fewer nodes in the game tree than previous state-of-the-art MCTS Go agents, and Go is notorious for a branching factor so high that minimax et al. are straight out), and that the heuristic can be learned through reinforcement learning.
Collapse of the current economic and political system as increased consumption can no longer compete with increased efficiencies.
For one. But back when we were thinking about the perils of the internet, we didn't correctly anticipate the homogenization of media consumption and filter bubbles. People rather thought that it would be the other way around: everyone would have access to high-quality information and enjoy diverse (long tail) media content. So we're likely to be wrong about the precise perils here, too.
It's important to note that last time, the poker players beat CMU's AI by around 500k chips, and the team had the gall to declare it a "statistical tie". Yet if the bot wins by a single chip they will claim it has "won".
which is: "If after 120,000 hands either Libratus or the humans are one standard deviation above break-even, they will have won the competition with “statistical significance.”"
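To put a rough number on that threshold: the article doesn't give the per-hand standard deviation, so the figure below is an assumption purely for illustration, but the square-root scaling is the point. One standard deviation of total winnings over n hands is sigma * sqrt(n):

```python
import math

# Illustrative only: SIGMA_BB is an assumed, made-up per-hand
# standard deviation; the competition's real figure isn't public here.
BIG_BLIND = 100       # chips, per the 50/100 blinds
SIGMA_BB = 10.0       # assumed per-hand std dev, in big blinds
HANDS = 120_000

# Std dev of the total grows with the square root of the hand count
one_sd_chips = SIGMA_BB * BIG_BLIND * math.sqrt(HANDS)
print(f"one SD over {HANDS} hands ~ {one_sd_chips:,.0f} chips")
```

Under that assumed sigma, a final margin would need to clear a few hundred thousand chips to count as significant; a different sigma shifts the threshold, but only by its linear factor.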
(I'm a professor at CMU, but I have nothing to do with this research or competition.)
By "won" we mostly mean as judged by the media. Story if the bot barely loses: "Poker bot statistically ties pros". Story if it is one chip up: "Poker bot leads/beats pros".
Also important to note: they made the players play thousands of hands over weeks, and many said they played in a way that shortened the time they had to play, thereby not necessarily showing their "true" skill.
> also important to note they made the players play thousands of hands over weeks
To your point, I wonder how they account for mental and physical fatigue. To a computer it makes no difference whether it plays thousands of hands over a long period or hundreds of hands in a single day. Humans, on the other hand, don't have the same attention span as a computer.
I used to play poker part time, heads up (one against one), and the amount of study and analysis players have to do just to reach a reasonable level is huge.
This challenge is very unfair to the players, so I wouldn't say it won; the players have a massive disadvantage. Every professional player has tracking software and a database to analyse every decision that has been made.
This is extremely important because you can model your strategy to exploit the suboptimal decisions made by your opponent. Yet the players here have no access to any of these tools, so the bot adjusts its play based on its human opponents but the humans cannot do the same and are left with a guessing game.
If they want to make it a proper challenge, the players need access to the tools they usually use when playing on the Internet.
> so the bot adjusts its play based on their human opponents but humans cannot do the same and are left with a guessing game
Do we know if it actually does it? I imagine it's much simpler to build a bot that plays a balanced profitable strategy rather than one that tries to build a model of their opponent and exploit it.
It's stated in the article. During a day's play, the AI suggests plays based on its current knowledge. At night, the day's events are fed into the system for it to learn on for the next day.
I assume there are probably papers that specify what form of learning is taking place, but the article didn't go into that level of detail and I haven't tried to track it down.
Trying to play GTO is an interesting (and very profitable) approach, but ultimately it's not the most profitable one. It makes sense for the AI to be constructed that way because the matches it's likely to play are against good opponents. Most profit in poker comes from playing against not-so-great opponents, though; against those you usually play an extremely exploitative (and thus exploitable) style on purpose.
That being said, the AI seems pretty impressive. Not sure how they picked the players; I could think of a few HUNL players I'd rather see, but they might not be interested in a 200k freeroll.
It's tough to decide what skill in poker really means. Is it that the AI can win against a good player or that the AI can earn money from bad players faster than other good players?
I wonder if the players tried colluding: coordinating their bets to fake a weakness so the AI starts to adopt a poor strategy, then upping the bets and dropping the feint. I don't see how the AI can protect itself against that.
The aim of the AI isn't to adapt to poor strategies, but to play an approximately optimal strategy itself. It's aiming to be unexploitable: the further the other players deviate from optimal, the more it wins. Its EV (expected value) comes from the other players not playing optimally; it doesn't care about exploiting individual weaknesses.
If you define a "not very good" strategy as one that loses at most 0, then sure. Playing optimally means the worst-case scenario against any opponent is breaking even. It doesn't have to be trained on individual playing styles; it is simply playing each spot theoretically correctly.
An example: say the humans are getting to a river situation with too many bluffs for a given bet size; an exploit for the AI would be to always call. The opposite is also true: if they are bluffing too little, it should always fold. The players notice that the AI has adjusted, and adjust their frequencies - now exploiting the AI. By taking an exploitative approach the AI leaves itself open to being exploited, and this is not the goal.
If this were rock-paper-scissors, the AI would be doing the equivalent of throwing each option at 1/3 - even when its opponent throws rock every time. It could switch to paper, but a thinking opponent would then switch to scissors, and this continues until we are back at equilibrium. The AI aims to play poker in the same fashion, having the correct frequency of each action for a given range in every spot.
A better AI should be able to fool the opponent into thinking it has thrown rock (metaphorically) so that the opponent throws paper while the AI instead throws scissors.
Poker isn't about equilibrium, it's about misdirection and exploitation. When the table gets cold, you liven it up by convincing everyone to do a round of straddle.
Heads up poker is precisely about equilibrium. Your straddle reference is also irrelevant, this is not live multiway poker.
"Tricking an opponent into thinking it has metaphorically thrown rock," extrapolated into a poker example, would be betting larger/smaller, calling more/less, or folding more/less than is optimal in a given scenario in the hope that your opponent makes a (bigger) mistake. You're simply hoping he makes more errors than you; the AI instead chooses to make zero mistakes and let the opponents do the rest. You can see this in action for yourself in heads-up limit hold'em by playing Cepheus (http://poker-play.srv.ualberta.ca)
You're still thinking one hand at a time. It may be possible to confuse the opponent into permanently shifting strategy.
I agree that would not happen if two equilibrium-seeking computers played each other. Since the human strategy is unknown, it is possible that equilibrium may not exist or be optimal. Even if it's two computers, if one of the computers has the possibility of choosing a non-equilibrium strategy, then again the optimal strategy may not be to seek equilibrium.
The AI's aim is not to adopt poor strategies. It seeks to optimize, but the point is that you can lead it to believe a point is a global optimum when in fact it is only a local optimum.
As to your point about the EV, this is why collusion can work. By colluding over a long enough horizon, the players can lead the AI to believe the average expected value is something that it is not. If only one individual feigns a weakness and the rest do not, the strategy doesn't work.
You can't lead it to believe a point is an optimum; it's just responding to a bet size or check in isolation, given the information it has. If you "feign weakness" in a given spot it will just respond as optimally as possible to the bet size.
For example, attempting to feign weakness by betting small in a spot where your entire range should bet large is not tricking the AI; it's just passing up EV for the players. Good players are not going to play poorly in the hope of tricking the bot for some future mythical EV gain.
This is a multi-day competition. Players may coordinate within each round, but I mean coordinating across days. If you only coordinate over a short horizon, the complexity of your deception is lower. So, for example, all players adopt a common feigned weakness on a given day and let the computer believe those behaviors are part of a pattern to exploit. Then on the second day, they up the bets and drop the feint.
This happens in algorithmic trading, where traders make a large number of low-value, bad bets to mislead the algo, then bet big and go the other direction.
Libratus's biggest edge is probably grinding away at the players' blunders and tendencies (e.g., Don never check-raise bluffs in this spot, so I can safely value bet). It would be really interesting if they published the bot's results vs. a mythical generic player. The difference would be a nice estimate of how big an edge it derives from backtesting its personalization strategies.
Yes, it does. The Libratus bot uses a counterfactual regret minimization variant of the CMU team's own design to calculate endgame strategy. The inputs to that algorithm explicitly take previous player behavior into account.
That is very unconvincing, as 49k hands is really nothing and not enough to iron out variance unless the edge is really big (which doesn't seem to be the case here). Any serious poker player will tell you that.
They should play 1-10 million hands (depending on the edge) in order to get a decent idea of where this is going.
Yes, I'm wondering if anyone is tracking the relative strength of each player's hands. In the end, the bot should be declared the "winner" only if its winnings were disproportionately high in relation to the strength of its hands.
The usual way they reduce variance in these man vs. machine poker showdowns is to do "pairs" play. You have two humans playing simultaneously in isolated locations. The decks for both humans are the same, but player 1 and 2 are swapped for one human. That way, the bot strategy has to play both hands.
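A crude model of why this helps: write each hand's result as skill plus card luck, where the luck term flips sign in the mirrored seat. This toy simulation (arbitrary numbers, not the real match data) shows the paired estimate has far less spread than a single seat:

```python
import random
import statistics

rng = random.Random(42)
SKILL_EDGE = 0.5   # the bot's true per-hand edge (arbitrary units)
N = 20_000

single, paired = [], []
for _ in range(N):
    luck = rng.gauss(0, 10)       # card luck, much larger than the edge
    noise_a = rng.gauss(0, 1)     # residual play-dependent noise per seat
    noise_b = rng.gauss(0, 1)
    seat_a = SKILL_EDGE + luck + noise_a   # bot's result in one seat
    seat_b = SKILL_EDGE - luck + noise_b   # mirrored seat: luck reversed
    single.append(seat_a)
    paired.append((seat_a + seat_b) / 2)   # pairs-play estimate

print(statistics.stdev(single))   # dominated by the luck term
print(statistics.stdev(paired))   # luck cancels; far smaller
```

The luck that survives is only whatever depends on how each side actually played the duplicated cards, which is exactly the skill difference you want to measure.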
It doesn't totally eliminate variance, but they also take that into account and correct for it when looking at final outcomes. Right now the bot is up by something like 800K over 60K (out of 120K) hands. If that rate continues, it will win by around 1.6M, or 400K per human. The blinds are 50/100, so that would equate to roughly 133 milli-big-blinds (thousandths of a big blind) per hand. That isn't too far off from standard win rates in bot vs. bot tournaments [1].
I'd say it's likely that the results of this tournament will be a statistically significant win for the bot.
That trick doesn't change the variance at all if decks are the same.
Typical win rates human vs. human are between 1-5 ptbb/100, where 1 ptbb = two big blinds. At 1 ptbb/100 the variance is pretty big and north of 1 million hands are probably necessary to establish an edge, whereas at 5 ptbb/100 the variance matters much less and 100k hands are usually enough to converge to the expected value.
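Those sample sizes follow from square-root scaling: to resolve a per-hand edge mu against per-hand noise sigma at z standard deviations, you need roughly n = (z * sigma / mu)^2 hands. A quick sketch, with an assumed (not measured) sigma of about 10 big blinds per hand:

```python
import math

def hands_needed(winrate_bb, sigma_bb, z=1.96):
    """Hands needed to distinguish a win rate from zero at z-sigma.
    winrate_bb, sigma_bb: per-hand mean and std dev, in big blinds."""
    return math.ceil((z * sigma_bb / winrate_bb) ** 2)

# 1 ptbb/100 = 2 bb per 100 hands = 0.02 bb/hand.
# sigma of ~10 bb/hand is an illustrative assumption for heads-up NL.
print(hands_needed(0.02, 10.0))  # around a million hands
print(hands_needed(0.10, 10.0))  # 5 ptbb/100: tens of thousands
```

Note how the requirement falls with the square of the edge: a 5x bigger win rate needs 25x fewer hands.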
How would someone with no knowledge of the game learn to play poker? Any good books that stress the mathematical/probabilistic(/game theoretic...?) side of things?
Poker enthusiasts hang out at the TwoPlusTwo forums[1]. After reading up on combinatorics and learning the rules for numerous poker variants, diving into the forum posts will give you better insight into how players view the game, the situations they get into, and how they learn to analyze and read hands and other players.
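For a taste of the combinatorics side, here's a brute-force count of hold'em starting hands:

```python
from itertools import combinations

ranks = "23456789TJQKA"
suits = "shdc"
deck = [r + s for r in ranks for s in suits]

hands = list(combinations(deck, 2))
print(len(hands))  # C(52, 2) = 1326 two-card starting hands

# Probability of being dealt a pocket pair
pairs = sum(1 for a, b in hands if a[0] == b[0])
print(pairs, pairs / len(hands))  # 78 combos, i.e. 1 in 17
```

Hand-reading discussions on the forums lean on exactly this kind of combo counting: how many combinations of a given holding are consistent with the action so far.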
Unless things have changed significantly from the previous version (Claudico), this is not really no limit hold'em, as the stack sizes are reset every hand.
One of the distinctive traits of a no-limit tournament is that on any given hand a player can be knocked out of the game by a big enough bet, which affects how players are likely to play (as might one player amassing a large chip lead over another if neither player opts to go all in on early hands).
Of course it also makes games shorter and introduces a lot more variance, which isn't so good for assessing how well a computer plays.
This isn't a tournament though, it's more like playing a very long cash game where the stacks are reset on each hand. Whoever has the profit at the end of the cash game will be the winner.
In no limit, if you lose a hand you have fewer chips, which changes the applicable strategy. There is no version of no limit where every hand is played with the same number of chips.
It is akin to only playing the first 10 moves of a chess game, then resetting.
This is indeed a version of no limit. What defines it as no limit is that there is "no limit" on the bet sizes. The fact that the chips are reset each hand doesn't mean it isn't no limit.
The chess analogy would be more akin to resetting after the flop.
On the contrary, no limit is never played where you reset the game after every hand. The fact that this is happening indicates that the strategy is not in fact complete. I expect the strategy would lose a heads up tournament nearly every time.
Not at all. Their strategy has almost no practical value in any existing poker game. They've come up with a solution to a variant of poker that no one actually plays.
It's a great solution, and at that stack size, I'm sure it's better than nearly every human competitor. But until they solve all stack sizes down to one big blind, their strategy is practically incomplete.
You do realize that the deeper the effective stacks are, the harder the game is to solve? It is far easier to approach GTO the closer we get to push/fold games, and thus your suggestion is akin to acknowledging that they have landed on the moon, but asking how about they climb this tree over here. If your username is who it suggests, it's a bit bizarre to me that an apparently intelligent engineering manager who has played poker before can be so woefully misinformed. If you want to learn more I suggest posting your thoughts on 2+2; they will gladly explain why your thesis makes no sense.
Yes, I am who I am, and I've done a little more than "play poker before".
While it is true that solving a smaller stack size is cheaper, you have to solve many stack sizes from 1 to N to get good coverage across the space of all play.
I've been following this work for over 15 years, and they certainly deserve credit for what they did. But what they have done falls short of the banner headline.
Honestly, this doesn't look to be a true test of who is better at poker. Poker is not like chess: a lot of emotion is involved in reading another player's cues (whether he is touching his face, whether he is talking nervously, how quickly he calls all in, etc.). Making all this computerized seems to take the spirit of poker away. It is like playing online poker, which is an entirely different ball game vs. real poker.
I'm not a great poker player, but I read a lot about great players, and most of them say reading people's emotions/cues/tells isn't reliable or a big deal.
So in that sense, online poker is very similar to live poker; the strategy doesn't change very much.
Acting is a good strategy against weak opponents. I would never try to "Hollywood" against a good opponent, but then I'd never get involved in a big hand against one either.
Against a fish, yeah, I'll ham it up and they eat that stuff. If they're on the edge of a decision, some good acting can push them the direction you want.
I guess I'm agreeing with you -- a pro would never pay attention to my "tells".
I'm very into poker. You're right about the tells, but what is important is knowing the other players. You can tell what type of player someone is and how aggressive or "soft" they tend to be. You know if they would tend to call or raise a weak hand in certain situations, etc. It definitely matters.
Well, a human would know those players by playing against them or watching replays of their games. I'm pretty sure an AI can do the same, and it will have a better memory of the habits of each individual player it has played against or watched.
Physical tells are almost entirely Hollywood nonsense. When a professional player talks about getting a read on someone, they are talking about probabilities combined with the opponent's position at the table and a history (even a short history) of that player's betting/play patterns.
> Physical tells are almost entirely Hollywood nonsense
That's true for most any level of serious player, but I see it often in casual games in people's kitchens, with casual players who don't play often or are just starting. Even I can't control myself sometimes when the adrenaline starts pumping. I'd rather say lack of self-control is one of the most amateur mistakes you can make in poker, and what's nonsense is the idea that it's a meaningful aspect of any serious poker game.
This is not true at all. If you're comparing online vs. live, then yes. But physical tells are absolutely still a thing, and I'm not talking about the way you crack an Oreo.
Those physical cues are such a small small part of poker. There's a reason the top players are all online players. Poker is a game of math first and foremost.
If it's unfair in any regard (and I disagree with GP that it's unfair on the merit of physical reads), it's that humans get fatigued and computers do not. Having played poker for stretches of 16+ hours I know that the number of mistakes I make late in the day drastically outnumber the ones I make when I am fresh. And breaks only do so much to counter this fatigue.
I'd have to say that playing poker for long stretches of time is at least as draining as, and possibly more draining than, grinding for 12-16 hours writing code.
Even still, it's a stretch to call it unfair. It is brains vs. AI, after all, and that will always be true of brains no matter the task - fatigue is a factor, and this is where AI will have important advantages in the future.
It's not that it's unfair, it's just a considerably different game than human vs. human poker. It would, however, be very interesting to see if the software can reveal exploitable tendencies in the players' behavior that even other top players would be unlikely to discover.
> Making all this computerized seems to take the spirit of poker away. It is like playing online poker which is an entirely different ball game vs. real poker.
Maybe, but it seems like impressive performance nonetheless.
Very, very long. First of all, this bot was capable only of analysing heads-up play. At the WSOP there can be a maximum of 10 players at the table, and with each extra player the number of combinations to analyse grows exponentially.
Second, in the WSOP it is key to exploit weaker opponents. This bot was able to find almost perfect play against expert opponents, but exploiting weak play is a different ballgame, especially if you are facing a mix of strong and weak players at a multi-handed table.
Really long. It will still have to get lucky, or at least avoid being unlucky. This is why amateurs routinely crash the final table instead of just pros.
And tanking for 1+ mins on turn and river decisions (it would time out in real online poker). And $10M in computer resources. But impressive nonetheless.
First of all, that's not true. Most players play several tables at the same time against the same player, up to four tables heads-up. That's something like 600 hands an hour in HU.
There's little heads-up play in live games, other than at the end of the tournament, and the stack sizes are completely different. These are not tournament players, they're heads-up specialists. They most assuredly play online the majority of the time.
Second of all, the computer doesn't read the human players, why would it matter? It's all up to actual strategies at that point.
Not really. Online poker is much more popular. Plus the computer doesn't have a camera either. You have to deduce if your opponent is bluffing through the plays he/it is making.
Promises and perils abound.
[1] http://www.cs.cmu.edu/~sandholm/ section "Algorithms and complexity of solving games"