Google's AlphaZero Beats Stockfish In 100-Game Match (chess.com)
358 points by tambourine_man on Dec 7, 2017 | 197 comments



>This would be akin to a robot being given access to thousands of metal bits and parts, but no knowledge of a combustion engine, then it experiments numerous times with every combination possible until it builds a Ferrari.

No, not even close. First, AlphaZero started with the rules of chess, chess pieces, and a chess board. Second, the possible moves are several orders of magnitude fewer than the steps needed to build a working car out of parts.

Closer would be: Here's a car. Here are all the tuneable parameters. Make it as fast or as efficient as possible. But that would still be inordinately more complex than grokking chess.


Nitpicking. It is an analogy. The author said nothing of whether an AI could actually build a car. That's not the point of an analogy. Some of the most renowned physicists use balloons and balls to describe space-time, but space isn't a balloon - does that make their layman explanation bad?

> rules of chess

Just like "AlphaEngine" would have the rules of reality: physics.

> several orders of magnitude

Sharpshooting. Go (the other game that it plays) has more positions than there are atoms in the universe. Not having to evaluate every position is precisely the improvement that AlphaGo brings to the table.

For the record I don't think that AlphaGo could build a car and, even if it somehow could, the universe would likely experience heat death before it arrived at a workable solution.


This is not nitpicking. This is actually quite important. The entire AI hype rides on such irresponsible analogies. These analogies completely obscure the actual state of the art and fuel some outrageous expectations (not to mention the singularity cult).


> Closer would be: Here's a car. Here are all the tuneable parameters. Make it as fast or as efficient as possible. But that would still be inordinately more complex than grokking chess.

Reminds me of BoxCar2D from 2011: http://www.boxcar2d.com/


>Second, the possible moves are several orders of magnitude fewer than the steps needed to build a working car out of parts.

Isn't the number of chess moves, for practical purposes, near infinite? A car can be made in only so many ways, but a chess game can be won under scenarios that outnumber the atoms in the universe.


Chess has 16 pieces each in one of 65 positions, so the state space is roughly 65^16. A car has more than 16 parts each in more than 65 possible positions.
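
(For scale, 65^16 is about 10^29.)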

There are around 100 legal moves in any chess position, and only a handful of rules. The possible actions and rules for building cars are literally all of physics and engineering.


> Chess has 16 pieces each in one of 65 positions, so the state space is roughly 65^16.

This is a correct upper bound, but the state space is much smaller than that, as the only position that two pieces can share is "captured".

Edit: it's not a correct upper bound (or at least, it may be larger than the state space, but only by coincidence), in that while there are at most 16 pieces, the pawns may have promoted to other pieces rather than remaining pawns.


I didn't read it that way. You're essentially reading it as the monkeys will eventually write Shakespeare scenario.

The way I read it was, you have a robot and it:

1. knows it has to build a Ferrari (implying it knows what it looks like, at least)

2. knows nothing about building, parts, etc

3. can engage in some sort of learning behavior

Once a robot knows it can only connect two parts by screwing them together, it will only try to screw them together, and it might even transfer that behavior over to other parts.

The possible scenarios become much more limited as the robot gets smarter.


You're probably imagining an assembly line, where one step follows another in a logical progression, and there doesn't seem to be much of a branching factor. However, each step along that line is the product of many deep optimizations. It is more complicated than you think; thousands of people work on it. Consider just the case of choosing a screw, probably one of the simplest steps in building a car. These 24 pages only scratch the surface of the choices available: http://www.fastecindustrial.com/images/MediaCatFastener.pdf


> Once a robot knows it can only connect two parts by screwing them together, it will only try to screw them together, and it might even transfer that behavior over to other parts.

Why would learning to screw invalidate other means of connecting things? Why not use an adhesive? Or weld the parts together? Or shape the materials so they can be joined without a separate fastener, like a dovetail or something. Even if it learns those, how does it know there isn't some way of connecting two parts by a chemical means that humans haven't discovered?


This wouldn't work for a human being who knows nothing about making automobiles, so why would it work for a robot?


In the same way that the AI’s series of chess moves don’t have to lead anywhere near victory, the system doesn’t have to build anything that resembles a car.

With this consideration, there are many more ways to go wrong in the car scenario.

Chess moves are limited by the chess piece selection and board.

Building something IRL isn't really limited by anything. For example, you could combine just two pieces in hundreds of ways (tape, staple, weld, etc.), in an unlimited number of positions (off by 1mm, off by 2mm, off by 3mm, etc.).


Wasn't chess solved early because the number of possible moves within the next x moves is limited? I think that's the main difference from Go, where a brute-force approach doesn't work.


Chess hasn't been solved; that would mean knowing the perfect strategy, and whether the first or second player can always force a win or a draw.

Examples of games that have been solved include reversi, gomoku, connect four and draughts.


I'm surprised how slow the press has been to pick this up. This seems like an amazing step forward to me. AlphaZero played only against itself as training and it beat one of the best chess AIs in the world that has been finely tuned with decades worth of human knowledge.

Now that Go and Chess are efficiently solved for AI...what's next? Are there any other interesting complete information games remaining? What's the next milestone for incomplete information games?


> Now that Go and Chess are efficiently solved for AI...what's next?

I'd like to see an AI that can learn things related to what it already knows with very little training, like humans can do.

Take chess. There's a chess variant that is popular between rounds at tournaments, called "bughouse". It's played by two teams of two, using two chess sets and two clocks. One member of the team plays white on one of the boards, and the other plays black on the other board.

For the most part, the game on each board follows the normal rules of chess. Whichever individual game ends first determines the outcome of the combined game. The teammates may talk to each other and collaborate during the game.

The big difference between the game on each board and a regular game of chess is that on your turn you can, instead of making a move on the board, pick up any piece that your partner has captured, and drop it anywhere on your board (with some restrictions, such as a pawn cannot be dropped on the first or eighth rank).

If you take a human who has learned to play chess to a certain level, and who has never played (or even heard of) bughouse, and explain the rules to them and then have them start playing, it only takes them a little while to get about as good as they are at normal chess.

The human quickly figures out what chess knowledge transfers directly to bughouse, which needs tweaking, and which needs to be thrown out. As far as I know, current AI cannot do that--to it bughouse is a completely new game.


Transfer learning is the name of this concept. [0]

It seems like one approach is to use Markov networks to establish statistical relationships between models.

https://en.wikipedia.org/wiki/Transfer_learning http://www.cs.utexas.edu/users/ai-lab/?Transfer
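
To make the concept concrete, here is a minimal sketch of the fine-tuning flavour of transfer learning (assuming PyTorch/torchvision; the model and layer choices are purely illustrative):

    import torch.nn as nn
    import torchvision.models as models

    # Start from a network trained on one task (ImageNet classification)
    model = models.resnet18(pretrained=True)

    # Freeze the learned features...
    for param in model.parameters():
        param.requires_grad = False

    # ...and retrain only a new head for the related task (say, 10 classes)
    model.fc = nn.Linear(model.fc.in_features, 10)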


Does it really matter that it views bughouse as a new game? If it can still learn it in a matter of hours as an entirely new game then, although there's probably efficiency that could be gained from knowledge transfer, I can't really imagine it would need to figure that out until after it's done. I would think that, once it's figured out how to play a few games from scratch, it can check to see what commonalities those have and then apply them to even more games/ideas/processes.


> Are there any other interesting complete information games?

How about axioms of logic as legal moves, and asking it to go from a set of axioms to open mathematical problems?

Or chemical procedures and components as moves, and asking it to tackle a disease with a known molecular structure?

It is not as straightforward as I make it sound, but these are complete information problems.


>How about axioms of logic as legal moves, and asking it to go from a set of axioms to open mathematical problems?

>Or chemical procedures and components as moves, and asking it to tackle a disease with a known molecular structure?

>It is not as straightforward as I make it sound, but these are complete information problems.

However, they are not adversarial games, and self-play only works for adversarial games. Also, each mathematical problem is different and needs to be solved only once (and likewise for molecular structures), so there is no "gradient of skill" that we know how to climb.


Theorem proving in intuitionistic logic is a two-player game and maps perfectly to the kind of Monte-Carlo Tree Search that's employed here. Except that it is far more difficult than Chess/Go/etc., since the branching factor is essentially unbounded.


Can you elaborate? How could you score how close you are to a solution and what would the second player be doing?


Consider a formula made up of only conjunctions and disjunctions and true/false. The first player tries to prove the formula and gets to move at every disjunction and is allowed to select which side of the disjunction to prove. The second player tries to prevent the first player from finding a proof and gets to move at every conjunction, selecting a side of the conjunction to descend to. The final state here is an atomic proposition which is either true or false and determines which player won. You derive a value function from that in the same way as you do for Go or Chess.
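
Here is a toy sketch of that game for the propositional fragment (the encoding is made up, just to show the move structure):

    # Formulas: ("lit", True/False), ("or", f, g), ("and", f, g).
    # The prover moves at "or"; the refuter moves at "and".
    def prover_wins(f):
        if f[0] == "lit":
            return f[1]  # a true atom means the prover won
        if f[0] == "or":   # prover picks whichever side works
            return prover_wins(f[1]) or prover_wins(f[2])
        if f[0] == "and":  # refuter picks whichever side fails
            return prover_wins(f[1]) and prover_wins(f[2])

    # (T or F) and (F or T): the prover has a winning strategy
    print(prover_wins(("and", ("or", ("lit", True), ("lit", False)),
                              ("or", ("lit", False), ("lit", True)))))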

You can extend this idea to full first-order intuitionistic logic and probably also to higher-order logics, as well as many different modal logics. There are also formulations of classical logic as a single player game, but that doesn't seem to be very useful here.



Unfortunately, the things AlphaZero can do are much more restrictive than that -- it needs a very small, fixed state of the world and set of possible moves.

While chess is very complicated to play, the state can be represented by (at worst) an 8x8x6 boolean array (board, one of 6 possible pieces), and for Go a 19x19x2 boolean array. There is nothing similar for logic, and deep learning breaks down (in my experience) once you don't have that nice regular input space.
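
To illustrate the kind of regular input space I mean, here is a sketch of such an encoding (assuming the python-chess package; 12 planes rather than 6, since colour has to be encoded too):

    import numpy as np
    import chess  # pip install python-chess

    def encode(board):
        # one boolean plane per (piece type, colour) pair
        planes = np.zeros((8, 8, 12), dtype=bool)
        for square, piece in board.piece_map().items():
            plane = (piece.piece_type - 1) + (0 if piece.color else 6)
            planes[square // 8, square % 8, plane] = True  # rank, file
        return planes

    print(encode(chess.Board()).sum())  # 32 pieces on the initial board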


> How about axioms of logic as legal moves, and asking it to go from a set of axioms to open mathematical problems?

I'm not sure how you could train it against itself. The branching factor would be infinite as well so I don't know how you'd constrain the legal moves. For example, maybe rewriting (a * 2) to (a + a) or (a * 4 - a * 2) and so on for every theorem you know would be useful in a proof, plus you can invent your own theorems as intermediate steps.


I believe both of those have been done.


>Now that Go and Chess are efficiently solved for AI...what's next?

Chess is not "solved." Solving chess would mean knowing with absolute certainty the best move in any position. It would also mean knowing the best first move. We do not have technology with the computational capability to "solve chess." We have technology that can play chess better than any human, but that's not the same as solving it.


I'd like to see a game where each player can take a variable number of actions each move. For example, Risk. (Probably not too difficult an enhancement)

I think the best current Poker AI (which can beat good humans) doesn't use neural networks, but I expect that combining counterfactual regret minimization with neural networks shouldn't be too hard.

RTS games should be the next bigger challenge. Not only do they have incomplete information, you also have to handle widely different scales (both space and time) and have a large number of units. I expect this to require an interesting new approach to integrate large scale strategies with small scale tactics without getting stuck in local strategy optima.


> Now that Go and Chess are efficiently solved for AI...what's next?

Cleaning toilets? I would love to see AI doing degrading but useful work. Sadly, all we get is computers playing board games.

EDIT: YC is doing UBI experiments now. How about funding startups building menial labor robots, then passing the savings on to the humans who used to do the work?


Menial labor robots will come whether YC funds them or not, so they better work on UBI to prepare for that.


they're better than humans but they aren't solved.


The interesting thing for me (apart from the very short training period) is that this seems to be a more generalised version of their previous AlphaGo engine, which means that in future it should be even easier to adapt and use for other tasks.


There is always Infinite Chess https://en.wikipedia.org/wiki/Infinite_chess and other games with complete information but unbounded state. (Or even games whose state is bounded and finite, such as Hex.) Monte Carlo + NN seems like a good approach, so maybe it'd be worth a go.

Imperfect information games seem like a much more interesting challenge though.


Another poster already mentioned DeepMind and StarCraft II [0]. I want to mention it uses Python and is a pip package.

> pip install pysc2
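
Once installed (and assuming a local StarCraft II install plus the map packs), the bundled random agent can be run with:

    python -m pysc2.bin.agent --map Simple64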

StarCraft BroodWar also has a more mature AI scene[1].

Facebook even tried their hand at a bot this year, finishing 6th of 28. Only 2 teams made the top 10, with 15 entries being from independent developers [2].

[0]: https://github.com/deepmind/pysc2

[1]: http://www.starcraftai.com/wiki/Main_Page

[2]: https://www.digitaltrends.com/computing/facebook-cherrypi-ai...


DeepMind has partnered with Blizzard to work on Starcraft 2 AI. That will probably take a long time to get something that can defeat professionals.


Probably. On the other hand, it was less than five years ago that a majority of randos on the internet thought that AI would never defeat top humans at Go. (Sorry for citing such an unimportant population, but it is the one I'm a part of and whose opinions I have access to.)


While I wouldn't exclude possible breakthrough, it's definitely a different case. Go was feared because of an enormous number of possible states, but AlphaGo circumvented the issue by not caring about it. It didn't "solve" Go, it just plays it better than humans.

Starcraft 2's state is much bigger than Go's and requires real-time action. Also, actions have to be performed in parallel in real time, which probably requires different techniques and probably puts a much lower bound on the number of iterations doable in a given time.


I always believed that one of the core issues with Starcraft is that it is not a perfect information game (in addition to its inherent complexity). Having to deal with learning how the opponent plays, and basically guessing and making inferences about what they are doing, is somewhat new territory.


Real time and parallelism make the computer's job easier, not harder.


Yeah, not really. Ask anybody working on distributed systems.

When AlphaZero plays Go with itself, it can basically go as fast as it can calculate the next move. With Starcraft that's not the case - both networks have to work in sync, probably need some temporal awareness, and probably will have some limit on actions per time fraction, which basically requires a whole new approach. Of course, I could be gravely mistaken, but I would like to know how they can circumvent this.


Could a similar strategy apply then?

> state is much bigger.

Can AlphaGo not care about it and just play better and faster?


In SC there are time constraints in that you're getting resources at a certain rate. You have to grow your capacity to pull resources and at the same time be building units to defend yourself. If you allocate resources poorly, you'll find yourself losing. AlphaGo can't ignore that, but...

I think AI does have an advantage once it starts to be competent, especially if it's interacting through APIs exclusively and not the interface, which means that its actions per minute could be astronomically higher than a human player's, with unheard-of levels of micro. At the same time, I think machine learning is almost an ideal solution to figuring out build orders. It's gonna be fast and smart. The question is just how long?


For human vs AI there will have to be APM rate limits otherwise the AI can win with no real strategy[0].

For AI vs AI it could be interesting to see what strategies develop with two opponents with the ability to perfectly microcontrol their units.

[0] Zerglings vs Siege Tanks when controlled by AI with perfect micro.

https://www.youtube.com/watch?v=IKVFZ28ybQs


I think even with unlimited APM, humans can still beat AI using cheese strategies, like an undetected cannon rush, since you can't really micro your workers against cannons (the projectile isn't dodge-able like the one from siege tanks).

Otherwise, you make a fair point and that video is amazing. AI vs AI strategy with unlimited APM would be very exciting to watch.


It is generally assumed that SC and SC2 are not actually just Rock Paper Scissors. That is, you're not obliged to guess your opponent's strategy and counter in the dark but can instead "scout" and figure out what they're doing and overall this can beat a "blind" strategy like cannon rush that doesn't respond to what the opponent's strategy is.

For example the "All ravens, all the time" Terran player Ketrok just responded to the surge in popularity of Cannon Rushes by making a tiny tweak to his opening worker movement. The revised opening spots the Cannon Rush in time to adequately defend and thus of course win.


> especially if it's interacting through APIs exclusively and not the interface, which means that it's actions per minute could be astronomically higher than a human player, with unheard of levels of micro

There's no way they'll compete under unlimited APM rules, it wouldn't even be remotely interesting. We're trying to match wits with the AI, not the inertia and momentum of super slow fingers, keys, and mouse.

I'm sure they'll come up with an "effective APM" heuristic which compares similarly to top pros, and feed it as a constraint to the AI.


There's also the fact that an action you trigger now (build unit) doesn't have immediate payout (building takes time). In Chess it can evaluate the current board as is, and it gets immediate feedback on each move.

I'd be interested to see if it can "plan ahead". Maybe a Chess variant where you have to submit your next move before the current player moves, or something like that.


Given the success with Go and chess, I would guess less than a year.


It is not a sequential perfect-information game. Information is imperfect, actions can be taken by both sides at the same time, there will probably be an action limit per time/game frame, and the network will have to determine not only its next move but also manage the time for calculating that move. It totally changes the challenge.

As far as I understand (and I am no expert at all), AlphaGo basically creates a heuristic of what move to play in a given situation (a heuristic created by playing against itself many, many times). Instead of trying to "break" the game, they just decided to simulate playing, and the results were good enough to outmatch humans, but we have no idea how close to the "perfect game" AlphaGo actually got.

But - the whole input to the network is a 19x19 array with 3 possible states per cell, plus maybe a turn count and one bit for whose move it is. An SC2 network would have to process a graphics stream (let's say 1280x720), and needs spatial awareness (minimap), priority setting, and computational resource management. And it has to be fast enough in the first place just to follow the game.

I'm not saying that won't happen (who predicted the Go breakthrough?), but it at least seems like a much bigger challenge.


It seems that AlphaGo is now aiming at Starcraft 2.

It will probably lead to interesting challenges.

I would love to see it tackle my favorite RTS (Supreme Commander), but really, what would be interesting would be to have it attack 'real life' problems:

-logistics

-subway circulation

-city planning

-detect diseases

-optimize the energy efficiency of a building.

(just armchair opinion, I don't know how well suited AZ would be to these problems. I do know that AI has already helped with some of these though)


> Are there any other interesting complete information games?

Not really, given that they already tried Shogi too. Personally I would love to see AlphaZero playing Arimaa, but it is probably not as interesting because of its short history.


Well, there's math. Mapping it to AZ and finding a decent "win" function might be hard, but it's a complete information something, certainly.


Quick, someone apply AlphaZero algorithm to this MCTS prover for math: https://arxiv.org/abs/1611.05990

I actually prefer the terms "expert iteration" and "tree-policy target" to "AlphaZero algorithm", because they emphasize what is novel within the entire system. Terms are from https://arxiv.org/abs/1705.08439


I'd like to see an AI figure out how to beat Zelda for the NES without being given any information specific to the game. This is something pretty much any human can do yet it appears daunting for AI.


Still need to solve the game model development problem and the I/O problem.

The game models for each Alpha Zero (Chess, Shogi, and Go) look to have been created by a human, as well as the input and output translation (e.g., a human would need to intervene[1] to help AZ Chess to play and win Atari 2600 Video Chess).

[1] Intervene by doing the translation by the human or by writing a program to convert AZ I/O to Atari 2600 I/O.


That's not why humans play games so why should it be for AI? We didn't consider running "solved" after the first Olympic games. AlphaZero is just better than any other algorithm at this time. A year ago that wasn't the case, in a year it may also not be the case. We learned something about developing AIs here which we can apply to the next. That doesn't mean Chess and Go are of no value to AI researchers, it shows that they are fertile ground for discovery.


One idea is Bongard problems, because they are purely creative categorization, so it's interesting for AI. http://www.foundalis.com/res/bps/bpidx.htm


Neither Chess nor Go is solved.

Checkers is solved.


I guess Starcraft and other video games, especially procedurally generated games where the ruleset may be fixed but the playing field is not.


> especially procedurally generated games

mmm, this gives me an idea. Does anybody know of a procedurally generated RTS?

Something like Starcraft 2 - multiplayer action-RTS, massive army coordination, with micro, macro, and meta game mechanics. On a randomly generated map?


Imagine how extremely frustrating it would be to have a bad base position compared to your enemy, or your enemy having island-like terrain when you build all ground units and you could not scout quickly enough...


True. But I was thinking in the context of SC, where the maps are normally mirrors of each other.

So you will still have a good idea of where the guy spawns. But you're now forced to scout to see where to take engagements and where not.


I understand what you're saying, but you will also have to realize that SC2 strategies are map and race specific. If you add in randomized maps... it really takes out a dimension of the game. It adds another dimension as well, of course, but that is just randomness.


Procedurally generated but still symmetrical maps would actually be really interesting. It would be like having a new battleground to explore every match without giving an advantage to either side.


Symmetry is not the only map characteristic that gives advantages to one race over another. Terran is really good at sieging, Zerg is good at expanding early, Protoss is very bad at defending open bases, etc.. The pro-played maps are carefully designed to balance all that, and when they fail the players themselves limit their strategies according to what works and what doesn't on each map.

The players even get to choose which maps they'll play (by eliminating some map if they feel it's unfair) and in which order...


I understand the mechanics of the different races (Zerg for life), and you're right that it would completely remove the current "map selection meta", which I consider an important strategic part of the series' games.


C&C Tiberian Sun had that in the 90s. You could tweak a lot of parameters as well!


Neither Go nor chess is solved.


Life



I question whether the author has actually played Starcraft. It's clearly vastly more complicated from an AI perspective than games like Chess or Go, and this has been obvious for some time.


It is, but these real time strategy games are also vastly difficult for human players. The bar is just to beat human pros, not to play perfectly.


Get these AI working on P=NP.


I have to say it’s a completely inappropriate comparison to make since Stockfish was forbidden from using its opening book. I would like to see the results when both are at their best.


The opening book was removed from Stockfish ages ago. Do you mean tablebases?


Another way of looking at it is that it would be inappropriate to give Stockfish an opening book but not AlphaZero, and I'm guessing the latter doesn't have one since that goes against its whole premise.

With that said, whether it's fair or not, I am curious to see how Stockfish with its opening book would compare to AlphaZero in its current state.


Edit: I just learned that things have changed quite a bit since I last looked into computer chess deeply, and Stockfish actually does not use a book. So they did play the default config. The below is thus completely off.

--

These engines are built for results, not as technical demonstrators. You are testing the engine in a scenario that it was not built to cover. Opening books are not optional add-ons for these engines; relying on them means that the heuristics are not tuned for the early game, and no work is put into optimizing the evaluation in that phase of the game.

If they could have beaten Stockfish in its default configuration (using book), they wouldn't have artificially weakened it, right?


I'm very curious as well, as there's no way the AlphaZero team hasn't scrimmaged against full-power Stockfish. It seems suspicious that they decided to go with a gimped version.


Just as another counterpoint to this, if you look at https://en.chessbase.com/post/the-future-is-here-alphazero-l..., they point out a number of non-opening positions where AlphaZero definitively outplayed Stockfish. There is definitely more going on there than Stockfish not having an opening book.


Stockfish doesn't have an opening book. Chess engines are good enough now that they play sound openings without any need for direct human input.


It plays alright without a book, but far from its best. DeepMind say they chose Stockfish 8 because it was the TCEC champion. In the TCEC, all engines are given an opening book + endgame tables. Further, since everybody gives chess engines these tools, engine writers usually don't optimize these aspects very much.

https://en.wikipedia.org/wiki/Top_Chess_Engine_Championship


Stockfish dropped support for opening books in 2014, possibly earlier. The code base literally does not have the capability to use one any more. Regardless, in every game that I saw, Stockfish played perfectly sound openings that nobody would criticize it for.

Tablebases are a different story, but in every one of the games they released, where Stockfish lost, it was far behind by its own estimation well before tablebases would have had meaningful utility.

Neither opening books nor tablebases would have helped because Stockfish was handily losing the middlegames, not openings nor endgames.


Indeed. Most engines are pretty bad without their opening books. There was surprisingly little stress on this.


Actually, have you seen matches between SF without opening books vs SF with opening books? This isn't true anymore as far as I know.


Do you have any links to such matches?


Most engines don’t have opening books any more and haven’t for years.


Looks like a residual network (ResNet) feeding a Monte Carlo tree search (MCTS) solves the strategy optimization problem.

A critique is that the game model (rules, pieces, movements, legal moves) is still bespoke and painstakingly created by a human being. One next step would be for an algorithm that develops the game model as well as the strategy and the I/O translation. E.g., use Atari 2600 Video Chess frame grabs as the input and the Atari controller as the output. After experimentation the algorithm creates everything: the game model (chess, checkers, shogi, go), the strategy for the game, and the I/O processing needed to effect the strategy with the available inputs and outputs.


Would what you're asking for even be possible for a human? For instance, if I plop my child down in front of a game and tell them to figure it out without telling them the rules, would they be able to figure out the complete set of rules even by trial and error?


Actually, this is a game itself. I used to play it with friends growing up -- we called it "Mao." It's a card game where you are penalized for breaking the rules (at least one person must know the rules in advance). You are allowed to pause the game at any time and guess what a rule is, and the players who know the rules will confirm or deny your guess. You'd be surprised at how quickly people can pick up rules like "only face cards may be played after a 7" or "if an ace is played, the next person is skipped."


It definitely is. I learned to play Othello and Backgammon by randomly trying things vs the computer as a child. It is also relatively easy to learn card games' rules, even Poker (albeit some very edge-case rules could take a long time to grasp).

The key is that you give the kid a computer version of the game, that doesn't allow rule breaking and makes it clear when a reward or penalty is applied (YOU WIN!).


Question for the AGI believers out there:

We have this, state of the art, AI which can turn the screws and hone in on some underlying reality about “how to win at Chess”, a formal game. Great.

How does this then extend into the social domain, where AGI would be operating? Like, how does AlphaZero optimize for “how to slow Climate Change”?

I can't even fathom how it would even understand climate change without an army of scientists publishing new work for it to consume. And then on top of that it will need to understand how its adversary, Putin, will try to optimize for the opposite: "ensure global warming to open up our shipping routes and arable land".

It just seems like a non-starter to me. Saying you could win at chess, so winning at geopolitics is just a scaling problem to me is like saying I can drink a bowl of miso soup so drinking the ocean is just a scaling problem.

It would seem to me that intelligence at the highest levels isn’t constrained by foreknowledge, it’s constrained by the consequences of past decisions made during the (inevitably ongoing) interactive learning phase.


I'm not sure your question makes sense. The zero-foreknowledge, 100% self-play thing happens in the context of simulated chess games, which translate fairly well into the real world since there isn't much interference from external factors. You don't even have to read your opponent's face to win. If you want to extrapolate from there to the case of "how to slow climate change while considering politics" then you would have to simulate the whole game of "playing dressed ape at planetary scale" too.

> It would seem to me that intelligence at the highest levels isn’t constrained by foreknowledge, it’s constrained by the consequences of past decisions made during the (inevitably ongoing) interactive learning phase.

This on the other hand does not seem like a particularly big obstacle. AIs can learn from automated systems such as chatbots, alexa-style gadgets, by observing human interactions, omitting half of the conversation and trying to reconstruct it, etc.

Humans are constrained by a limited lifetime and not being able to parallelize their experience, AIs could gobble up billions of hours of human-human, human-AI interaction and AI-AI interaction data and generate more when needed.


You are putting the cart before the horse. Neural networks are just one tool, particularly good for fitting a parameterized model to data. Nobody serious is saying that you just need a big neural network and a bunch of data to create a real general AI. On the other hand, we can be optimistic about future technologies on the basis of "first principles" (like the existence of the human brain) without being certain how to build those technologies yet.


> Like, how does AlphaZero optimize for “how to slow Climate Change”?

It doesn't; AlphaZero is a domain-specific AI. That's like asking: "Like, how does a differential equation solver slow climate change?". AlphaZero can learn to play a narrow class of games given the rules; that's all it does.

But I agree with your sentiment, obviously. We're n technological breakthroughs away from AGI, where n is unknown. We have approximately zero idea how to move from DNNs to AGI - or whether DNNs are the right approach.


Yeah, seems like weather DNNs might be the right approach for solving Climate Change.


> Like, how does AlphaZero optimize for “how to slow Climate Change”?

I think that's a bit ambitious. AGI is not obliged to solve all the world's problems. But I would ask a more modest question: how could AlphaZero approach problems like web service integration? These problems are definitely solvable, but converting them to optimization problems is the really difficult part that requires genuine intelligence.


I thought that was the point of AGI, though: that it can formulate (formalize?) optimization problems from informal specifications.


Wikipedia says:

> Artificial general intelligence (AGI) is the intelligence of a machine that could successfully perform any intellectual task that a human being can.

It is far from clear that humans can answer "how to slow climate change", so it seems like an artificially high bar to set for AGI.


> Like, how does AlphaZero optimize for “how to slow Climate Change”?

Do you want AIs exterminating humans? Because that's how you get AIs exterminating humans.


So we need to get the question right and down to all the details - how much are we ready to sacrifice, e.g. (in this case) for the environment's sake, and so on.

I believe every developer has found themselves in a situation where they'd be describing their problem in a StackOverflow question, only to realize half way through what the answer is. Just because they forced themselves to phrase the problem clearly and looked at it from a fresh perspective. That's the "rubber duck" effect at work.

Who knows if we won't bump into this phenomenon pretty often when it comes to high-level AI :) Merely setting the whole thing up may often produce quality research and a solid answer before the problem is ever actually submitted to the AI.


Can any computer chess/stockfish experts comment on the choice of 1 GB for hash size? I have no chess or computer chess domain expertise whatsoever, but it strikes me as a suspiciously low setting for a memory-related parameter on what was probably at least a 32 core machine. It makes me wonder if they simply dialed down the hash size until they got a nice convincing win.

Update: took a look at settings used for TCEC. Looks like they used 16 GB in season 7, 64GB in season 8, 32 GB in season 9, and 16 GB in season 10. Two observations: (1) interesting that they've decreased hash sizes in recent years (2) definitely seems like 1 GB is not reflective of how an engine would be configured for TCEC.


It does seem low, but I doubt the effect on performance would be huge. Certainly not enough to affect the major claims of the paper.

For example, on my 4-core machine I just loaded up two instances of Stockfish, one with 32MB hash and one with 512MB, and assigned 2 cores to each. I tried a few random middlegame positions, and after analyzing for 1 minute, the evaluations and main lines were generally the same (within the margin of error for repeated runs of the same engine). When analyzing Kasparov's immortal game, it was a toss-up which engine would find the famous rook sac first.

1GB is probably suboptimal on the hardware they used, but the difference is probably minimal.
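
For anyone wanting to repeat the experiment: both knobs are standard UCI options, so a minimal session with the stockfish binary looks like:

    uci
    setoption name Hash value 512
    setoption name Threads value 2
    isready
    position startpos
    go movetime 60000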


From what I've read scattered across a few different chess discussion places, the consensus seems to be that a) yes, 1 GB is pretty small b) it would matter, but c) probably not enough to change the end result.

I guess we will have to wait for next year's TCEC for the final verdict though. Hope they change the hardware specs in some way that would allow AlphaZero to compete.


Some in the chess community seem to still be in the denial phase.

The Go/Baduk community experienced a similar thing starting early last year.

- Jan 2016 (AlphaGo beat Fan Hui): Fan Hui was only a 2p and European champion, he's nowhere near the top

- Mar 2016 (AlphaGo beats Lee Sedol): AlphaGo still lost 1 game. The #1 ranked player can probably still beat it

- Jan 2017 (AlphaGo Master beats 60 pros): Ok, AlphaGo is strong, but those are only online games with short time control.

- May 2017 (AlphaGo Master beats #1 ranking, Ke Jie): ...

- Oct 2017 (AlphaGo Zero beats AlphaGo Master): Ok, nothing we have right now can probably beat it.


Your point is correct, but the thing to note is that until AlphaGo I don't think we had seen an AI improve quite that fast. AlphaGo Fan had an Elo of ~3,150, and it was correct to think that Lee Sedol (Elo ~3,500) would win against it.

Google showed up to play Lee Sedol with a ~3,750 Elo network. 600 Elo in 6 months is basically unprecedented growth in Go AI strength, and it took AlphaGo from "it can play a good professional game" to "gee, maybe the No. 1 human can take a game off it".

I'm getting the numbers from the AlphaGo Wikipedia page.


Think low-hanging fruit, I'd say: 600 Elo could be reasonable for a very new idea with lots of improvements still left to try. AlphaGo's improvement definitely petered out later; the difference between Master and Zero is nowhere near that big. On that basis, the skepticism still seems a bit wrong to me.


Denial of what? That one chess engine dethroned the currently best one? This happens every second year. And it doesn't make sense why the chess community would be in denial; did you mean the chess engine community?


The difference is that this is an entirely new paradigm of chess engine, so it's not just Stockfish getting an edge over Houdini that year or whatever.

There is plenty of "they didn't give stockfish enough memory, the hardware isn't comparable, the time controls were too short, etc etc" in the chess community so there is certainly some denial. But nobody thinks any human would stand a chance.

But I agree it's nowhere near the Go denial since the chess community gave up on humans remaining competitive with computers long ago.


I see this as scepticism, not denial. Denial implies that you don't want something to be true, which is perhaps the case with engine developers, not the general chess community.


I think you're right. Scepticism is more correct.


You are absolutely right. We have built machines to augment human capabilities for thousands of years. I don't see an optimized chess engine as being on a different level of innovation than a racing tire: a great engineering feat and accomplishment, but not entirely groundbreaking.


AlphaZero and Stockfish did not run on the same hardware.

So it's not clear if the algorithm is better or the algorithm was just run faster.


From the pdf: "AlphaZero searches just 80 thousand positions per second in chess...compared to 70 million for Stockfish"

But even that isn't the important part, in my opinion. They basically created an ML solution that can learn to play these various games in very little time (hours), while beating previous engines (by decent to ridiculous margins). They've created a more generalised version of their previous AlphaGo engine, which may be very useful in future.


'AlphaZero searches just 80 thousand positions per second'

As I understand it, we don't really know what AZ does when it evaluates a position, as it was not explicitly programmed. It could be doing something that is similar to evaluating more positions.


I was assuming that AZ is using a tree search strategy similar to conventional chess engines but with a neural network as a more sophisticated board evaluation function. If true (is it?) you can tell how many positions it evaluates per unit of time.


No, AZ does not use a tree search similar to conventional chess engines'. That's an actual surprise. The neural network is used for two things: evaluation, yes, but also, much more importantly, search selectivity.

In the AlphaGo Zero paper, they show that selectivity is so important that playing from selectivity alone (that is, asking the neural network which move it would search first, and playing that move with no search at all) results in professional-level play; see Figure 6b. Fan Hui level, not Lee Sedol level, but still.
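
For reference, the paper's selection rule (PUCT) adds a prior-weighted exploration bonus to each move's value estimate; a rough sketch, with c_puct as a tunable constant of my own choosing:

    import math

    def select_child(children, c_puct=1.0):
        # children: list of (prior_p, visit_count_n, mean_value_q) per legal move
        total_visits = sum(n for _, n, _ in children)
        def puct(p, n, q):
            return q + c_puct * p * math.sqrt(total_visits) / (1 + n)
        return max(range(len(children)), key=lambda i: puct(*children[i]))

Playing "from selectivity alone" then amounts to taking the argmax over the priors p with no search at all.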


That is game-dependent, so we can't be sure it would result in pro-level play in chess. In fact, it is very possible that it wouldn't, because chess has a much smaller branching factor than Go (and many more practically forced moves, etc.).

Also, changing the heuristic you use to choose candidates (selectivity) doesn't mean you're not doing search anymore!


they both use a tree search, though it's a different tree search algorithm.


Still, when you look at the games, it's hard not to think something genuinely new has happened. Several grandmasters have expressed amazement at the style of play. AlphaZero won some games in a romantic style that's reminiscent of old champions. It's definitely not the kind of play we've been accustomed to with chess engines: AlphaZero seems to rely on a deep strategic understanding of piece placement and dynamism opportunities.

It's difficult to attribute some of these wins to a simple superiority in computing power.


Imagine a computer with a simple brute force algorithm but unlimited computing power. It would win 100:0 against AZ.

Would the grandmasters look at the games and say "Yes, it won every time, but its style is rather clumsy"?


Hard to say, since we don't know what such perfect play would look like. But I expect it would look quite boring, yes.

It's important to state, though, that calling an engine's play "romantic" is not a value judgement. Stockfish and other engines play in a way that gives a clear concrete advantage after X moves, whereas AlphaZero relatively prefers moves that are "creative" or "beautiful" in the sense that there's less of a clear, definite benefit, but instead some sort of slight positional advantage or more attacking chances or something similar. It's harder to prove that it was a good move, but it just "feels" good. In that sense humans enjoy it, either because it's surprising or maybe because it's closer to how we're able to think.

As an example, there is one game I saw analyzed where AlphaZero gradually gives up more and more material with little clear compensation, but gradually its position gets better and better, until Stockfish is completely out of good moves. It then turns the corner, snatches up more material than it gave up, and converts into a winning endgame. Such an imbalanced style of play is much more interesting to watch than the previous computer slogs of slowly jockeying for slight advantages until one side can grab a pawn without giving up much and build into a win from there.


There is no unlimited computing power anywhere, though. Not even at Google. The style difference is so profound that it is difficult to attribute it to computing power alone, because we know what kind of improvement reasonable computing power gives: it makes the engine stronger but does not really change its style.


> AlphaZero won some games in a romantic style

The difference in style is likely influenced by the insertion of historical boards as input to the neural network.

The moves in the sequence are therefore more likely to look related to one another.


The neural network was only fed with games from self-play. No historical game whatsoever was given. At least that's Deepmind's claim.


I was misunderstood. By "historical" I mean that it includes the past N boards, which pushes the network to take actions correlated with the previous actions performed.

That is similar to how humans play.


Also, Stockfish was denied some initialization data that it usually uses:

The player with most strident objections to the conditions of the match was GM Hikaru Nakamura. While a heated discussion is taking place online about processing power of the two sides, Nakamura thought that was a secondary issue.

The American called the match "dishonest" and pointed out that Stockfish's methodology requires it to have an openings book for optimal performance. While he doesn't think the ultimate winner would have changed, Nakamura thought the size of the winning score would be mitigated.


While I somewhat sympathize with Nakamura in that, in the case of AlphaGo vs Lee Sedol, Lee certainly had an "opening book", I disagree with your characterization that an opening book is "usual". TCEC is a widely recognized competition for chess engines, and the TCEC rules specify no opening book. For engine-to-engine evaluation, no opening book is the usual method of evaluation.

One problem is that, in a sense, AlphaZero has an "opening book" encoded in its neural network weights. But just as it is unclear how to construct "Lee Sedol without an opening book" at all, it is unclear how to construct "AlphaZero without an opening book" in that sense. So indeed, while unusual for engine evaluation, it probably is best to play against Stockfish with an opening book.


Seems like an easy answer to this would be to just let Stockfish use its opening book?


That's the problem. Stockfish has no opening book.

While there is no doubt Stockfish can play stronger with a good opening book, keeping a book up to date with engine changes is really a full-time job. So the Stockfish project does not have any official opening book.

Personally, I think using publicly available Komodo book would have been enough, but obviously Komodo book is tuned for Komodo and everybody would complain about any book problem. In a sense, "no opening book" is the official upstream supported configuration, so it is entirely a defensible choice.


Fair enough (and thanks for dropping that knowledge bomb!), but it seems like they could choose whatever seems like the strongest configuration at the time. It seems like any opening book would make a pretty huge difference.


Given that the algorithm didn't require tuning and examined fewer moves each turn, I would say it was a better algorithm.

Lots of AI problems aren't solved with current techniques just by throwing more CPU power at them, so it would still be impressive even if it required a lot of computing power.


The thing it does require is specialized hardware (TPU) to run at a reasonable speed.


A TPU isn't specialized hardware, in the same way that a CPU or GPU isn't task-specialized. All three are generic execution platforms that are optimized for a particular type of processing, but do so in a generalized fashion that can complete many different types of tasks.


Here you can see how Stockfish improvements are tested:

http://tests.stockfishchess.org/tests

Some code or weight is changed, then they have it play to see if it leads to better performance.

Exhaustive, somewhat manual work. DeepMind's bot, on the other hand, is fully automated; they just have it running day and night, improving itself on a large hardware configuration.



Amazing. I assume AlphaZero knew the basic moves of the pieces, but had to figure out defensive moves etc. against the other computer 'on the fly'? Are those learning games included in the statistics (which include ZERO losses)?? If so, it is a remarkable learning engine.

Shades of WOPR. "This is a game nobody can win..."


Yes, it got the set of all legal moves, of which it had to choose one. It was trained for 8 hours, during which it only played against itself, using only the knowledge of which moves are legal. After self-playing for 8 hours, it played 100 games against Stockfish. In those 8 hours, it was apparently able to infer all human knowledge of chess acquired over centuries, and surpass it.

And when the legal moves are replaced with the moves of other games (like Go, for which it was actually written, or Shogi), it does exactly the same thing: redevelop millennia of human knowledge and go beyond, all within 8 hours of compute.

Makes you wonder what will happen when instead of the rules of chess, you put in the axioms of logic and natural numbers. And give it 8 months of compute.


> Makes you wonder what will happen when instead of the rules of chess, you put in the axioms of logic and natural numbers. And give it 8 months of compute.

How do you score this computation? What's your goal? There's no checkmate here.


I would assume you have to come up with some scoring function.

For example, give it a number and tell it to factor it into the prime factors. The score might be the solution with the smallest time or storage requirements. Or maybe find a way to generate a hash where the last n digits are zero.
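
e.g., a hypothetical reward for the factoring case (names made up; a primality check on the factors is omitted for brevity):

    def score(n, factors, seconds):
        prod = 1
        for f in factors:
            prod *= f
        # reward only correct factorizations; faster solutions score higher
        return 1.0 / (1.0 + seconds) if prod == n and all(f > 1 for f in factors) else 0.0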

I also remember a RadioLab episode where they talked about electronic theorem solvers and they would do something like show it a video of a double pendulum and it would come up with equations to model the behavior of that system.


> How do you score this computation? What's your goal?

Maybe give it a list of problems of varying degree of hardness, and tell it to either prove or disprove as many as possible?


You could even train it with random theorem expressions run through a conventional theorem prover. And then let it loose on the problems that are left.


I think the idea here is for life to imitate art.


It inferred different chess knowledge, though - not necessarily including all human knowledge of chess.


Amazing achievement. Still, "eight hours" is a bit misleading. That's eight hours on thousands of parallel TPUs.


I haven't seen any details for AlphaZero, but when they were training AlphaGo Zero they drastically reduced the amount of hardware, using only a single node drawing about a kilowatt of power. If we assume AlphaZero uses the same hardware, then given that a brain uses about 10 W, that would be equivalent to 800 hours. Which is still amazing.


Let me reformulate. I'm just saying that there is at least one obvious way in which Google could scale this by at least 2 to 3 orders of magnitude if they wanted to, for the cost of the electricity bill: just run it for 8 months instead of 8 hours. And I wonder which problems become feasible when you look at it that way. I mean, they just solved chess in what looks like a warm-up.


>That's eight hours on thousands of parallel TPUs.

But it's not really misleading at all. The ability to produce massive amounts of power and product is the pinnacle of our society. The fact is we could make thousands, if not millions or even billions, of TPUs if we desired, and it is a relatively easy engineering problem at that. That these things may solve, in hours, all kinds of problems mankind has had for millennia should be a wake-up call to a future that will be hard to predict.

Humans take 9 months to gestate, and no amount of parallelism will speed that up; after that it takes 18 years of hammering an education into them for them not to be completely stupid all the time. Even after that, it takes more years to become a specialist.


More interesting would be how many games it has played in that time.


44 million, according to their paper, and they used 5000 TPUs, which are capable of 4.6×10^17 operations per second.
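
(That works out as 5,000 × the first-generation TPU's ~92×10^12 8-bit ops per second ≈ 4.6×10^17.)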

(The operations the TPU can run are far simpler than what supercomputers can do, but just for the sake of comparison, the current top supercomputer in the world can do 1.25×10^17 floating point operations per second)


> Makes you wonder what will happen when instead of the rules of chess, you put in the axioms of logic and natural numbers.

If you're talking about formal proofs or maths, I'm not sure how this would apply in general, as the branching factor for each 'move' in a proof is effectively infinite. It would be interesting to see it applied to more constrained proof domains, though.


Seems like it learned by only playing against itself, and for only 4 hours (for the chess component).


> Nielsen is eager to see what other disciplines will be refined or mastered by this type of learning

> of course it goes so much further

> The ramifications for such an inventive way of learning are of course not limited to games.

>But obviously the implications are wonderful far beyond chess and other games. The ability of a machine to replicate and surpass centuries of human knowledge in complex closed systems is a world-changing tool

Okay then. Let's go beyond games already!


While interesting, it's comparing 5000+ Tensor Processing Units against 64 CPU threads. I suspect this isn't a fair comparison by watts spent.


No - 5000 TPUs were used for training, but the 100-game matches used just 4 TPUs (bottom of p4 in the paper)


from the paper:

> Training proceeded [...] using 5,000 first-generation TPUs

> We evaluated the fully trained instances of AlphaZero against Stockfish, Elmo and the previous version of AlphaGo Zero [...]. AlphaZero and the previous AlphaGo Zero used a single machine with 4 TPUs.

So the 5000 TPUs processing power or energy consumption should be compared with those spent in devising Stockfish, not running it.


>This would be akin to a robot being given access to thousands of metal bits and parts, but no knowledge of a combustion engine, then it experiments numerous times with every combination possible until it builds a Ferrari. That's all in less time that it takes to watch the "Lord of the Rings" trilogy. The program had four hours to play itself many, many times, thereby becoming its own teacher.

This absurd comparison would raise my eyebrows coming from an English tabloid.

Having said that, I looked at the author's profile and was appalled to learn he's a chess prodigy. Then I also saw he's a Chess Journalist. Apparently he became much more a journalist than a chess master...


What I'm reading: an existing chess engine that runs on much poorer hardware, and for some weird reason was deprived of its usual initialization data, achieved a 73% draw rate against a ridiculously hyped "deep" neural network/MCTS algorithm.

It's interesting that AlphaZero was finally applied to a different game, though. I wonder what architectural changes they had to make. I've read that pure MCTS isn't that good at playing Chess. How true is that?


AlphaGo and AlphaGo Zero are different from AlphaZero.


All the hardware you can throw at Stockfish probably will not make much of a difference, as DeepMind will just train AlphaZero for another couple of hours to surpass it.

Draws are common in high-level chess. There's a belief that perfect play by both sides will result in a draw.

They changed the input format and the amount of noise (which facilitates exploration) to account for the different branching factor.
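
Specifically, the paper mixes Dirichlet noise into the root move priors, with the alpha constant scaled to each game's branching factor; roughly:

    import numpy as np

    def noisy_root_priors(p, alpha=0.3, eps=0.25):
        # p: numpy array of the network's priors over legal moves at the root
        noise = np.random.dirichlet([alpha] * len(p))
        return (1 - eps) * p + eps * noise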


I'd be curious about how its strength relative to Stockfish might change as the amount of time per move is varied.


There's a graph of that in the Arxiv paper submitted yesterday: https://news.ycombinator.com/item?id=15858197 (page 7)

From the look of it, Stockfish is stronger for move times under 0.2s; anything above that, and AlphaZero is stronger.


You can see that on page 7 of the paper https://arxiv.org/pdf/1712.01815.pdf


Personally, I was always interested in whether it's now possible to use them (NNs, DL, etc.) to infer theorems on their own. Because if the difference between it and a human were as big as we see in these expert systems (trained to do only one thing), then it could provide amazing results.


Did DeepMind publish anything about this? Is this literally a straightforward plug of the AlphaGo Zero techniques into a chessboard with no novelty? Don't get me wrong, I'm impressed, I'm just looking for a more primary source.


The arxiv paper is here: https://arxiv.org/abs/1712.01815


I think the real measure of an AI's success in this field is the absence of pathological boards like this one:

https://lichess.org/analysis/8/p7/kpn5/qrpPb3/rpP2b2/pP4Q1/P...

Is it still easy to find positions that AlphaZero totally misunderstands?


Such positions are not interesting to Alpha unless it might run into them. If Alpha would never choose moves that lead to this position, it needn't have any insight into them.

If a Hold'em AI would never choose to bet 7-2 offsuit, it doesn't need to have an opinion on what to do when that bet is raised by the opponent.


Sorry, I missed this response. The goal, to me, for a chess-playing AI is not that it be very effective at the game. We have already shown that simple algorithms exist which are better than humans. The novelty presented by AlphaZero is the generalization of positional evaluation with deep learning structures. If you present AlphaZero with this board and ask it to learn how to play starting from this position, what does it discover? Does that translate to other "similar" pathological boards?


Does anyone know how well a system like AlphaZero could be applied to a field like materials science? It would seem that you could make a scoring function for how well a material meets the desired criteria.
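
For instance, a toy scoring function might look like this (the property names, targets, and weights are all made up for illustration):

    # Toy scoring function for a candidate material: a weighted sum of
    # how close predicted properties come to desired targets.
    TARGETS = {"band_gap_eV": 1.5, "density_g_cm3": 2.0}
    WEIGHTS = {"band_gap_eV": 1.0, "density_g_cm3": 0.5}

    def score(predicted):
        # Higher (closer to zero) is better.
        return -sum(WEIGHTS[k] * abs(predicted[k] - TARGETS[k])
                    for k in TARGETS)

    print(score({"band_gap_eV": 1.4, "density_g_cm3": 2.3}))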


Pretty well. There are several active research groups working with AI/ML for materials science research. Since you asked about AlphaZero, here is an article directly about the DeepMind guys working on this problem: https://qz.com/1110469/if-deepmind-is-going-to-find-the-next...


This is of course all very, very impressive, but it would be great to see more details on this. We are told AZ only started with the basic rules. What was included in the "basic rules"? How were they codified? The engine looks at 80,000 positions per second, so obviously it has some evaluation function. What is the position evaluation function? Presumably it was codified in some way at the beginning, and then got improved over the training period? It would be very interesting to see the first 100 or so games the engine played against itself.


If AZ is truly superior to Stockfish, why was Stockfish given an amount of RAM that could only be considered standard for the 1990s?


Completely dishonest title and first 1,000 words of the write-up. That's when you get to the words:

> pointed out that Stockfish's methodology requires it to have an openings book for optimal performance.

I went from amazed, utter shock like "What!! No way. This is unreal. This is absolutely unreal. What? What? What?" to a total feeling that I've been reading 1,000 words of fake news.

I feel cheated by this write-up and flagged it for this reason. They need to mention it in the first couple of sentences, not after selling the claim that it deduced 1600 years of human chess knowledge in 4 hours.


It's interesting to consider what it means when the AI can succeed without using brute force.

Suppose at every turn there are n possible future states of the game based on the rules. To avoid "brute force" the AI must be able to ignore many of those states as irrelevant. In effect, the AI is learning what to pay attention to, not just considering what might happen, thereby conserving computational resources.
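
To make "learning what to pay attention to" concrete, here's a toy sketch in Python (legal_moves, apply_move, policy, and value are hypothetical stand-ins; this illustrates pruning in general, not AlphaZero's actual MCTS):

    # Toy contrast between exhaustive lookahead and policy-pruned lookahead.
    def brute_force(state, depth, legal_moves, apply_move, value):
        moves = legal_moves(state)
        if depth == 0 or not moves:
            return value(state)
        # expands every one of the n successors: O(n^depth) nodes
        return max(brute_force(apply_move(state, m), depth - 1,
                               legal_moves, apply_move, value)
                   for m in moves)

    def pruned(state, depth, legal_moves, apply_move, policy, value, k=3):
        moves = legal_moves(state)
        if depth == 0 or not moves:
            return value(state)
        # a learned policy ranks moves; only the top k are expanded,
        # so the tree grows as O(k^depth) instead of O(n^depth)
        best = sorted(moves, key=lambda m: policy(state, m), reverse=True)[:k]
        return max(pruned(apply_move(state, m), depth - 1,
                          legal_moves, apply_move, policy, value, k)
                   for m in best)

The interesting part is that the policy doing the pruning is learned from self-play rather than hand-written, which is where the saved computation comes from.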

Chess and Go are interesting for two nearly opposite reasons: 1) because the game trees are too large for humans to find the reasoning obvious, and 2) because the input to the reasoning is simply a small grid of rule-constrained pieces, easily perceived by humans.

But when you think of AI in an information-theoretic way, so that given representative training data the system (if large enough) will always "learn" perfectly, it's not really all that remarkable. It's just a different computational way of doing the same transformation from input states to moves. Given a problem (chess, Go, etc.), the researchers must simply learn what network structure and training regimen will do the job with the least computational cost.

To see why this is relevant, consider a deep learning model that could continually generate successive digits of pi (or primes) without having the concept baked in already. Would the result be computationally cheaper than a highly optimized brute-force algorithm? No, because what it would "learn" would be something already known to humans. Perfect chess is simply a function from input states to moves whose definition humans do not already know. Most humans do know the definition of this function for the game of tic-tac-toe by the time they reach middle school.

I'd argue that while this is useful, it's ultimately not hard. Comparing it with Stockfish mainly demonstrates how hard chess is for humans to reason about, and hence how hard it is for humans to write non-brute-force algorithms to solve it.

Thus, I think this is an example of "weak AI" even though humans associate chess with high degrees of exceptional human cognition. Chess data contains no noise, so the algorithm is dealing only with signals of varying degrees of utility.

I'm looking forward to AI that can be useful in the midst of lots of noise, such as AI that analyzes people's body language to predict interesting things about them, analyzes speech in real time for deception, roulette wheels for biases, and office environments for emotional toxicity.

Chess is interesting because we can't introspect to understand what makes humans good at chess (other than practice). So many human insights and intuitions are similarly opaque, yet the data is noisy enough that it will take significantly better AI to be able to do anything that truly seems super-human.


Number of humans that could put together a Ferrari from parts: ~10000?

Number of humans that can beat Stockfish: 0


Please don't post unsubstantive baity comments here. It leads to low-quality subthreads.

We detached this subthread from https://news.ycombinator.com/item?id=15870384 and marked it off-topic.


Please explain how this comment was unsubstantive and baity.


Number of cats that can meow: ~1000000000000?

Number of cats that can beat Stockfish: 0


And therefore meowing is simpler than beating Stockfish which was parent's point.


A trillion cats meowing at once would obviously destroy the earth.


Number of humans that can leverage computers to beat Stockfish: at least the number of humans on the DeepMind team.

Number of humans that can create AlphaZero: at least the number of humans on the DeepMind team.

Number of computers that can independently create AlphaZero: 0


Give it 10 years maybe.


Where does this estimate come from? You're talking about an event with no precedent, nothing even close.

Basically, a function winAt("chess") and another function, with no knowledge of chess, that can implement the first.

What are you modeling that suggests something like that in 10 years?


Try again with: number of humans that could assemble a Ferrari from discrete parts who have never seen a car before, have no training to build or maintain a car, and have no instructions other than "car must work".


But now please try Houdini and the rest. Stockfish doesn't play interestingly, and has a lower Elo than Houdini. Though I guess that new search will beat Houdini as well.


Seems like the real untapped goldmine here is to carefully observe millions of workers, notice when they are working and why, and when they are slacking off, and enforce penalties and rewards relative to their peak performance to motivate increased productivity. Measure average key presses, response times, eye engagement, fidgeting and body motion, facial expressions. Do you really need robots if you can train the human network to be more robot-like? The 21st-century assembly line is so delicious; think of the possibilities.

You're going to need stimulants to work at 100%, or your company AI will cut your pay.


Four hours of learning on a high-speed machine amounts to what, millions upon millions of games? Thousands of human lifetimes lived out in four hours. The "four hours to learn" framing is fake news and distorts the reality of how many games and trials the machine actually went through.


I don't think so. It seems to me that the whole point of AI is how many real world human years it saves by mastering and executing things fast. The fact that it can learn this well at all, after any amount of time, makes it impressive, and the fact that it learns quickly makes it all the more useful.


I disagree. I think people understand that computers are fast, so millions of games is implied in "four hours." The fact that it only took that much time, starting from nothing, is still incredibly impressive.


How fast can you exchange and learn information? How fast can you figure things out and then teach them to other humans around you?

Your processes don't scale well, beyond the biological scaling that your body has developed via evolution in the last 4 billion years. Digital processes do.


Another really impressive feat is Elon Musk's OpenAI, which defeated a number of world-class Dota 2 players in 1v1 matches.

This is a real-time strategy video game. The number of decisions you need to make in this game is mind-boggling. It takes most people many months, if not longer, just to get to the point where you're not clueless.

A recap video is at: https://www.youtube.com/watch?v=jAu1ZsTCA64


This is mostly meaningless. Of course computers are better at the twitch/mechanical aspects of video games. It's trivial to make an AI that's perfect at Street Fighter or Counterstrike, for example.


OpenAI isn't a good DotA player though, it's just very mechanically capable at this tiny subset of the game. Still impressive but not really DotA.


Still, they heavily limited the scope of actions of both the player and the agent. It's not like the research space consisted of all the items/actions available in Dota 2.


On the other hand, people quickly figured out how to beat it. And it could only play one hero.


But it still beat a bunch of top players.

I've played dota-like games in the past (at a medium-high level). The skill gap in this game is remarkably huge.

It's the type of game where a new human player would get absolutely destroyed, with a 0% chance of ever winning, even in a minimal scenario where only one hero is involved.



