What we learned in Seoul with AlphaGo (googleblog.blogspot.com)
259 points by Eldorado on March 16, 2016 | 74 comments



I think that the event will help to boost investors' confidence, and hence investment, in AI research. That is great, because it means an accelerated pace of innovation. It is a huge win for the AI field. AlphaGo showed that a special arrangement of neural networks combined with additional algorithms can deliver spectacular results. Researchers can now try more sophisticated arrangements of neural networks to achieve more ambitious results. I hope that we will now start to see more creative attempts at AI. The time it takes to finally reach general AI will depend on the magnitude of the research effort put in. Highly publicized events like this one are very positive.

AlphaGo was a combination of neural networks with a tree search algorithm. I think that very interesting things could be achieved by combining neural networks with basic knowledge representation systems (symbolic AI). These techniques were largely discredited for under-delivering in the past, but I think this is a good moment to revisit some of them and combine them with the more recent neural network techniques. They could be an interesting base to attach neural networks to. And then immerse the AI in a basic simulated world: something like Minecraft, perhaps, as Microsoft has announced it is going to do. I think that giving the neural networks some basic structure to depend on can help achieve results more easily.

In any case, I hope that high-profile AI events like AlphaGo's victory keep happening, to help further increase AI research.


This result came out of a large team that Google paid a lot of money for. Google Brain has been kicking ass internally solving really hard problems like realtime translation or ranking. Most major companies from Toyota to Baidu to IBM have research money focused on AI. Industry and investment is already focused on AI.

Relative to other research fields there is not a shortage of investment in AI, and despite the difficult market you see recent big investments in companies like Cruise or OpenAI: https://www.oreilly.com/ideas/the-current-state-of-machine-i...


As a programming language specialist, I am for all things symbolic too.

http://leanprover.github.io/presentations/20150717_CICM/#/se... The last slide hints at proof assistant + ML. I wish I knew more about what they are up to!



I'm not sure if I'd like to see AI in a compiler backend; development might become unpredictable (in terms of program efficiency).


My guess is the machine learning acts as some sort of tactic. So it communicates with Lean to generate some code (program/proof), but ultimately Lean itself decides whether the code is correct, so there is no "polluted backend".


It kind of already is.


It's awesome that they won.

I recently read the paper, and there are a couple of things you need to keep in mind to understand the scope and how general the result is.

They were using a big cluster to do a brute-force tree search (not brute-force as in exhaustive, but still brute-force as in let's throw lots of hardware at this). According to the paper, this tree search was important in improving the play.
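To make "tree search" concrete, here is a minimal sketch of the Monte Carlo tree search idea (selection via UCB1, expansion, a random rollout, backup). Everything here is illustrative: the `game` interface (`legal_moves`, `play`, `is_over`, `winner`, `player_who_moved`) is an assumed stand-in, and AlphaGo's actual search is far more elaborate, guided by its policy and value networks:

    import math
    import random

    def mcts_best_move(game, state, iterations=10_000):
        # "game" is an assumed interface: legal_moves(s), play(s, m),
        # is_over(s), winner(s), player_who_moved(s).
        root = {"state": state, "parent": None, "children": {},
                "wins": 0.0, "visits": 0,
                "untried": list(game.legal_moves(state))}

        def ucb1(parent, child, c=1.4):
            # Balance win rate (exploitation) against uncertainty (exploration).
            return (child["wins"] / child["visits"]
                    + c * math.sqrt(math.log(parent["visits"]) / child["visits"]))

        for _ in range(iterations):
            node = root
            # 1. Selection: descend through fully expanded nodes via UCB1.
            while not node["untried"] and node["children"]:
                node = max(node["children"].values(),
                           key=lambda ch: ucb1(node, ch))
            # 2. Expansion: try one new move from this node.
            if node["untried"]:
                move = node["untried"].pop()
                child_state = game.play(node["state"], move)
                child = {"state": child_state, "parent": node, "children": {},
                         "wins": 0.0, "visits": 0,
                         "untried": list(game.legal_moves(child_state))}
                node["children"][move] = child
                node = child
            # 3. Rollout: play uniformly random moves to the end of the game.
            s = node["state"]
            while not game.is_over(s):
                s = game.play(s, random.choice(game.legal_moves(s)))
            result = game.winner(s)
            # 4. Backup: credit each node from the perspective of the
            #    player whose move led into it.
            while node is not None:
                node["visits"] += 1
                if result == game.player_who_moved(node["state"]):
                    node["wins"] += 1
                node = node["parent"]
        # Play the most-visited move at the root.
        return max(root["children"], key=lambda m: root["children"][m]["visits"])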

Basically they were using a combination of approaches, like the winners of the Netflix competition a couple of years ago, where each approach on its own was pretty good, but not on the level of Sedol.

The other thing is that this was bootstrapped using a gigantic database of human plays. It's not clear to me that they could have ever achieved what they did without this. Once they had trained the neural networks up to the level of an expert player, they could make the system play against itself and learn some extra things. But the question is how far that takes you. How much can an AI, or a human, learn only by playing against itself?

Clearly, it's not yet god-like, since Sedol managed to beat it with a move it wasn't really considering. It's not clear to me how you would improve what they have now without adding yet another approach, as in the Netflix competition, where the mixed models got better through the sheer number of them.


> It's not clear to me how you would improve what they have now

They could improve the policy network, which was based on 100,000 amateur-level games. Now they could use AlphaGo self-play games, which are at the level of 9p, as a training set.

Another thing they could do is let it run more self play games in order to improve the value net even more.
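Roughly, the reinforcement-learning stage described in the paper looks like the sketch below. To be clear, `play_one_game`, `policy.update`, and `policy.snapshot` are hypothetical placeholders (a game simulator, a policy-gradient step, and a frozen copy of the network), not the real API:

    import random

    def self_play_improvement(policy, opponent_pool, games=10_000):
        for _ in range(games):
            # Play against a randomly chosen earlier version of the policy,
            # which the paper notes helps avoid overfitting to the current one.
            opponent = random.choice(opponent_pool)
            trajectory, winner = play_one_game(policy, opponent)
            for state, move, mover in trajectory:
                if mover is policy:
                    # REINFORCE-style step: push the policy toward moves
                    # that led to a win, away from moves that led to a loss.
                    reward = 1 if winner is policy else -1
                    policy.update(state, move, reward)
            # Freeze a copy and add it to the opponent pool.
            opponent_pool.append(policy.snapshot())
        return policy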


It's not yet god-like, but if you were to attach a couple nuclear warheads to the game engine, then at least it would be badass. If the machine feels insulted by a particular move, or just gets tired of the game, bye bye Seoul!


Recursive improvement.... it plays and learns from itself.


Congratulations to everyone involved in this tremendous achievement, both the engineers who created AlphaGo and the Go community for being such gracious hosts.


"while the match has been widely billed as "man vs. machine," AlphaGo is really a human achievement."

I think this is very true, and we often forget it in our rush to praise "the machines". We built them! They aren't beings; they are tools we built.


> We founded DeepMind in 2010 to create general-purpose artificial intelligence (AI) that can learn on its own—and, eventually, be used as a tool to help society solve some of its biggest and most pressing problems, from climate change to disease diagnosis.

This is really interesting to hear and absolutely fantastic news. I've often wondered if we might be able to find better (cheaper, faster, more effective) solutions to global problems like climate change by using AI. It's a thrill to hear that's actually in Google's plans.


They announced some healthcare products last month, though they won't use AI at first. It's safe to say that's part of their plans, however.

https://deepmind.com/health


While this is an interesting breakthrough for AI, let's remember that Go is only limitless in the exhaustive space of possible move sequences. So many responses to a given move are simply bad or invalid that you can prune the search tree rather easily. That's why Monte Carlo algorithms already work as well as they do. They don't beat the best of the best, but they still beat something like 90% of Go players.

But the game's rules are simple and the total state needed to store a game is rather small, so in terms of utilizing training data it's quite easy. You can fit an entire Go game in something like 2 kilobytes of memory. There are only two types of stones, and they have no meaning other than being different from each other.
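Back-of-the-envelope sketch of that claim (the encoding below is made up, but any reasonable one lands in the same ballpark): a 19x19 board has 361 points, so a move index fits in 2 bytes, and even a long game of ~300 moves comes to only ~600 bytes:

    import struct

    def encode_move(row, col):
        # Map a board coordinate to an index in 0..360 (361 could mean "pass").
        return row * 19 + col

    def encode_game(moves):
        # One unsigned 16-bit integer per move.
        return struct.pack(f"<{len(moves)}H", *moves)

    game = [encode_move(3, 3), encode_move(15, 15), encode_move(2, 16)]
    blob = encode_game(game)
    print(len(blob), "bytes for", len(game), "moves")  # 6 bytes for 3 moves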

We would need more breakthroughs for real-time games. Compared to Go, they easily have gigabytes of potentially meaningful data PER game. Not only would storing millions and millions of games become an issue, processing them would become an issue as well. And in many of those games, where strategy changes depending on which character or map you're on, it quickly balloons out in a manner that is just not reasonable.

Maybe the next step will be specialized AIs that tackle small, more easily calculable components of real-time games, which are then combined to make something decent.


The same company, DeepMind, already published about reinforcement learners reaching superhuman performance on many realtime games -- old Atari ones.

I know there's a lot of hype and ignorant confident opinions being expressed, but this sort of response seems really strange to me: in the last decade we went from nobody having a clue how we might automate Go in this lifetime, to 2014 when an article on prospects for Go by a premier researcher on game AI (https://news.ycombinator.com/item?id=11290112) did not even mention neural nets, to professional play last fall, to 5 months later crushing a top player and making high-level innovations, in a game said to be among the most deep and beautiful ever invented... and the most salient points are about how trivial it all is?


> The same company, DeepMind, already published about reinforcement learners reaching superhuman performance on many realtime games -- old Atari ones.

That is not remotely similar to League of Legends or StarCraft.


I would think LoL and StarCraft would be easier than something like Go, because they both have a fast-twitch aspect to play. An automated player would have the advantage of perfect situational awareness of the minimap and consistency in character placement and so forth. This is a big advantage in games where a single misclick or a missed visual can cost a match.


Strategy is what matters here. That's what I was getting at with the last part of my first message. We can make bots that do certain aspects of the game flawlessly and we will need to find a way to integrate them.


It's also very understandable from my perspective that we would have been able to surpass the best players, given that Monte Carlo bots are already around 5-6 dan skill level.


Don't forget that Monte Carlo bots were around 5-6 _amateur_ dan skill level. That is _very_ different from 9 dan pro.


The difference from 5 amateur dan to 9 dan pro is much smaller than the difference between 10 kyu and 5 amateur dan. I figured Google, with billions in resources thrown at Monte Carlo alone, would be able to contend with pros. We didn't exactly have huge organizations creating these bots before.


I know... don't be negative... however, that image along with its caption "Pedestrians checking in on the AlphaGo vs. Lee Sedol Go match on the streets of Seoul (March 13)" was too funny.

(It shows pedestrians ignoring the giant screen that is showing the game.)


I suppose it's a better caption than "No one on the streets of Seoul gives a shit about the AlphaGo vs. Lee Sedol Go match"


> It shows pedestrians ignoring the giant screen

I don't see that. How do you know where their eyes are directed when they are facing away from you?


Because most people tend to shift their heads to match their gaze when they watch something for more than a fraction of a second.


This argument is getting silly, but the closest two could definitely have their heads tilted.


I think part of his point is that only two people were looking.


I think it says a lot about how popular Go is in Korea that the games would even be on street signs like this.


Go is relatively popular, but usually not that popular. (The "street sign" is a giant TV screen. I guess it will just show whatever's on TV at that time.)

But last week, it seems, the whole of Korea fixed its eyes on these five matches. The last match was broadcast by all three major TV stations along with several other cable TV stations, with a combined rating of ~13%: better than most shows!

Source (in Korean): http://news.kmib.co.kr/article/view.asp?arcid=0010450700&cod...


I count 7-8 people that appear to be looking at the screen.


The comment about move 37 in game 2 - unlikely to have been made by a human - makes me wonder if AlphaGo considers the chances of an effective human response, either as a learned or innate (programmed) behavior. Unless most of its training has been against humans, I would guess it could not have learned to do that (and I don't even know if it would be a useful metric in Go.)


This isn't something AlphaGo was trained to do; that is, whether a move it has seen was made by a human or not was not part of its training model.

To take this to its fullest conclusion, imagine that AlphaGo had been trained on that information; that is, in addition to the hundreds of thousands of board states it has seen, it also had layers which encoded analysis of the players before, during, and after the game (their tweets, their Weibo messages, and so on). This is the difference between what AlphaGo does, what a human player can do, and what a true strong AI with unlimited compute power could do. AlphaGo can't play as efficiently as possible; it doesn't have that information.

A hypothetically omniscient being, or the Hand of God, can play meta-Go. It can play in a way that causes the human player to be less likely to play a winning game. As Eliezer Yudkowsky put it:

    With regards to tonight's match of Deepmind vs. Sedol, an example of an
    outcome that would indicate strong general AI progress would be if a
    sweating, nervous Sedol resigns on his first move, or if a bizarre-seeming
    pattern of Go stones causes Sedol to have a seizure.
The Hand of God could play the game in such a way that, on games 1-3, it ended the game in a board state that issued a threat to Lee Sedol's family.

Fortunately it seems we're still a ways off from playing Go against the Hand of God.


> ... or if a bizarre-seeming pattern of Go stones causes Sedol to have a seizure.

I didn't think I could have a lower opinion of Yudkowsky, but apparently I was wrong. What is this, a Gibson fan fiction?


It's just his particular way of phrasing the concept of playing against the opponent as well as the game. Sort of like poker, where you play not only against the cards on the table but also the minds of your opponents, knowing what they might do in a particular situation.


I don't think his comment was to be taken entirely seriously, but why do you have such a low opinion of him?


I really don't get the hate against Yudkowsky that I've seen on HN recently. Disagreement is one thing, but it almost seems as if a lot of people here find his ideas and writings insulting.

I don't get it; Yudkowsky is a philosopher and technologist with very interesting thoughts about an important subject that is not studied enough. It would be nice if people could substantiate their criticism rather than resort to name-calling. Everything I've read from Yudkowsky and MIRI seems dead-on: further research into long-term AI safety, sooner rather than later. I really don't get why this is problematic; I think it's a very good thing that someone smart spends their effort on this.

The comment in question was obviously meant tongue-in-cheek, to illustrate a point about potentially more efficient, hypothetical ways of winning a game.


I consider his theories the modern-day equivalent of "How many angels can dance on the head of a pin," or (according to some stories I've read) of how 19th-century folks worried that large cities would be buried in horse manure.

YMMV, of course.


When Lee got up and left the table for several minutes during the second game, I actually wondered if they were going to find him in the men's restroom in the lobby with his wrists slashed.

It was genuinely kind of spooky when the camera kept returning to his empty chair, and it was a relief when he came back.


>Based on our data, AlphaGo’s bold move 37 in Game 2 had a 1 in 10,000 chance of being played by a human.

This is meaningless without context. What algorithm? Has it been tested for calibration? Etc. I honestly don't know what I'm supposed to take away from this number.


AlphaGo was initially trained to predict the moves that humans would make, using a deep neural network and tons of game records. Then it was trained again with reinforcement learning, playing against itself by sampling moves from those probabilities and increasing the probability of moves that led to wins.

So when they say it predicted a 1 in 10,000 chance, it means it thinks it's really unlikely a human would play that move. Playing uniformly at random gives each move at worst a 1 in 361 chance, so that move must strongly violate normal human play patterns.
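The arithmetic, for concreteness:

    board_points = 19 * 19         # 361 points on an empty 19x19 board
    uniform = 1 / board_points     # ~0.28%: a uniformly random guess
    reported = 1 / 10_000          # ~0.01%: the figure from the blog post
    print(uniform / reported)      # ~27.7: the move was ~28x less likely
                                   # than even a random guess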


It's also odd that both numbers are 10,000. Maybe their probability function bottoms out there?


More likely just rounded for the purpose of easy digestion in a blogpost


In the Nature paper, move prediction was trained on 160,000 games. So naively, I expect probability can't go below 1 in 160,000.


But those have many more moves.


0 is also a thing.


They describe their algorithm at length in their Nature paper, "Mastering the game of Go with deep neural networks and tree search", D. Silver et al., 2016. http://www.nature.com/nature/journal/v529/n7587/full/nature1...


My understanding is that this is based on the policy network's output. If I understand correctly, the policy network is designed to estimate probabilities accurately, although I don't know how accurate it is for low-probability moves like this one.
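For what it's worth, the paper describes the policy network's final layer as a softmax over board positions, which is what lets its outputs be read as move probabilities. A minimal sketch (the network itself is elided; `logits` stands in for its raw per-point scores):

    import math

    def move_probabilities(logits):
        # Softmax: exponentiate (shifted for numerical stability), normalize.
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]

    # With 361 equal scores, every point gets probability 1/361 (~0.28%).
    print(move_probabilities([0.0] * 361)[0])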


Sad to see the tournament end! Fantastic entertainment. I have started playing Go again, but I don't think I will take up Go programming again.


The title is misleading. They didn't learn much in Seoul... they just went there, crushed the best Go player, and demolished Go's status as the poster child for why AI can't win at all games. I honestly have a very hard time believing that people thought computers couldn't beat humans at Go. That's exactly what computer AI excels at: give it a board position and ask it for the next best move. Finite input, finite output, almost infinite crunching power.


Because it's not infinite crunching power, and the search space is basically infinite.


Until very recently, the usual sorts of AI really sucked at winnowing through the stupidly broad and deep state trees go has. This is really a nice demonstration of how quickly modern AI has advanced.


Interesting, after the fact people say it was easy. Before, every expert and the DeepMind team said it was going to be hard to win.

Is this a cognitive bias?


This reads like a mix of a disingenuous PR statement and a typical agenda-selling article from some mainstream news org.

Let's look at just one sentence:

>We've also had the chance to see something that's never happened before: DeepMind's AlphaGo took on and defeated legendary Go player, Lee Sedol (9-dan professional with 18 world titles), marking a major milestone for artificial intelligence.

Hype-inducers: "never happened before" "legendary" "marking a major milestone"

Also, AlphaGo didn't "take on" anyone. It plays whatever games are fed into it. It might seem insignificant, but such small details are exactly how most marketing works. That is how we get "ultimate" luxury cars, "curious" banks, and insurance providers that are "always there for you".

>And because the machine learning methods we’ve used in AlphaGo are general purpose, we hope to apply some of these techniques to other challenges in the future.

AlphaGo's design has a lot of stuff highly specific to Go.


This is legendary. Most people (including me) would have thought this would not be possible for decades.

9-dan is the highest rank in Go. It is not possible to play against anyone higher.

So I am not sure why you think it isn't a big deal.


Because it's a fucking board game. There are tons and tons of more practical and impressive computer science achievements that get absolutely no news coverage.


> more practical and impressive computer science achievements

If we came out with a polynomial time solution for the graph coloring problem, you could use that same logic to say "it's just labeling colors, what's the big deal?"


The graph coloring problem is generic and guaranteed to be useful in a myriad of obvious applications. Plus, "polynomial time" is a specific criterion, whereas "somewhat better than a human champion" is an arbitrary milestone.

Besides, people who make advances in abstract problems like that do not release PR statements and get 1/1000th of the hype and coverage AlphaGo received recently.


I'm not trying to troll you here, but have you ever actually played Go? Go is so complex that the word 'complex' doesn't even begin to cover it. It's a game that requires equal parts creativity and logic. The sheer number of options absolutely dwarfs anything else out there.

This achievement will be useful in a myriad of obvious applications. And, it is one of the greatest engineering feats that I have been alive to witness.


I have played Go and I understand that it is a complex and nuanced game. However, I also understand that it's a game very heavily based on patterns and pattern-matching.

Have you read the research paper and seen how much of AlphaGo's architecture was designed around the specifics of Go gameplay? (Including a set of handcrafted Go-specific training features.)


NxN Go is PSPACE-complete.[1] That's an even bigger deal than NP-complete.

[1] According to some comment in /r/MachineLearning the other day. I haven't looked into it myself.


That is with some bounds applied to it - if we allow unbounded games (ko, superko, etc) then it is EXPTIME-complete.



To be fair, AlphaGo can only play 19x19 Go. It's unrelated to solving NxN Go.


And written language is just a string of squiggly lines...

Are you seriously failing to understand the implications?


Like what?


Oh, gee, I don't know. The first recent thing that comes to my mind and involves ML is a package for fingerprinting software via changes in a computer's power consumption. Not just that, but using it to detect malware on medical devices (which often run decades-old OSes and aren't upgradeable). Is that groundbreaking enough for you?


I assume you're talking about this company? http://www.pfpcyber.com/

That's super cool, but no I wouldn't say it's more impressive or groundbreaking than AlphaGo. They're both obviously incredibly impressive though.

I guess I don't get the point you're trying to make. Why does something have to be practical to be impressive? And why don't you think the technology used in AlphaGo is practical?


That's "just" figuring out what's running on a computer.


Sounds like these medical devices are upgradeable if you can install malware on them.


So there's a lot of regulation and compliance you run into once you start dealing with computers in a medical context. On the face of it, this makes sense; you don't want to install software that hasn't been rigorously tested and approved, because the cost of a computer failing can be human lives. If you have something that works, it's hard to overcome the inertia to change it, because by the time you've finished testing a new piece of software to make sure it's as stable as you need, it's already out of date.

So in one sense, those machines are unupgradable; you can't upgrade them without violating a number of laws and regulations.

In another sense, they're trivially upgradable; if you don't care about violating laws and regulations, you've got a playground with a number of highly documented vulnerabilities and publicized exploits.

The sorts of people who would install malware on those computers are the sorts of people who aren't really fussed about "laws and regulations"; that's not a position that people running a hospital are really able to take, though.


Spot on. The question is whether the entire "AI" discussion is fair, or also hyped by marketing departments. Is AlphaGo really AI, or is it an algorithm focused on finding the best move to win at the game of Go? When, in game 4, people were talking about it "making a mistake", was it really a mistake? Did it choose the wrong move, or did it simply make the move that the algorithm put forth?


I think they're hinting that the techniques used in AlphaGo are a promising path towards general AI. AlphaGo has NNs trained on Go boards, but champion Go players will also have parts of their brain dedicated to Go intuition that don't transfer to investment banking or playing golf. However, the techniques of reinforcement learning can be used to tackle other problems... so we hope.

Also this reminded me of a comment I read last week: "Don't anthropomorphize computers. They hate that"



