I would call its style unorthodox rather than nonhuman. It still plays common josekis (standard opening sequences) but often chooses uncommon variations. Its midgame is full of startling moves backed by VERY good reading. There's definitely still a discernible strategy that we mortals can learn from.
If I recall correctly, the version that beat Lee Sedol was trained on amateur games plus self-play. My guess would be that this new version relies more heavily on pro games.
> My guess would be that this new version relies more heavily on pro games.
Unlikely, since AlphaGo can now generate large numbers of "pro quality" games from scratch. I think it's far more likely it is an autodidact at this point.
A group from CMU appears to have solved heads-up no-limit hold'em. It's only a matter of time (and compute power) before a full ring game falls too.
No-limit is far more difficult than limit because of the risk of catastrophic failure. A Nash-equilibrium robot won't make any money; a robot must identify a weakness in you, then deviate from equilibrium to exploit it. As long as you're playing deep-stacked, you could simply act out Bertrand Russell's chicken story (echoing David Hume): the farmer feeds the chicken every day, so the chicken assumes this will continue indefinitely; one day, though, the chicken has its neck wrung and is killed. It's the "maniac" style. Pretend to be an idiot who plays too many hands. Don't lose your shirt. The robot will learn that you're always bluffing. Eventually you have the nuts and you take everything.
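To make the "deviate from equilibrium to exploit" point concrete, here's a minimal sketch (my own illustration, not anything from the CMU work) of the expected value of calling a river bet against an opponent who bluffs with frequency f. At the equilibrium bluff frequency B/(P+2B) the call is exactly break-even, and an over-bluffing "maniac" makes calling hugely profitable:

```python
def ev_call(bluff_freq: float, pot: float, bet: float) -> float:
    """EV of calling a river bet of `bet` into a pot of `pot`,
    in a toy model where we beat every bluff and lose to every value bet."""
    win = bluff_freq * (pot + bet)   # opponent was bluffing: we win pot + their bet
    lose = (1 - bluff_freq) * bet    # opponent had it: we lose our call
    return win - lose

pot, bet = 1.0, 1.0
f_star = bet / (pot + 2 * bet)       # equilibrium bluff frequency: 1/3 for a pot-sized bet
print(ev_call(f_star, pot, bet))     # 0.0: break-even against equilibrium play
print(ev_call(0.7, pot, bet))        # 1.1: bluffing 70% of the time is very exploitable
```

The same logic runs in reverse for the "maniac" exploit described above: once the robot's calling frequency adapts to someone who bluffs too much, the player who suddenly shows up with the nuts collects.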
If you have a deep stack you can bluff, but your chances of winning aren't high if it turns out you don't have the nuts after all. You can only lose so many times before it turns into a martingale: doubling down to recover your losses until one bad streak wipes you out.
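The martingale comparison can be made concrete: a bettor who doubles after every loss to recover can only survive roughly log2(bankroll) consecutive losses before going broke. A quick sketch (my own arithmetic, assuming a 1-unit starting bet):

```python
def survivable_streak(bankroll: int) -> int:
    """Longest run of consecutive losses a doubling (martingale) bettor can
    cover from a 1-unit starting bet: k doubled bets cost 2**k - 1 in total,
    so we want the largest k with 2**k - 1 <= bankroll."""
    return (bankroll + 1).bit_length() - 1  # floor(log2(bankroll + 1))

print(survivable_streak(100))     # 6: a 7th straight loss busts you
print(survivable_streak(10_000))  # 13: a 100x bigger bankroll buys only 7 more losses
```

The bankroll buys losses only logarithmically, which is why the strategy fails against an opponent who can wait you out.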
This especially doesn't work against multiple opponents.
Their Nature paper says "We trained the policy network p_sigma to classify positions according to expert moves played in the KGS data set. This data set contains 29.4 million positions from 160,000 games played by KGS 6 to 9 dan human players; 35.4% of the games are handicap games."
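A quick back-of-the-envelope check on those quoted figures (my own arithmetic, not from the paper):

```python
# Figures quoted from the Nature paper's description of the KGS data set.
positions = 29_400_000
games = 160_000
handicap_share = 0.354

print(positions / games)             # 183.75 positions per game on average
print(round(games * handicap_share)) # 56640 handicap games
```

Roughly 184 positions per game is consistent with typical Go game lengths, so the data set looks like it samples most positions from each game.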
It is possible that they fed it some pro games after the Fan Hui games but before the Lee Sedol games, but that would be weird; at that point it was already learning from self-play rather than trying to match human moves.
That said, I don't think that Master's better performance comes from being trained on pro games. The AlphaGo version that played Lee Sedol played much more like a human pro than Master does.
There are multiple dan scales; the KGS scale is an amateur one. I don't know exactly how the scales overlap, but I'd imagine a 9-dan professional to be somewhere around 12 dan on the amateur scale (pro ranks are also more densely spaced). However, both scales cap at 9 dan by convention.
Even the abbreviations differ: 9d (amateur dan) vs 9p (pro dan).
KGS 9 dan players are pros, or amateurs of professional strength such as former insei. The highest rating is almost 11d (the displayed rank still says 9d, but the rating graph goes even higher):
Perfect play is likely inhumanly aggressive on Black's part, with White making zero moves. Compared to that, this is a very human style of gameplay, simply based on a different strategic culture, as it were.
Black moves first, and on 3x3 and 5x5 boards perfect play ends with the whole board Black and any White stone captured. Many other board sizes don't work out that way, but Go is played on a 19x19 board. We don't know the result for 9x9 or even 7x7, so the pattern is hardly set in stone. Still, it seems likely.
Now, against perfect White play there may be moves an imperfect Black player makes that invite White to attack. But perfect play on both sides probably means any White stone gets captured, so White plays zero stones.