
Lee Sedol was also coming directly from playing a tournament against human players. It’s not clear how much he prepared for the AlphaGo match.

I’d be very curious to see a game between Lee Sedol and AlphaGo where each side was given 4–5 hours of play time instead of 2 hours. I suspect Lee Sedol would benefit more from extra time to read out moves than AlphaGo would. Or even a game where the overtime periods were extended to 4–5 minutes.

In this last game, Lee spent the entire late middlegame and endgame playing in his one-minute overtime periods, which doesn’t leave much time to carefully compare very complex alternatives.



Yep, I felt the same way. I wonder if the time constraints were optimized for AlphaGo.

One of the things I did want to see was how AlphaGo would fare in a blitz situation (i.e. really short timers).


AlphaGo played 5 informal games with shorter time controls alongside the formal games against Fan Hui (the European champion) back in October. "Time controls for formal games were 1 h main time plus three periods of 30 s byoyomi. Time controls for informal games were three periods of 30 s byoyomi."

The games were played back-to-back (formal, then informal) and AlphaGo won 3-2 in the informal games compared to 5-0 in the formal ones, so I would say worse.


Whether AlphaGo’s architecture hits diminishing returns from extra processing time faster than top humans do is a significantly different question from whether it scales down worse to a blitz game. (Moreover, the difference between 1 h main time plus 3×30 s byoyomi and only 3×30 s byoyomi is absolutely massive.)

DeepMind engineers have stated that the “cluster” version of AlphaGo only beats the “single machine” version about 70% of the time, despite the cluster version using roughly an order of magnitude more compute, which should presumably let it search several moves deeper in the full search tree.

My impression is that there are some fundamental weaknesses in the value network (as currently trained and implemented), which Lee Sedol was able to exploit. If so, giving the computer time to cover an extra move or two of search depth might not make a huge difference. Giving Lee Sedol twice as much time, however, would have had a significant impact on several of the games in this series, especially the last one. I strongly suspect that with a few extra minutes per move Lee Sedol would have avoided the poor trades in the late middlegame which cost him the game.


I think the DeepMind team might not even have thought deeply about time control. In terms of the known components of AlphaGo, how would we express the idea that a surprising move should be given more thought? For example, in game 4, AlphaGo calculated that move 78 had a 1-in-10,000 probability of being played. Is that something that could trigger a deeper read and the use of more time?
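One way to picture this is a minimal sketch in which the time budget for a move is scaled by how "surprising" the opponent's last move was according to the policy network. Everything here is illustrative: `think_time`, `policy_prior`, and the scaling constants are my inventions, not anything from AlphaGo itself.

```python
import math

def think_time(base_time: float, policy_prior: float,
               floor: float = 1e-4, max_multiplier: float = 4.0) -> float:
    """Scale the per-move time budget by the 'surprise' of the opponent's move.

    policy_prior: probability the policy network assigned to the move the
                  opponent actually played (e.g. ~1e-4 for move 78 of game 4).
    Hypothetical heuristic, not AlphaGo's actual time-management scheme.
    """
    # Surprise in bits: -log2(prior). A 1-in-10,000 move is ~13.3 bits.
    surprise = -math.log2(max(policy_prior, floor))
    # Map surprise linearly onto [1, max_multiplier]: an expected move
    # (prior near 1) gets the base budget, a shocking one gets up to 4x.
    multiplier = 1.0 + (max_multiplier - 1.0) * min(surprise / -math.log2(floor), 1.0)
    return base_time * multiplier

print(think_time(30.0, 0.5))     # near-expected move: close to the base budget
print(think_time(30.0, 0.0001))  # 1-in-10,000 move: the full 4x budget
```

The open question is where such a trigger would live: bolted on as a heuristic like this, or learned as part of the search itself.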

Another thing the commentator mentioned during the overtime: there were obvious moves that Lee Sedol seemed to spend a lot of time on. But he was actually spending most of that time thinking about other moves, having already decided what he was going to play. Is that something that could be built into AlphaGo?

Or can we look at how to train a net for time control? Is time control something that has to be wired in?
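Chess engines call this "pondering": while the opponent thinks, pre-search the positions their most likely replies would create, so a prepared answer is ready instantly. A toy sketch of the idea, with `predict_replies` and `search` as hypothetical stand-ins for a policy network and a tree search:

```python
def ponder(position, predict_replies, search, top_k=3):
    """Pre-search the opponent's top_k most likely replies.

    predict_replies(position) -> list of (move, prior) pairs
    search(position)          -> best response move
    Returns a cache mapping opponent move -> our prepared answer.
    Illustrative only; not AlphaGo's actual interface.
    """
    cache = {}
    replies = sorted(predict_replies(position), key=lambda mp: -mp[1])[:top_k]
    for move, _prior in replies:
        # Position is represented as a simple move list for this sketch.
        cache[move] = search(position + [move])
    return cache

# Toy usage with stub functions:
stub_predict = lambda pos: [("A", 0.6), ("B", 0.3), ("C", 0.1)]
stub_search = lambda pos: "answer-to-" + pos[-1]
book = ponder([], stub_predict, stub_search, top_k=2)
print(book)  # {'A': 'answer-to-A', 'B': 'answer-to-B'}
```

Whether something like this was running in the match version of AlphaGo isn't clear from the published material.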


From what I remember, the time controls were proposed by the human side and accepted by the AlphaGo team.



