I'm surprised how slow the press has been to pick this up. This seems like an amazing step forward to me. AlphaZero played only against itself as training and it beat one of the best chess AIs in the world that has been finely tuned with decades worth of human knowledge.
Now that Go and Chess are efficiently solved for AI...what's next? Are there any other interesting complete information games remaining? What's the next milestone for incomplete information games?
> Now that Go and Chess are efficiently solved for AI...what's next?
I'd like to see an AI that can learn things related to what it already knows with very little training, like humans can do.
Take chess. There's a chess variant that is popular between rounds at tournaments, called "bughouse". It's played by two teams of two, using two chess sets and two clocks. One member of the team plays white on one of the boards, and the other plays black on the other board.
For the most part, the game on each board follows the normal rules of chess. Whichever individual game ends first determines the outcome of the combined game. The teammates may talk to each other and collaborate during the game.
The big difference between the game on each board and a regular game of chess is that on your turn you can, instead of making a move on the board, pick up any piece that your partner has captured, and drop it anywhere on your board (with some restrictions, such as that a pawn cannot be dropped on the first or eighth rank).
If you take a human who has learned to play chess to a certain level, and who has never played (or even heard of) bughouse, and explain the rules to them and then have them start playing it only takes them a little while to get about as good as they are at normal chess.
The human quickly figures out what chess knowledge transfers directly to bughouse, which needs tweaking, and which needs to be thrown out. As far as I know, current AI cannot do that--to it bughouse is a completely new game.
Does it really matter that it views bughouse as a new game? If it can still learn it in a matter of hours as an entirely new game, then, although there's probably efficiency to be gained from knowledge transfer, I can't imagine it would need to figure that out until after it's done. I would think that once it's figured out how to play a few games from scratch, it could check what commonalities those have and then apply them to even more games/ideas/processes.
>How about axioms of logic as legal moves, and asking it to go from a set of axioms to open mathematical problems?
>Or chemical procedures and components as moves, and asking it to tackle a disease with a known molecular structure?
>It is not as straightforward as I make it sound, but these are complete information problems.
However, they are not adversarial games, and self-play only works for adversarial games. Also, each mathematical problem is different, and likewise for molecular structures. Additionally, each needs to be solved only once, so there is no "gradient of skill" that we know how to climb.
Theorem proving in intuitionistic logic is a two-player game and maps perfectly to the kind of Monte-Carlo Tree Search that's employed here. Except that it is far more difficult than Chess/Go/etc., since the branching factor is essentially unbounded.
Consider a formula made up of only conjunctions and disjunctions and true/false. The first player tries to prove the formula and gets to move at every disjunction and is allowed to select which side of the disjunction to prove. The second player tries to prevent the first player from finding a proof and gets to move at every conjunction, selecting a side of the conjunction to descend to. The final state here is an atomic proposition which is either true or false and determines which player won. You derive a value function from that in the same way as you do for Go or Chess.
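The game described above can be sketched in a few lines. This is a toy evaluator, not anything from the AlphaZero paper: the Prover picks a branch at every disjunction, the Refuter picks a branch at every conjunction, and the game value (does the Prover have a winning strategy?) turns out to be exactly the truth value of the formula.

```python
# Two-player proof game over and/or formulas with boolean leaves.
# Formulas are nested tuples: ("or", left, right), ("and", left, right),
# or a bare bool for an atomic proposition.

def game_value(formula):
    """Return True iff the Prover has a winning strategy."""
    if isinstance(formula, bool):      # leaf: atomic proposition decides the game
        return formula
    op, left, right = formula
    if op == "or":                     # Prover to move: one winning branch suffices
        return game_value(left) or game_value(right)
    if op == "and":                    # Refuter to move: Prover must win both branches
        return game_value(left) and game_value(right)
    raise ValueError(f"unknown connective: {op}")

# (True or False) and (True or True): the Prover wins
f = ("and", ("or", True, False), ("or", True, True))
print(game_value(f))  # True
```

A value function for MCTS would be derived from this terminal outcome exactly as it is for Go or Chess; the hard part, as noted below, is the unbounded branching once quantifiers enter the picture.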
You can extend this idea to full first-order intuitionistic logic and probably also to higher-order logics, as well as many different modal logics. There are also formulations of classical logic as a single player game, but that doesn't seem to be very useful here.
Unfortunately, what AlphaZero can do is much more restrictive than that -- it needs a very small, fixed state of the world and set of possible moves.
While chess is very complicated to play, the state can be represented by (at worst) an 8x8x6 boolean array (board, one of 6 possible pieces), and for Go a 19x19x2 boolean array. There is nothing similar for logic, and deep learning breaks down (in my experience) once you don't have that nice regular input space.
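To make the "fixed tensor" point concrete, here is a toy sketch of the one-plane-per-piece-type encoding the comment describes. (Real engines typically use 12 planes, 6 piece types x 2 colors, plus extra planes for side-to-move, castling rights, and so on, but the shape of the idea is the same: a small, fixed, grid-shaped input that convolutional networks handle well.)

```python
# Encode a chess position as 6 boolean planes of 8x8, one per piece type.
# This simplified encoding ignores piece color.

PIECES = ["pawn", "knight", "bishop", "rook", "queen", "king"]

def empty_planes():
    # planes[p][rank][file] == True iff a piece of type p sits on that square
    return [[[False] * 8 for _ in range(8)] for _ in PIECES]

def encode(position):
    """position: dict mapping (rank, file) -> piece-type name."""
    planes = empty_planes()
    for (rank, file), piece in position.items():
        planes[PIECES.index(piece)][rank][file] = True
    return planes

planes = encode({(0, 4): "king", (7, 4): "king"})
print(planes[PIECES.index("king")][0][4])  # True
```

A logical formula has no such fixed-size grid representation, which is the commenter's point.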
> How about axioms of logic as legal moves, and asking it to go from a set of axioms to open mathematical problems?
I'm not sure how you could train it against itself. The branching factor would be infinite as well so I don't know how you'd constrain the legal moves. For example, maybe rewriting (a * 2) to (a + a) or (a * 4 - a * 2) and so on for every theorem you know would be useful in a proof, plus you can invent your own theorems as intermediate steps.
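The branching-factor explosion is easy to see with a toy rewriter. The two rules below are the ones mentioned in the comment; applying them at every matching subterm already generates several legal "moves" from one tiny expression, and every additional known theorem multiplies that further.

```python
# Toy term rewriter. Terms are tuples ("*", a, b), ("+", a, b),
# ("-", a, b), or a bare symbol/number.

def rewrites(term):
    """Yield every term reachable by one rewrite anywhere inside `term`."""
    # rule 1: (x * 2) -> (x + x)
    if isinstance(term, tuple) and term[0] == "*" and term[2] == 2:
        yield ("+", term[1], term[1])
    # rule 2: x -> (x * 2 - x), applicable to *every* subterm
    yield ("-", ("*", term, 2), term)
    # recurse: rewrite inside the left or right subterm
    if isinstance(term, tuple):
        op, left, right = term
        for new_left in rewrites(left):
            yield (op, new_left, right)
        for new_right in rewrites(right):
            yield (op, left, new_right)

moves = list(rewrites(("*", "a", 2)))
print(len(moves))  # 4 moves already, from just two rules and one small term
```

With thousands of known theorems as rules, plus the freedom to invent intermediate lemmas, the move set at each state is effectively unbounded, which is what breaks the fixed-action-space assumption.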
>Now that Go and Chess are efficiently solved for AI...what's next?
Chess is not "solved." Solving chess would mean knowing with absolute certainty the best move in any position. It would also mean knowing the best first move. We do not have technology with the computational capability to "solve chess." We have technology that can play chess better than any human, but that's not the same as solving it.
I'd like to see a game where each player can take a variable number of actions each move. For example, Risk. (Probably not too difficult an enhancement)
I think the best current Poker AI (which can beat good humans) doesn't use neural networks, but I expect that combining counterfactual regret minimization with neural networks shouldn't be too hard.
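For a flavor of the tabular method, here is a minimal sketch of regret matching, the building block of counterfactual regret minimization (CFR), in self-play on rock-paper-scissors. Both players track cumulative regret per action and play in proportion to positive regret; the *average* strategy converges toward the uniform 1/3-1/3-1/3 equilibrium. (This is a standard textbook demo, not the architecture any particular poker bot uses.)

```python
import random

ACTIONS = 3  # rock, paper, scissors
# PAYOFF[a][b] = payoff to player 1 when p1 plays a and p2 plays b
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

def strategy(regret):
    """Play in proportion to positive cumulative regret."""
    pos = [max(r, 0.0) for r in regret]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1.0 / ACTIONS] * ACTIONS

def sample(strat):
    r, acc = random.random(), 0.0
    for a, p in enumerate(strat):
        acc += p
        if r < acc:
            return a
    return ACTIONS - 1

random.seed(0)
ITERS = 20000
regret1, regret2 = [0.0] * ACTIONS, [0.0] * ACTIONS
avg1 = [0.0] * ACTIONS

for _ in range(ITERS):
    s1, s2 = strategy(regret1), strategy(regret2)
    a1, a2 = sample(s1), sample(s2)
    for a in range(ACTIONS):
        # regret = payoff of the alternative minus payoff actually received
        regret1[a] += PAYOFF[a][a2] - PAYOFF[a1][a2]
        regret2[a] += -PAYOFF[a1][a] - (-PAYOFF[a1][a2])
        avg1[a] += s1[a]

avg1 = [x / ITERS for x in avg1]
print([round(x, 2) for x in avg1])  # roughly uniform
```

Replacing the per-state regret tables with a network that generalizes across states is exactly the "CFR + neural networks" combination the comment is speculating about.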
RTS games should be the next bigger challenge. Not only do they have incomplete information, you also have to handle widely different scales (both space and time) and have a large number of units. I expect this to require an interesting new approach to integrate large scale strategies with small scale tactics without getting stuck in local strategy optima.
> Now that Go and Chess are efficiently solved for AI...what's next?
Cleaning toilets? I would love to see AI doing degrading but useful work. Sadly, all we get is computers playing board games.
EDIT: YC is doing UBI experiments now. How about funding startups building menial labor robots, then passing the savings on to the humans who used to do the work?
The interesting thing for me (apart from the very short training period) is that this seems to be a more generalised version of their previous AlphaGo engine, which means that in future it should be even easier to adapt and use for other tasks.
There is always Infinite Chess https://en.wikipedia.org/wiki/Infinite_chess and other games with complete information but unbounded state. (Or games with bounded but very large state, such as Hex.) Monte Carlo + NN seems like a good approach, so maybe it'd be worth a go.
Imperfect information games seem like a much more interesting challenge though.
Probably. On the other hand, it wasn't five years ago that a majority of randos on the internet thought that AI would never defeat top humans at Go. (Sorry for citing such an unimportant population, but it is the one I'm a part of and whose opinions I have access to.)
While I wouldn't rule out a possible breakthrough, it's definitely a different case. Go was feared because of its enormous number of possible states, but AlphaGo circumvented the issue by not caring about it. It didn't "solve" Go; it just plays it better than humans.
Starcraft 2's state is much bigger than Go's and requires real-time action. Also, actions have to be performed in parallel in real time, which probably requires different techniques and puts a much lower bound on the number of iterations doable in a given amount of time.
I always believed that one of the core issues with Starcraft is that it is not a perfect-information game (in addition to its inherent complexity). Having to learn how the opponent plays, and to guess and make inferences about what they are doing, is somewhat new territory.
Yeah, not really. Ask anybody working on distributed systems.
When AlphaZero plays Go with itself, it can basically go as fast as it can calculate the next move. With Starcraft that's not the case: both networks have to work in sync, probably need some temporal awareness, and will probably have some limit on actions per unit of time, which basically requires a whole new approach. Of course, I could be gravely mistaken, but I would like to know how they can circumvent this.
In SC there are time constraints in that you're getting resources at a certain rate. You have to grow your capacity to pull resources and at the same time be building units to defend yourself. If you allocate resources poorly, you'll find yourself losing. AlphaGo can't ignore that, but...
I think AI does have an advantage once it starts to be competent, especially if it's interacting through APIs exclusively and not the interface, which means that its actions per minute could be astronomically higher than a human player's, with unheard-of levels of micro. At the same time, I think machine learning is almost an ideal solution to figuring out build orders. It's gonna be fast and smart. The only question is how long it will take.
I think even with unlimited APM, humans can still beat AI using cheese strategies, like an undetected cannon rush, since you can't really micro your workers against cannons (the projectile isn't dodge-able like the one from siege tanks).
Otherwise, you make a fair point and that video is amazing. AI vs AI strategy with unlimited APM would be very exciting to watch.
It is generally assumed that SC and SC2 are not actually just Rock Paper Scissors. That is, you're not obliged to guess your opponent's strategy and counter in the dark but can instead "scout" and figure out what they're doing and overall this can beat a "blind" strategy like cannon rush that doesn't respond to what the opponent's strategy is.
For example the "All ravens, all the time" Terran player Ketrok just responded to the surge in popularity of Cannon Rushes by making a tiny tweak to his opening worker movement. The revised opening spots the Cannon Rush in time to adequately defend and thus of course win.
> especially if it's interacting through APIs exclusively and not the interface, which means that its actions per minute could be astronomically higher than a human player's, with unheard-of levels of micro
There's no way they'll compete under unlimited APM rules, it wouldn't even be remotely interesting. We're trying to match wits with the AI, not the inertia and momentum of super slow fingers, keys, and mouse.
I'm sure they'll come up with an "effective APM" heuristic which compares similarly to top pros, and feed it as a constraint to the AI.
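One simple way such a constraint could look (purely hypothetical; neither the cap of 400 APM nor the mechanism is anything DeepMind has announced) is a sliding-window rate limiter that rejects agent actions once more have been issued in the last minute than a top human pro would manage:

```python
import collections

class APMLimiter:
    """Reject actions once the agent exceeds `max_apm` actions per `window` seconds."""

    def __init__(self, max_apm=400, window=60.0):
        self.max_apm = max_apm
        self.window = window
        self.times = collections.deque()  # timestamps of recent actions

    def try_act(self, now):
        """Return True if an action is allowed at time `now` (in seconds)."""
        # drop timestamps that have fallen out of the sliding window
        while self.times and now - self.times[0] >= self.window:
            self.times.popleft()
        if len(self.times) < self.max_apm:
            self.times.append(now)
            return True
        return False

lim = APMLimiter(max_apm=2, window=60.0)
print(lim.try_act(0.0), lim.try_act(1.0), lim.try_act(2.0), lim.try_act(61.0))
# True True False True
```

A real "effective APM" heuristic would likely be subtler (e.g. also limiting burst rates over short windows), but a hard cap like this is the obvious baseline.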
There's also the fact that an action you trigger now (build unit) doesn't have an immediate payoff (building takes time). In Chess it can evaluate the current board as-is, and it gets immediate feedback on each move.
I'd be interested to see if it can "plan ahead". Maybe a Chess variant where you have to submit your next move before the current player moves, or something like that.
It is not a sequential perfect-information game. Information is imperfect, actions can be taken by both sides at the same time, there will probably be an action limit per time/game frame, and the network will have to determine not only its next move but also manage the time spent calculating that move. It totally changes the challenge.
As far as I understand (and I am no expert at all), AlphaGo basically creates a heuristic of what move to play in a given situation (a heuristic built by playing against itself many, many times). Instead of trying to "break" the game, they just decided to simulate playing, and the results were good enough to outmatch humans, but we have no idea how close to the "perfect game" AlphaGo actually got.
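The "heuristic" has a concrete shape: at each tree node, AlphaGo-style search picks the child maximizing an upper-confidence score that trades the learned value estimate (exploitation) against how rarely a move has been tried, weighted by the policy network's prior (exploration). The numbers below are made up for illustration; the constant and exact formula vary between papers.

```python
import math

def puct_score(child_value, child_visits, prior, parent_visits, c=1.5):
    """PUCT-style selection score: value estimate plus an exploration bonus."""
    exploit = child_value
    explore = c * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return exploit + explore

# three candidate moves: (mean value so far, visit count, network prior)
children = [(0.52, 120, 0.40), (0.55, 80, 0.35), (0.10, 5, 0.25)]
parent_visits = sum(n for _, n, _ in children)

scores = [puct_score(q, n, p, parent_visits) for q, n, p in children]
best = max(range(len(children)), key=lambda i: scores[i])
print(best)  # 2 -- the rarely visited move gets explored despite its low value
```

Self-play then uses the visit counts from this search as training targets for the policy network, which is the loop that lets the heuristic improve without human games.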
But the whole input to that network is a 19x19 array with 3 possible states per cell, plus maybe a turn count and one bit for whose move is next. An SC2 network would have to process a graphics stream (let's say 1280x720), and would need spatial awareness (minimap), priority setting, and computational resource management. And it has to be fast enough in the first place just to follow the game.
I'm not saying that won't happen (who predicted the Go breakthrough?), but it at least seems like a much bigger challenge.
I would love to see it tackle my favorite RTS (Supreme Commander), but really, what would be interesting would be to have it attack 'real life' problems:
-logistics
-subway circulation
-city planning
-detect diseases
-optimize the energy efficiency of a building
(just armchair opinion, I don't know how well suited AZ would be to these problems. I do know that AI has already helped with some of these though)
> Are there any other interesting complete information games?
Not really, given that they already tried Shogi too. Personally I would love to see AlphaZero playing Arimaa, but it is probably not as interesting because of its short history.
I actually prefer the terms "expert iteration" and "tree-policy target" to "AlphaZero algorithm", because they emphasize what is novel in the overall system. The terms are from https://arxiv.org/abs/1705.08439
I'd like to see an AI figure out how to beat Zelda for the NES without being given any information specific to the game. This is something pretty much any human can do yet it appears daunting for AI.
Still need to solve the game model development problem and the I/O problem.
The game models for each AlphaZero domain (Chess, Shogi, and Go) look to have been created by a human, as well as the input and output translation (e.g., a human would need to intervene[1] to help AZ Chess play and win Atari 2600 Video Chess).
[1] Intervene either by doing the translation manually or by writing a program to convert AZ I/O to Atari 2600 I/O.
That's not why humans play games so why should it be for AI? We didn't consider running "solved" after the first Olympic games. AlphaZero is just better than any other algorithm at this time. A year ago that wasn't the case, in a year it may also not be the case. We learned something about developing AIs here which we can apply to the next. That doesn't mean Chess and Go are of no value to AI researchers, it shows that they are fertile ground for discovery.
Imagine how extremely frustrating it would be to have a bad base position compared to your enemy, or your enemy having island-like terrain when you build all ground units and could not scout quickly enough...
I understand what you're saying, but you will also have to realize that SC2 strategies are map and race specific. If you add in randomized maps... it really takes out a dimension of the game. It adds another dimension as well, of course, but that is just randomness.
Procedurally generated but still symmetrical maps would actually be really interesting. It would be like having a new battleground to explore every match without giving an advantage to either side.
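A toy sketch of that idea: generate a random half-map and mirror it point-symmetrically (a 180-degree rotation), so both spawn locations face identical terrain. Real SC2 maps have to balance far more than geometry, but symmetry is the natural baseline fairness guarantee.

```python
import random

def symmetric_map(size, density=0.25, seed=None):
    """Random obstacle map that is identical under 180-degree rotation."""
    rng = random.Random(seed)
    grid = [["."] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            mr, mc = size - 1 - r, size - 1 - c   # rotated counterpart
            if (r, c) < (mr, mc):                 # visit each cell pair once
                tile = "#" if rng.random() < density else "."
                grid[r][c] = tile
                grid[mr][mc] = tile               # mirror copy
    return grid

m = symmetric_map(9, seed=42)
print("\n".join("".join(row) for row in m))
# every cell equals its rotated counterpart, so neither spawn is favored
```

Generating the half-map with proper terrain features (ramps, expansions, chokes) instead of random obstacles is where the real design work would be.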
Symmetry is not the only map characteristic that gives advantages to one race over another. Terran is really good at sieging, Zerg is good at expanding early, Protoss is very bad at defending open bases, etc.. The pro-played maps are carefully designed to balance all that, and when they fail the players themselves limit their strategies according to what works and what doesn't on each map.
The players even get to choose which maps they'll play (by eliminating some map if they feel it's unfair) and in which order...
I understand the mechanics of the different races(Zerg for life) and you're right that it would completely remove the current "map selection meta" which I would consider as an important strategic part of the series games.
I question whether the author has actually played Starcraft. It's clearly vastly more complicated from an AI perspective than games like Chess or Go, and this has been obvious for some time.