> [...] However, sharing the AlphaZero algorithm code, network weights, or generated representation data would be technically infeasible at present.
Very interesting paper overall. However, the excuse that code sharing is "technically infeasible" is wearing thin nearly 5 years after the initial AlphaZero paper was released.
My assumption is that the code is not separate from other Deepmind code. It would be very difficult for them to share the AlphaZero code without also sharing everything else Deepmind is working on.
The representations are probably not manifest in a way that would be intelligible if shared.
I don't have an explanation for why they wouldn't share the weights.
In some frames of the Deepmind documentary film on AlphaGo, we can see code for loading SSTs (a common key-value data format at Google) from GFS (the Google file system).
It is possible that the entire codebase depends on Google-only infrastructure.
That part is true, but things like that are usually not too bad by themselves. For example, you can use open-source TensorFlow to access files on Google's internal filesystem with tf.io.gfile.
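A minimal sketch of that API (the path below is just a placeholder; internally the same calls can target Google filesystems, externally they work on local disk, GCS, etc.):

```python
# Sketch of the tf.io.gfile filesystem abstraction mentioned above.
# The path is a placeholder, not a real internal path.
import tensorflow as tf

path = "/tmp/checkpoint.bin"  # placeholder; could be a GFS/CNS path internally
if tf.io.gfile.exists(path):
    with tf.io.gfile.GFile(path, "rb") as f:
        data = f.read()
```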
It's possible other infra is somewhat hairy to decouple; for example, the code they use to allocate and use GPU resources is internal.
(I work on ML at Google and we use some of Deepmind's stuff)
> Many Human Concepts Can Be Found in the AlphaZero Network.
> We demonstrate that the AlphaZero network’s learned representation of the chess board can be used to reconstruct, at least in part, many human chess concepts. We adopt the approach of using concept activation vectors (6) by training sparse linear probes for a wide range of concepts, ranging from components of the evaluation function of Stockfish (9), a state-of-the-art chess engine, to concepts that describe specific board patterns.
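In practice a probe like that is essentially a sparse linear regression from a layer's activations to a concept value. A minimal sketch of the idea with synthetic stand-in data (not the paper's actual code):

```python
# Sketch of a sparse linear concept probe: regress a concept value
# (e.g. a Stockfish evaluation term) onto a layer's activations.
# All data here is synthetic, not real AlphaZero activations.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
activations = rng.normal(size=(1000, 256))  # (positions, hidden units)
concept = activations[:, :3].sum(axis=1)    # pretend concept: depends on 3 units

X_tr, X_te, y_tr, y_te = train_test_split(activations, concept, random_state=0)
probe = Lasso(alpha=0.01).fit(X_tr, y_tr)   # L1 penalty -> sparse probe
print("held-out R^2:", probe.score(X_te, y_te))
print("nonzero probe weights:", np.count_nonzero(probe.coef_))
```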
> A Detailed Picture of Knowledge Acquisition during Training.
> We use a simple concept probing methodology to measure the emergence of relevant information over the course of training and at every layer in the network. This allows us to produce what we refer to as what–when–where plots, which detail what concept is learned, when in training time it is learned, and where in the network it is computed. What–when–where plots are plots of concept regression accuracy across training time and network depth. We provide a detailed analysis for the special case of concepts related to material evaluation, which are central to chess play.
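Roughly, such a plot is just a heatmap of probe accuracy over a (training checkpoint, layer) grid; `probe_score` below is a hypothetical stand-in for fitting a probe like the one above per cell:

```python
# Sketch of a what-when-where plot: probe accuracy over training
# checkpoints (when) and network layers (where) for one concept (what).
import numpy as np
import matplotlib.pyplot as plt

checkpoints = np.arange(11)  # "when": training progress
layers = np.arange(1, 9)     # "where": network depth

def probe_score(ckpt, layer):
    # hypothetical stand-in: a real run would fit a probe per cell
    return 1 - np.exp(-0.3 * ckpt) * np.exp(-0.2 * layer)

scores = np.array([[probe_score(c, l) for c in checkpoints] for l in layers])
plt.imshow(scores, origin="lower", aspect="auto")
plt.xlabel("training checkpoint (when)")
plt.ylabel("layer (where)")
plt.title("probe accuracy for one concept (what)")
plt.colorbar(label="regression accuracy")
plt.show()
```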
> Comparison with Historical Human Play.
> We compare the evolution of AlphaZero play and human play by comparing AlphaZero training with human history and across multiple training runs, respectively. Our analysis shows that despite some similarities, AlphaZero does not precisely recapitulate human history. Not only does the machine initially try different openings from humans, it plays a greater diversity of moves as well. We also present a qualitative assessment of differences in play style over the course of training.
I think this is great work. Interpretability is the worst problem in deep learning, as the lack of insight into what the model has learned prevents it from being useful for serious decision making.
It's not just a practical problem; it's one of the most important philosophical problems in the area, too.
Something like GPT-3 can do multi-digit arithmetic much better than chance, giving results for values it was certainly never trained on. Similarly, transfer learning, where you start training a model on some input less related to the task and then switch to inputs closer to your task at the end, can substantially reduce total training time. The task can be radically different; to use GPT-3 as an example again, pretraining on PCM audio samples encoded as text patterns, or abstract art bitmaps encoded as text patterns, and then switching to English text reduces training by a factor of about 10x compared to starting with a completely randomized model. GPT-3 is learning something about arithmetic. It's learning something that is common to music, abstract art, and English text. It might be as simple as basic patterns from geometry and arithmetic (that's my guess). But no one could even begin to point you in the direction of what the structure it is teasing out really is.
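A rough sketch of that transfer setup (tiny model, synthetic stand-in batches; nothing here is GPT-3's actual pipeline). The point is simply that the second phase starts from the first phase's weights rather than from random initialization:

```python
# Sketch of transfer learning: pretrain on a source domain, then keep
# training the same weights on the target domain. Both batch loaders
# here are synthetic stand-ins for real corpora.
import torch
import torch.nn as nn

vocab, seq_len = 128, 16
model = nn.Sequential(
    nn.Embedding(vocab, 64),
    nn.Flatten(),
    nn.Linear(64 * seq_len, vocab),  # predict the next token
)
opt = torch.optim.Adam(model.parameters())
loss_fn = nn.CrossEntropyLoss()

def synthetic_batch():  # stand-in: replace with a real corpus loader
    x = torch.randint(0, vocab, (32, seq_len))
    y = torch.randint(0, vocab, (32,))
    return x, y

def train(batch_fn, steps):
    for _ in range(steps):
        x, y = batch_fn()
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

train(synthetic_batch, steps=100)  # "pretraining" (e.g. audio-as-text)
train(synthetic_batch, steps=100)  # continue on the target domain (e.g. English)
```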
If I needed large numbers added together, I would trust a human who can explain their general addition algorithm to me, and I wouldn't trust an AI that can spit out some usually correct answers on small problems.
The explicability of our decision making accounts for something like 95% of our progress. It is why we can identify biases that evolution builds into us but that we need to fight against, or justify a decision beyond mere intuition. This is like asking how much math humans are using anyway.
Humans as a group have proven able to build everything we have today. No AI has proven able to do anything similar; there just isn't much data there to make us confident in their abilities thus far.
Conversely though, how many people are killed in the world every single day simply because of human error?
When (in the US) a 16 year old gets their license, we don't ask them to provide a formal proof of their driving technique that shows it's impossible for them to ever get in a crash. We say 'you've had the training, you've demonstrated that you can safely drive, be careful out there'.
We have a good understanding of this risk and have collectively chosen to accept it (while many countries have chosen not to accept it in the case of 16-year-olds).
You can question a human in an exam and ask them to "show your work". If you can, say, solve an equation but can't explain why you did this or that transformation, you'll rightly be failed. With current NNs you get an answer and that's it; there's no introspection.
I skimmed the article so sorry in advance if I missed it, but to me one fairly trivial way to gauge whether AlphaZero has human-like conceptual understanding of chess would be to throw a few games of Fischer random at it.
I remember that with Deepmind's Breakout AI, one very easy way to see the difference from human play was to change the shape of the paddle. Even very slight changes completely threw the AI off, so it was obvious it hadn't understood the 'breakout ontology' in a human way.
I'd expect the same from chess. Humans who understand chess at a high level obviously play worse in non-standard variants, but the familiar concepts are still in play. If an AI has a human-like grasp of high-level concepts, it ought to be pretty robust to some changes to the game rules, like changing the dimensions of the board.
Regarding the Breakout AI, I think you could somewhat easily train an AI that was robust to such changes, for example by randomizing the paddle shape in training (see the sketch after this comment). But if you train an AI on a more specific version, I wouldn't expect it to magically learn the more general problem.
In fact I'd argue it's unfair to expect an AI to automatically generalize without attempting to train it for that.
The only reason humans can quickly learn many games is that they've been exposed to a variety of tasks throughout their lives, and that we have innate biases which come into play when humans design games for humans.
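A minimal sketch of that randomization idea; `make_env` and `train_step` are hypothetical stand-ins for a real Breakout environment and RL update:

```python
# Sketch of domain randomization: sample the paddle width per episode
# so the policy cannot overfit to a single geometry.
import random

def make_env(paddle_width):
    # hypothetical stand-in for a Breakout-style environment
    return {"paddle_width": paddle_width}

def train_step(env):
    # hypothetical stand-in for one RL update against this environment
    pass

for episode in range(1000):
    width = random.uniform(0.5, 2.0)  # randomize geometry each episode
    train_step(make_env(paddle_width=width))
```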
I think many chess players will agree that the latest chess engines (Stockfish NNUE/Leela) are playing better conceptually, so it's less useful to use older ones (SF8/A0) to study learned concepts. Still cool work though.
You'll find that the abstract makes the explicit point that Alpha0 lends itself to more interesting findings as it had no exposure to how humans think about chess.
Haven't read the paper yet, but given its relevance to what I'm doing atm it's high on my list.
I think using pre-NNUE Stockfish is partly because the classic Stockfish evaluation function has a lot of human knowledge explicitly built in, in already interpretable ways, making it a good contrast for comparison (a toy example of such a term is sketched below).
The goal is to discover and analyze the concepts used by an expert AI player, so it isn't vital that the player be the strongest so long as it is expert.
AlphaZero is also more interesting when you want to know how a general game-playing AI trained from scratch approaches the game.
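To make "interpretable" concrete, here is a toy material term in the spirit of a classical evaluation function. Stockfish's real evaluation is far more elaborate, so this is only an illustrative sketch:

```python
# Toy material-balance term: the kind of hand-written, human-readable
# feature a classical evaluation function is built from.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def material_balance(fen_pieces):
    """Positive favors White; input is the piece-placement field of a FEN."""
    score = 0
    for ch in fen_pieces:
        value = PIECE_VALUES.get(ch.upper())
        if value:
            score += value if ch.isupper() else -value
    return score

print(material_balance("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"))  # 0
```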
Here is a thought experiment for beating AlphaZero. Randomly select 10K children at a very young age (say 3) and have them play chess against AlphaZero, but simply have them play the exact move suggested by AlphaZero (i.e. basically this is AlphaZero playing itself). Play 10 games per day for 10 years.
The hypothesis is some children will deeply embed the algorithms into their own playing style - leveraging the subconscious to the greatest degree possible. Basically, we are training the human mind in the same way that we train AI. Would it work? Probably not, but our current approach (studying openings, etc.) is obviously not working so it makes sense to try something new.
"Probably not" indeed. Human brains need to be stimulated in order for them to build their neural net. Blindly following instruction is not stimulating.
>but our current approach (studying openings, etc.) is obviously not working
That's part of the current approach, not the full approach. Kids that show promise in chess already train with computers today (in addition to personalized coaching and strategy study). I have no doubt that the current generation of chess players is the best of all time, and that the next generation will be even better. Even with that, I don't think any human will be able to train enough to beat even Stockfish, much less AlphaZero - just as no human will ever train enough to beat computers at arithmetic.
>> just as no human will ever train enough to beat computers at arithmetic.
Think of the amount of complex computation that must be happening in an elite gymnast, for example. Sure, they aren't solving the equations consciously, but at some level computation is happening - or even something as simple as making sense of all of the individual photons arriving at the eye. The subconscious is what I am suggesting we try to exploit. Even then, it seems likely that a "special" brain would be needed - hence why large numbers would be required in order to find it.
>Think of the amount of complex computation that must be happening in an elite gymnast, for example.
By that logic, my cat should be trainable to a Chess Grandmaster level because she performs complex computation as she navigates my backyard.
>The subconscious is what I am suggesting we try to exploit.
Is there any evidence that the subconscious is exploitable in that way ... at all? Because learning doesn't work that way - especially for higher-order knowledge like chess.
To elaborate: in the Top Chess Engine Championship (which is also used as a benchmark in the AlphaZero vs Stockfish comparison), Stockfish has won all seasons since May 2020[1,2]. A common runner-up for those seasons is LCZero[3], an engine derived from/a reimplementation of AlphaZero. Stockfish is also ahead of LCZero in Fischer random chess (as hosted at TCEC), winning 4 out of the 5 FRC tournaments.
As for how close Stockfish 8 is to a current version of Stockfish, Stockfish 15 has ~400 more Elo than Stockfish 8 as measured in their own self-tests[4] (see the expected-score sketch below).
However, when making these comparisons between AlphaZero and Stockfish it might be tempting to frame it as machine-learning-engine vs classical-engine, which is not true. Stockfish incorporates a neural network when evaluating chess positions[5].
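On the Elo point above: an Elo gap maps to an expected score via the standard logistic formula, so ~400 points implies roughly a 9% expected score for the weaker engine. A quick sketch:

```python
# Expected score from an Elo difference (standard logistic model):
# a ~400-point gap gives the stronger side about a 91% expected score.
def expected_score(elo_diff):
    """Expected score for a player rated `elo_diff` above the opponent."""
    return 1 / (1 + 10 ** (-elo_diff / 400))

print(expected_score(400))   # ~0.909 for the stronger side
print(expected_score(-400))  # ~0.091 for the weaker side
```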
But that's just the latest Stockfish: after LeelaZero implemented the AlphaZero approach and beat Stockfish, Stockfish had to add neural networks to its code to stay competitive.
My 10yo daughter plays a chess app mostly with suggestions turned on. It has substantially improved her game when we play together. Way more effective for her than my coaching, although I’m a total amateur. I think this would work without a doubt.
I think that would be transferring one particular network in a fuzzy way to a set of children, but it would not be transferring any of the training tools and feedback models that can improve the network.
The biggest failing, of course, would be that the children would not know how to play against moves that are not optimal.
This is the most important part of chess knowledge: given some suboptimal play, prove it is suboptimal by defeating the player who made the suboptimal move.
Otherwise, as some chess schools do, they would be repeating famous openings by rote, without understanding the meaning behind each move in the opening.
This ability would matter little when playing against Alpha Zero, but it would make all the difference when playing against humans.
I find the question interesting and thought-provoking, so very much in the spirit of HN. But I find it distressing that some were quick to downvote it (without even a discussion?). AFAIK, downvoting on HN is not like on other social networks: you should downvote only if the comment is irrelevant, nonsensical, or violates the community guidelines. The parent comment is, in my opinion, none of these.
AI doesn't even attempt that. All our neural networks do is spit out the final result of whatever they're doing. If we want the "reasoning" behind it, we have to build separate methods for extracting that. It's an active field of research, but much slower going than improving our AIs directly.
I always wondered if a chess engine would learn better/faster if the opening positions and piece movement rules were randomized. Has anyone tried this?
I don't think they mention anything about speeding up training. The original training time for AlphaZero was only a few hours anyway so I don't think that was ever a major constraint. I would imagine that each neural network performed best on the variant on which it was trained.
Maybe they meant the piece type is randomized, because each piece has a different movement rule. This would result in things like maybe 5 pawns and 3 rooks instead of the normal numbers.
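For what it's worth, randomizing the opening position in the Fischer-random style is easy to sketch: bishops on opposite-colored squares and the king somewhere between the rooks. A self-contained sketch, not tied to any engine:

```python
# Sketch of a Chess960-style randomized back rank.
import random

def random_back_rank():
    rank = [None] * 8
    # bishops on opposite-colored squares
    rank[random.choice([0, 2, 4, 6])] = "B"
    rank[random.choice([1, 3, 5, 7])] = "B"
    free = [i for i in range(8) if rank[i] is None]
    # queen and knights on any remaining squares
    for piece in ("Q", "N", "N"):
        square = random.choice(free)
        rank[square] = piece
        free.remove(square)
    # the last three empty squares get rook, king, rook (left to right),
    # which guarantees the king sits between the rooks
    free.sort()
    for square, piece in zip(free, ("R", "K", "R")):
        rank[square] = piece
    return "".join(rank)

print(random_back_rank())  # e.g. 'RNBQKBNR' or one of the other 960 setups
```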