As I noted in my first post, I was a little skeptical about whether a Nash Equilibrium (NE) could be evolved by taking the squared loss of each hand. My conclusions were that, given an expected value evaluation function, it was possible using a best-opponent fitness but not using a squared-loss fitness. There turns out to be a small bug in the squared loss code which caused a big change in results. Below are the results for the correct fitness function implementation.
In part 1 of the tutorial, we setup a basic experiment to evolve a neural network to play Tic-Tac-Toe against a couple of hand-coded opponents. In part 2, we're going to create a competitive coevolution experiment where the networks evolve by playing against themselves.
The Neuro-Evolution via Augmenting Topologies (NEAT) algorithm enables users to evolve neural networks without having to worry about esoteric details like hidden layers. Instead, NEAT is clever enough to incorporate all of that into the evolution process itself. You only have to worry about the inputs, outputs, and fitness evaluation.
I'm the kind of person who finds himself reading about a new technology or a cool algorithm, and tries to implement it based on the high-level description. Unfortunately, I don't always guess everything correctly, and sometimes the implementation turns out to not work; or it kind of works, but not as well as expected, which can be even worse.
A key example of this for me was when I read about Evolutionary Algorithms. At the core, it's sounds so ingeniously simple:
- Create a population of individuals
- Score the individuals based on some performance metric
- Kill off the weakest performers
- Create children from the surviving parents
- If not finished, go to #2
That's really it, right? I always thought so.
I recently read this paper where the authors claim that they are evolving psuedo-optimal strategies for poker. Given that we know evolutionary processes do not necessarily minimize losses, I was skeptical.