Blog<T>: May 2011

Disclaimer: I'll be taking a departure from my usual weight loss/running posts this month to talk about my latest Scala project. So those of you who are not interested in programming or neural networks can feel free to tune out.

...Continued

My first post on neural networks and genetic algorithms explained the basics. The second one showed some code and talked a little about the implementation of the XOR function. For my final post, I've created a simple game and trained a neural network to play it.

The Game
The game is simple. It consists of a square area that is populated with targets in random locations, and the player. The objective is to move the player such that it "picks up" (touches) some targets while avoiding others. Picking up a good target awards the player 3 points, while touching a bad target subtracts 5 points. The player is judged on how many points he can accumulate after 100 turns.

The above image shows what the board looks like. The black circle is the player. The green circles are the "good" targets (+3 points) and the red circles are the bad ones (-5 points). The light green diamonds are the good targets that have been picked up and the orange stars are the bad targets that have been touched.
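To make the scoring rule concrete, here is a minimal sketch. The names `Target`, `Good`, `Bad`, and `score` are made up for illustration and are not taken from the actual game code; only the +3 and -5 point values come from the description above.

```scala
object ScoringSketch {
  // Hypothetical model of the two target kinds described above
  sealed trait Target { def points: Int }
  case object Good extends Target { val points = 3 }   // green circle
  case object Bad extends Target { val points = -5 }   // red circle

  /** Sums the points for every target the player touched during a game. */
  def score(touched: Seq[Target]): Int = touched.map(_.points).sum
}
```

For example, `ScoringSketch.score(Seq(ScoringSketch.Good, ScoringSketch.Good, ScoringSketch.Bad))` gives 1, since 3 + 3 - 5 = 1.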

The AI
The AI for the player consists of a neural network trained by a genetic algorithm. The network itself has a number of inputs including player position, the player's last move, the closest 4 good targets (the dark green circles), and the closest 4 bad targets (dark red circles). There are 2 hidden layers of about 20 and 5 neurons respectively. The output layer has 2 neurons: one for the horizontal movement and one for the vertical movement.

The game itself constrains how far the player can move on a turn, which means that the output neurons mostly just give a direction vector and not an actual offset position. However, if the magnitude of this direction vector is less than the maximum turn distance, the player will use that for its actual distance. This allows the player to potentially make fine-tuned moves.
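The movement rule might look roughly like the sketch below. This is an illustration under assumptions, not the game's real code: `clampMove` and `maxStep` are hypothetical names, and I'm assuming plain Euclidean distance for the "maximum turn distance".

```scala
object MoveSketch {
  /** Limits a raw (dx, dy) network output to at most maxStep in length.
    * Shorter vectors pass through unchanged, which allows fine-tuned moves. */
  def clampMove(dx: Double, dy: Double, maxStep: Double): (Double, Double) = {
    val length = math.hypot(dx, dy)
    if (length <= maxStep || length == 0.0) (dx, dy)
    else {
      // Keep the direction, shrink the magnitude to maxStep
      val scale = maxStep / length
      (dx * scale, dy * scale)
    }
  }
}
```

A long vector such as (3.0, 4.0) with `maxStep = 1.0` is scaled down to length 1 in the same direction, while a short one such as (0.1, 0.2) is returned as-is.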

The training algorithm ran for 1000 generations with a population size of about 500. The training data was a set of 4 randomly generated boards and 1 static board. The fitness of each individual is basically the score for each board plus a small factor based on how quickly the player accumulates his score. This selects primarily for the highest score, but secondarily for the speed at which an individual can find the good targets.
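One way a fitness function with that shape could be written is sketched below. The exact formula used in training isn't given here, so everything in this block is an assumption: the speed factor is taken to be the mean running score over all turns, weighted by a hypothetical `speedWeight`.

```scala
object FitnessSketch {
  /** cumulative(i) is the player's running score after turn i on one board. */
  def boardFitness(cumulative: Seq[Int], speedWeight: Double = 0.01): Double =
    if (cumulative.isEmpty) 0.0
    else {
      val finalScore = cumulative.last.toDouble
      // Mean running score: higher when points arrive in earlier turns
      val speedBonus = cumulative.sum.toDouble / cumulative.size
      finalScore + speedWeight * speedBonus
    }

  /** Total fitness over all training boards. */
  def fitness(boards: Seq[Seq[Int]]): Double = boards.map(b => boardFitness(b)).sum
}
```

With this shape, two players that both finish a board on 6 points are ranked by how early they got there, but the small weight keeps the final score as the dominant term.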

The Results
The algorithm worked well. Here's a sample of the best individuals at the end of several generations:

This is basically the best individual out of a set of 500 randomly generated networks. As you can see it does a pretty good job of avoiding the bad targets, but it gets stuck on the left side pretty quickly, not knowing where to go.

By the 20th generation, the best individual is a little better at picking up the good targets. But towards the end, it gets a little confused, oscillating back and forth as it sees different targets.

Generation 100 has stopped caring about the bad targets so much. It looks like it's preferring to move in a single direction for a number of turns before making a change. This probably has to do with the fitness function's secondary selector which is based on the speed at which the score is accumulated.

Here are links to generations 200 and 500. You can see the player getting better at quickly picking up the good targets.

By generation 1000 the player is almost able to get all of the good targets in the 100 turns. It is also reasonably good at avoiding the bad targets although there are some odd moments where it seems like it's deliberately picking them up.

Lessons Learned
You've probably noticed that the neural network only moves the player diagonally. This is largely because of the activation function, which limits each output to between -1.0 and 1.0: excessively large sums come out around 1, while excessively low sums come out around -1. Comparatively, the range of sums that produces something strictly between -1 and 1 is a small target to hit. This means that the <-1,-1>, <-1,1>, <1,-1>, and <1,1> moves are somewhat selected for because they are the easiest for the network to attain. If I were to do it again, I'd probably drop the activation function entirely and just use the output neurons as a direction vector.
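The saturation effect is easy to see in isolation. The sketch below assumes a tanh activation, which is only a guess; all that's known is that the outputs are limited to the -1.0 to 1.0 range.

```scala
object SaturationSketch {
  /** Output move for two pre-activation sums, squashed by tanh (an assumption). */
  def move(sumX: Double, sumY: Double): (Double, Double) =
    (math.tanh(sumX), math.tanh(sumY))
}
```

Sums like 5.0 and -7.0 both land within a hair of 1 and -1, so the move is effectively the <1,-1> diagonal. Only sums close to zero, such as 0.1, give an output that is noticeably different from the extremes.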

You also probably noticed that there is a giant leap in ability from the first generation to the 20th and 100th generations, but only a smaller leap to the 500th and 1000th generations. This is because most of the improvement in a genetic algorithm happens quickly with only small refinements in later generations. I actually had to tweak the size of mutations so that smaller increments were possible as the generations increased.
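As an illustration of shrinking mutations, here is the schedule from the XOR example in Part 2, pulled out into a standalone function. Whether the game training used these same constants isn't stated, so treat the numbers as an example only.

```scala
object MutationScheduleSketch {
  /** Maximum mutation step for a generation: starts larger and
    * settles at 0.025 from generation 50 onward. */
  def mutationSize(generation: Int): Double =
    0.025 + 2.0 * math.max(0.0, (50.0 - generation) / 1000.0)
}
```

Generation 0 gets a step of about 0.125, which shrinks linearly until generation 50 and then stays at 0.025, so later generations can only make small refinements.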

Finally, the entire training period took about 6 hours on my quad-core desktop PC. You might think that's a long time, but just think about how long it might take to actually implement the decision logic in code. I was able to do something else for those 6 hours while my PC worked tirelessly toward a solution. The brain power in a neural network and genetic algorithm goes into mapping the inputs and outputs to the problem at hand, and figuring out how to determine the fitness of an individual. But once you have those things, the solution is found automatically by the algorithm. The results might not be perfect, but you can get pretty good results.

Next Steps
I noticed a shortcoming of neural networks almost immediately. The outputs are a series of additions and multiplications. You can't divide, you can't take a square root, you can't loop over a dynamic-length list, and with an acyclic network, you can't store anything resembling a memory. You can get fairly good approximations for a lot of problems, but it can still be fairly limiting. My next area of research is something called Genetic Programming. It's the same process of evolving a population over a number of generations, but instead of the "chromosome" representing a neural network, it is the source code itself. It is an actual runnable program that changes its structure and execution to improve on the original. And since it is just a program, it is not limited by the additions and multiplications that comprise a neural network.
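To give a rough idea of what a "program as chromosome" could look like, here is a tiny expression tree with an evaluator. This is only a sketch of one common representation; the node types and names are hypothetical and don't come from any existing project of mine.

```scala
object ExprSketch {
  // A program is a tree of nodes; evolution would swap or replace subtrees
  sealed trait Expr
  case class Const(value: Double) extends Expr
  case class Var(name: String) extends Expr
  case class Add(left: Expr, right: Expr) extends Expr
  case class Div(left: Expr, right: Expr) extends Expr
  case class Sqrt(arg: Expr) extends Expr

  /** Runs the program against a set of named inputs. */
  def eval(expr: Expr, env: Map[String, Double]): Double = expr match {
    case Const(v)  => v
    case Var(n)    => env(n)
    case Add(l, r) => eval(l, env) + eval(r, env)
    case Div(l, r) => eval(l, env) / eval(r, env)
    case Sqrt(a)   => math.sqrt(eval(a, env))
  }
}
```

For instance, `eval(Sqrt(Div(Var("x"), Const(4.0))), Map("x" -> 16.0))` evaluates to 2.0, using a division and a square root that a plain sum-and-multiply network can only approximate.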

And that's all for now. We'll be returning you to your regularly scheduled fitness talk next time. Thanks for bearing with me as I strayed from the path a little bit.

...Continued

Last time, I talked about what neural networks are and what they might be used for. I also talked about a couple of the methods used to train them. If you haven't read it, I recommend you start there, because part 2 will focus heavily on the implementation.

Implementation

I've been playing around with Scala for about 2 years now. I've got a couple of unfinished projects that I've been using to learn my way around. It seemed like a natural fit, and it's more fun than Java, so that's what I went with.

I started by separating the project into 2 core concepts. The first was the neural network implementation: the collection of input neurons, the hidden layers, the output neurons, and the connections between neurons of sequential layers. The implementation of a neural network is actually pretty simple. It's basically a series of multiplications and sums. I tried to keep it simple so that it focuses solely on the calculation.
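The "multiplications and sums" can be sketched in a few lines. This is not the project's Perceptron class, just a stripped-down illustration; `layer`, `calculate`, and the nested-Seq weight layout are all assumptions made for the example.

```scala
object FeedForwardSketch {
  /** One layer: weights(j)(i) connects input i to neuron j. */
  def layer(inputs: Seq[Double], weights: Seq[Seq[Double]], biases: Seq[Double],
            activation: Double => Double): Seq[Double] =
    weights.zip(biases).map { case (row, bias) =>
      // Weighted sum of the inputs plus the bias, then the activation
      activation(row.zip(inputs).map { case (w, x) => w * x }.sum + bias)
    }

  /** Feeds the inputs through each (weights, biases) layer in order. */
  def calculate(inputs: Seq[Double], layers: Seq[(Seq[Seq[Double]], Seq[Double])],
                activation: Double => Double): Seq[Double] =
    layers.foldLeft(inputs) { case (in, (w, b)) => layer(in, w, b, activation) }
}
```

With an identity activation, a single neuron with weights (0.5, 0.25) and bias 1.0 maps the inputs (1.0, 2.0) to 0.5 + 0.5 + 1.0 = 2.0.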

The second was the mechanism by which the neural network learns. For this, I implemented both backpropagation and a genetic algorithm. However, I soon realized that the genetic algorithm could be used for other purposes besides just training a neural network. So I pulled that out into its own module.

The final structure consists of 3 parts:

The neural network.
The genetic algorithm.
The neural network learning mechanism.

The learning mechanism can further be organized into:

Backpropagation learning
Genetic algorithm learning

Most everything is implemented as Scala traits, making it easy to mix and match different implementations.
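Here's a small sketch of the mix-and-match idea. The classes below are simplified stand-ins with hypothetical names (`Net`, `StringKeys`, `Describable`), not the real traits from the project.

```scala
object MixinSketch {
  // Base class leaves key generation abstract
  abstract class Net[K](val inputKeys: Seq[K]) {
    def hiddenKey(layer: Int, index: Int): K
  }

  // Mixin that supplies String keys for hidden neurons
  trait StringKeys { self: Net[String] =>
    def hiddenKey(layer: Int, index: Int): String = "Hidden-" + layer + "-" + index
  }

  // Independent mixin that adds a description to any Net
  trait Describable[K] { self: Net[K] =>
    def describe: String = "inputs=" + inputKeys.mkString(",")
  }

  // Behaviours are chosen at the point of construction
  val net = new Net[String](Seq("1", "2")) with StringKeys with Describable[String]
}
```

Each trait adds one capability, so swapping a behaviour means changing a `with` clause rather than editing the base class.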

The Code

So, that's the boring stuff. Let's see some code (to see more, you can check out the repository on GitHub). The XOR function is used in just about every neural network example out there on the web. It's well-understood, simple, and non-linear. And it's also easy to show example code in a blog. :)

This first example shows how to set up a neural network that learns how to calculate the XOR function via backpropagation:


object XORBackProp {
  def main(args : Array[String]) : Unit = {
    val inputKeys = Seq("1","2");
    val outputKeys = Seq("out");
    val hiddenLayers = Seq(4,2)

    val testData = IndexedSeq(
     (Map("1" -> 1.0, "2" -> 0.0),Map("out" -> 1.0)),
     (Map("1" -> 0.0, "2" -> 0.0),Map("out" -> 0.0)),
     (Map("1" -> 0.0, "2" -> 1.0),Map("out" -> 1.0)),
     (Map("1" -> 1.0, "2" -> 1.0),Map("out" -> 0.0)))

    val network = new Perceptron(inputKeys,outputKeys,hiddenLayers) 
     with BackPropagation[String] with StringKey;

    //Initialize weights to random values
    network.setWeights(for(i <- 0 until network.weightsLength) yield {3 * math.random - 1})
    println(network.getWeights);

    var error = 0.0
    var i = 0
    var learnRate = .3
    val iterations = 10000
    while(i == 0 || (error >= 0.0001 && i < iterations) ){
      error = 0

      //Alternate the order in which the samples are presented on each pass
      val dataSet = if(i % 2 == 0) testData else testData.reverse
      for(data <- dataSet){
        val actual = network.calculate(data._1)("out")
        error += math.abs(data._2("out") - actual)
        network.train(data._2, learnRate)
      }
      if(i % 100 == 0){
        println(i+" error -> "+error
          +" - weights -> " + network.getWeights
          +" - biases -> " + network.getBiases);
      }

      i+=1
    }

    println("\nFinished at: "+i)
    println(network.toString)

    for(data <- testData){
      println(data._1.toString+" -> "+network.calculate(data._1)("out"))
    }
  }
}

The key is this line:


val network = new Perceptron(inputKeys,outputKeys,hiddenLayers)
with BackPropagation[String]

It sets up a simple neural network (Perceptron) with the inputs, outputs, and the number of neurons in each hidden layer. The BackPropagation trait gives it the ability to learn using backpropagation.

The rest of the network initialization is configuration for the backpropagation. "learnRate" determines the amount to change the weights based on the error from the test data. Finally, at the end, we are printing the results of the neural network when run against the test inputs:


Map(1 -> 1.0, 2 -> 0.0) -> 0.9999571451716337
Map(1 -> 0.0, 2 -> 0.0) -> 4.248112596677567E-5
Map(1 -> 0.0, 2 -> 1.0) -> 1.0000125509003892
Map(1 -> 1.0, 2 -> 1.0) -> 1.0286171998885596E-7
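To show what learnRate does in isolation, here is a single-weight update using the simple delta rule. This is a deliberately reduced illustration, not the BackPropagation trait: a real network also has to propagate the error back through its hidden layers. The `step` function is a hypothetical name.

```scala
object LearnRateSketch {
  /** One delta-rule update for a single linear neuron: output = weight * input. */
  def step(weight: Double, input: Double, expected: Double, learnRate: Double): Double = {
    val error = expected - weight * input
    // Move the weight toward the target by a fraction of the error
    weight + learnRate * error * input
  }
}
```

Starting from a weight of 0.0 with input 1.0, expected output 1.0, and a learnRate of 0.3, one step gives 0.3. Each further step closes 30% of the remaining gap, so a larger rate learns faster but can overshoot on harder problems.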

And here's a graph showing the error versus backpropagation iterations for a few different executions.

Notice that run 2 never reached an acceptable error. This is because of the local minima problem with backpropagation. Fortunately, each of these runs took about a second, so it's relatively easy to just reset the training until you get an acceptably close solution. This may not be the case for every problem, however.
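The "just reset and try again" approach could be wrapped up along these lines. The `attempt` function stands in for one full training run from fresh random weights and returns its final error; both it and `trainWithRestarts` are hypothetical names for this sketch.

```scala
object RestartSketch {
  /** Runs `attempt` up to maxRestarts times and returns the first
    * final error below targetError, or None if no run got there. */
  def trainWithRestarts(maxRestarts: Int, targetError: Double)(attempt: Int => Double): Option[Double] =
    // The iterator is lazy, so no further runs happen after a success
    Iterator.range(0, maxRestarts).map(attempt).find(_ < targetError)
}
```

Capping the number of restarts matters for the slower problems mentioned above, where each run might take far longer than a second.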

This second example shows how to set up an XOR neural network that learns via a genetic algorithm:

object XORGeneticAlgorithm2 {
  def main(args : Array[String]) : Unit = {
    val xorTestData = IndexedSeq(
      (Map("1" -> 1.0, "2" -> 0.0),Map("out" -> 1.0)),
      (Map("1" -> 0.0, "2" -> 0.0),Map("out" -> 0.0)),
      (Map("1" -> 0.0, "2" -> 1.0),Map("out" -> 1.0)),
      (Map("1" -> 1.0, "2" -> 1.0),Map("out" -> 0.0)))
  
    val popSize = 1000      //The number of individuals in a generation
    val maxGen = 100        //number of generations

    //Anonymous type that extends from PagedGANN which is an implementation of 
    //GANN (Genetic Algorithm Neural Network) 
    val gann = new PagedGANN[WeightBiasGeneticCode,String,Perceptron[String]]() 
            with ErrorBasedTesting[WeightBiasGeneticCode,String,Perceptron[String]]
            with GAPerceptron[WeightBiasGeneticCode,String]{

       
      override def getTestData = xorTestData
      override val inputKeys = Seq("1","2");
      override val outputKeys = Seq("out");
      override val hiddenLayers = Seq(6,3);
      
      override val populationSize = popSize
       
      override def mutationRate:Double = { 0.25 }
      override def mutationSize:Double = {0.025 + 2.0 * (math.max(0.0,(50.0 - getGeneration)/1000.0)) }
      override def crossoverRate:Double = { 0.9 }
      override def elitistPercentile = {0.02}
      override def minNeuronOutput:Double = -0.1
      override def maxNeuronOutput:Double = 1.1
       
      override def concurrentPages = 4
      
      override def setupNetworkForIndividual(network:Perceptron[String],individual:WeightBiasGeneticCode){
        network.setWeights(individual.weights.geneSeq)
        network.setBiases(individual.biases.geneSeq)
      }
      
      override def stopCondition():Boolean = {
        val gen = getGeneration
        val topFive = getPopulation(0,5)
        val bottomFive = getPopulation(populationSize - 5)
        println(gen+" -> "+topFive.map(_._2)+" -> "+bottomFive.reverse.map(_._2))
        (gen >= maxGen || topFive.head._2 >= 1000000)
      }
      
      override def generateHiddenNeuronKey(layer:Int,index:Int):String = {
        "Hidden-"+layer+"-"+index;
      }
    }

    //Genetic Code is 2 chromosomes (1 for weights, 1 for biases)
    val initialPop = for(i <- 0 until popSize) yield {
      val wChrom = new ChromosomeDouble((0 until gann.weightsLength).map(i => 20.0 * math.random - 10.0).toIndexedSeq)
      val bChrom = new ChromosomeDouble((0 until gann.biasesLength).map(i => 2.0 * math.random - 1.0).toIndexedSeq)
      (new WeightBiasGeneticCode(wChrom,bChrom),0.0)
    }
     
    //Setup the genetic algorithm's initial population
    gann.initPopulation(initialPop,0)
    
    //Train the network
    val network = gann.trainNetwork()
    
    //Print the result
     println(network.toString)
     for(data <- xorTestData){
      println(data._1.toString+" -> "+network.calculate(data._1)("out"))
    }
  }
}

Once again, the important part is the anonymous type that extends from PagedGANN. This is an extension of GANN, which is the marriage between the genetic algorithm and the neural network. The PagedGANN can take advantage of machines with multiple processors to calculate fitness and create each new generation.

The various overridden defs tweak the genetic algorithm slightly. For instance, mutationRate determines how frequently an individual might be mutated while creating a new generation. Likewise, mutationSize determines the maximum change of a network weight if it is mutated. Here's what gets printed at the end of the learning phase:


Map(1 -> 1.0, 2 -> 0.0) -> 1.0014781670169086
Map(1 -> 0.0, 2 -> 0.0) -> 2.588504471438824E-5
Map(1 -> 0.0, 2 -> 1.0) -> 0.9994488547053212
Map(1 -> 1.0, 2 -> 1.0) -> -2.3519634709978643E-5
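For readers who haven't seen the operators behind mutationRate, mutationSize, and crossoverRate, here is a generic sketch of mutation and single-point crossover on a chromosome of doubles. It is not the code inside PagedGANN. In particular, applying the rate per gene and using a single cut point are assumptions I'm making for the example.

```scala
import scala.util.Random

object GeneticOpsSketch {
  /** Each gene changes with probability `rate` by at most +/- `size`. */
  def mutate(genes: IndexedSeq[Double], rate: Double, size: Double, rng: Random): IndexedSeq[Double] =
    genes.map { g =>
      if (rng.nextDouble() < rate) g + (2.0 * rng.nextDouble() - 1.0) * size else g
    }

  /** Single-point crossover: the child takes a prefix from `a` and the rest from `b`. */
  def crossover(a: IndexedSeq[Double], b: IndexedSeq[Double], rng: Random): IndexedSeq[Double] = {
    val cut = rng.nextInt(a.length + 1)
    a.take(cut) ++ b.drop(cut)
  }
}
```

Passing the random generator in explicitly makes runs repeatable with a fixed seed, which helps when comparing parameter tweaks.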

And here's a graph showing the error versus generations for a few different executions.

For the most part, they all reach an acceptable approximation of the XOR function. Some take longer than others and some reach a better solution, but the random nature of a genetic algorithm can help to avoid the local minima problem. It should be noted that it's possible that a genetic algorithm might not solve the problem at all, and that it can take some tweaking of the parameters to get it to produce a good solution.

Next Time

The XOR function is well and good for explaining the basics, but it's also boring as hell. I've cooked up a better example of a neural network in action, which I'll be posting very soon.

Stay tuned!

Friday, May 27, 2011

Blog<Programming> Neural Networks - Part 3

Sunday, May 8, 2011

Blog<Programming> Neural Networks - Part 2