As mentioned earlier, the genetic operators are independent of the problem to be solved. Let's implement all the components of the reproduction cycle. The fitness function and the encoding scheme are highly domain specific.
In accordance with the principles of object-oriented programming, the software architecture defines the genetic operators using a top-down approach: starting with the population, then each chromosome, and down to each gene.
The implementation of the genetic algorithm uses a design that is similar to the template for classifiers (refer to the Design template for classifier section in the Appendix A, Basic Concepts).
The key components of the implementation of the genetic algorithm are as follows:
Population
class defines the current set of solution candidates or chromosomes.GASolver
class implements the GA solver and has two components: a configuration object of the GAConfig
type and the initial population. This class implements an explicit monadic data transformation of the ETransform
type.GAConfig
configuration class consists of the GA execution and reproduction configuration parameters.Reproduction
type) controls the reproduction cycle between consecutive generations of chromosomes through the mate
method.GAMonitor
monitoring trait tracks the progress of the optimization and evaluates the exit condition for each reproduction cycle.The following UML class diagram describes the relation between the different components of the genetic algorithm:
Let's start with defining the key classes that control the genetic algorithm.
The Population
parameterized class (with the Gene
subtype) contains the set or pool of chromosomes. A population contains chromosomes that are a sequence or list of elements of the type inherited from Gene
. A Pool
is a mutable array used in order to avoid excessive duplication of the Chromosome
instances associated with immutable collections.
The case for mutability
It is a good Scala programming practice to stay away from mutable collections. However, in this case, the number of chromosomes can be very large. Most implementations of genetic algorithms update the population potentially three times per reproduction cycle, generating a large number of objects and taxing the Java garbage collector.
The Population
class takes two arguments:
limit
: This is the maximum size of the populationchromosomes
: This is the pool of chromosomes that define the current populationA reproduction cycle executes the following sequence of three genetic operators on a population: select
for the selection across all the chromosomes of the population (line 1
), +-
for crossover of all the chromosomes (line 2
), and ^
for the mutation of each chromosome (line 3
). Consider the following code:
type Pool[T <: Gene] = mutable.ArrayBuffer[Chromosome[T]] class Population[T <: Gene]( limit: Int, val chromosomes: Pool[T]) { def select(score: Chromosome[T]=>Unit, cutOff: Double) //1 def +- (xOver: Double) //2 def ^ (mu: Double) //3 … }
The limit
value specifies the maximum size of the population during optimization. It defines the hard limit or constraints on the population growth.
The chromosome is the second level of containment in the genotype hierarchy. The Chromosome
class takes a list of genes as parameter (code). The signature of the crossover and mutation methods, +-
and ^
, are similar to their implementation in the Population
class except for the fact that the crossover and mutable parameters are passed as indices relative to the list of genes and each gene. The section dedicated to the genetic crossover describes the GeneticIndices
class:
class Chromosome[T <: Gene](val code: List[T]) { var cost: Double = Random.nextDouble //4 def +- (that: Chromosome[T], idx: GeneticIndices): (Chromosome[T], Chromosome[T]) def ^ (idx: GeneticIndices): Chromosome[T] … }
The algorithm assigns the (un)fitting score or a cost
value to each chromosome to enable the ranking of chromosomes in the population, and ultimately, the selection of the fittest chromosomes (line 4
).
Finally, the reproduction process executes the genetic operators on each gene:
class Gene(val id: String, val target: Double, op: Operator) (implicit quantize: Quantization, encoding: Encoding){//5 lazy val bits: BitSet = apply(target, op) def apply(value: Double, op: Operator): BitSet //6 def unapply(bitSet: BitSet): (Double, Operator) //7 … def +- (index: Int, that: Gene): Gene //8 def ^ (index: Int): Unit //9 … }
The Gene
class takes three arguments and two implicit parameters, which are as follows:
id
: This is the identifier of the gene. It is usually the name of the variable represented by the gene.target
: This is the target value or threshold to be converted or discretized into a bit string.op
: This is the operator that is applied to the target value.quantize
: This is the quantization
or discretization
class that converts a double value to an integer to be converted into bits and vice versa (line 5
).encoding
: This is the encoding or bits layout of the gene as a pair of values and operators.The apply
method encodes a pair of value and operator into a bit set (line 6
). An unapply
method is the reverse operation of apply
. In this case, it decodes a bit set into a pair of value and operator (line 7
).
The implementation of the crossover (line 8
) and mutation (line 9
) operators on a gene is similar to the operations on the container chromosome.
The quantization is implemented as a case class:
case class Quantization(toInt: Double => Int, toDouble: Int => Double) { def this(R: Int) = this((x: Double) => (x*R).floor.toInt, (n: Int) => n/R) }
The first toInt
function converts a real value to an integer and toDouble
converts the integer back to a real value. The discretization
and inverse
functions are encapsulated into a class to reduce the risk of inconsistency between the two opposite conversion functions.
The instantiation of a gene converts the predicate representation into a bit string (bits of the java.util.BitSet
type) using the quantization function, Quantization.toInt
.
The layout of a gene is defined by the Encoding
class as follows:
class Encoding(nValueBits: Int, nOpBits: Int) {
val rValue = Range(0, nValueBits)
val length = nValueBits + nOpBits
val rOp = Range(nValueBits, length)
}
The Encoding
class specifies the bits layout of the gene as a number of bits, nValueBits
, to encode the value and the number of bits, nOpBits
, to encode the operator. The class defines the rValue
range for the value and the rOp
range for the operator. The client code has to be supplied to the implicit instance of the Encoding
class.
The bit set, bitset
, of the gene (encoding) is implemented by using the apply
method:
def apply(value: Double, op: Operator): BitSet = { val bitset = new BitSet(encoding.length) encoding.rOp foreach(i => //10 if(((op.id>>i) & 0x01)==0x01) bitset.set(i)) encoding.rValue foreach(i => //11 if( ((quant.toInt(value)>>i) & 0x01)==0x01) bitset.set(i)) bitset }
The bits layout of the gene is created using java.util.BitSet
. The op
operator is encoded first through its identifier, id
(line 10
). The value
is quantized by invoking the toInt
method and then encoded (line 11
).
The unapply
method decodes the gene from a bit set or bit string to a pair of values and operators. The method uses the quantization instance to cover bits into values and a convert
auxiliary function that is described along with its implementation in the source code, accompanying the book (line 12
):
def unapply(bits: BitSet): (Double, Operator) = (quant.toDouble(convert(encoding.rValue, bits)), op(convert(encoding.rOp, bits))) //12
The Operator
trait defines the signature of any operator. Each domain-specific problem requires a unique set of operations: Boolean, numeric, or string manipulation:
trait Operator {
def id: Int
def apply(id: Int): Operator
}
The preceding operator has two methods: an identifier id
and an apply
method that converts an index to an operator.
The first genetic operator of the reproduction cycle is the selection process. The select
method of the Population
class implements the steps of the selection phase to the population of chromosomes in the most efficient manner, as follows:
def select(score: Chromosome[T]=> Unit, cutOff: Double): Unit = { val cumul = chromosomes.map( _.cost).sum/SCALING_FACTOR //13 chromosomes foreach( _ /= cumul) //14 val _chromosomes = chromosomes.sortWith(_.cost < _.cost)//15 val cutOffSize = (cutOff*_chromosomes.size).floor.toInt //16 val popSize = if(limit < cutOffSize) limit else cutOffSize chromosomes.clear //17 chromosomes ++= _chromosomes.take(popSize) //18 }
The select
method computes the cumul
cumulative sum of the cost
(line 13
) for the entire population. It normalizes the cost of each chromosome (line 14
), orders the population by decreasing the value (line 15
), and applies a cutOff
soft limit function on the population growth (line 16
). The next step reduces the size of the population to the lowest of the two limits: the hard limit, limit
, or the soft limit, cutOffSize
. Finally, the existing chromosomes are cleared (line 17
) and updated with the next generation (line 18
).
The score
scoring function takes a chromosome as a parameter and returns the cost
value for this chromosome.
The natural selection process controls or manages the growth of the population of species. The genetic algorithm uses the following two mechanisms:
cutOff
value used during selection (the select
method).The cutoff
value is computed using a softLimit
user-defined function of the Int => Double
type, which is provided as a configuration parameter (softLimit(cycle: Int) => a.cycle +b
).
The four configurations and tuning parameters required by the genetic algorithm are as follows:
xOver
: This is the crossover ratio (or probability) and has a value in the interval [0, 1]mu
: This is the mutation ratiomaxCycles
: This is the maximum number of reproduction cyclessoftLimit
: This is the soft constraint on the population growthConsider the following code:
class GAConfig(val xover: Double, val mu: Double, val maxCycles: Int, val softLimit: Int => Double) extends Config { val mutation = (cycle : Int) => softLimit(cycle) }
As mentioned earlier, the genetic crossover operator couples two chromosomes to generate two offspring chromosomes that compete with all the other chromosomes in the population, including their own parents, in the selection phase of the next reproduction cycle.
We use the +-
notation as the implementation of the crossover operator in Scala. There are several options to select pairs of chromosomes for crossover. This implementation ranks the chromosomes by their fitness (or inverse cost
) value and then divides the population into two halves. Finally, it pairs the chromosomes of identical rank from each half, as illustrated in the following diagram:
The crossover implementation, +-
, selects the parent chromosome candidates for crossover using the pairing scheme described earlier. Consider the following code:
def +- (xOver: Double): Unit = if( size > 1) { val mid = size>>1 val bottom = chromosomes.slice(mid, size) //19 val gIdx = geneticIndices(xOver) //20 val offSprings = chromosomes.take(mid).zip(bottom) .map{ case (t, b) => t +- (b, gIdx) }.unzip //21 chromosomes ++= offSprings._1 ++ offSprings._2 //22 }
This method splits the population into two subpopulations of equal size (line 19
) and applies the Scala zip
and unzip
methods to generate the set of pairs of offspring chromosomes (line 20
). The +-
crossover operator is applied to each chromosome pair to produce an array of pairs of offSprings
(line 21
). Finally, the crossover
method adds offspring chromosomes to the existing population (line 22
). The xOver
crossover value is a probability randomly generated over the interval [config.xOver
, 1].
The GeneticIndices
case class defines two indices of the bit whenever a crossover or a mutation occurs. The first chOpIdx
index is the absolute index of the bit affected by the genetic operation in the chromosome (line 23
). The second geneOpIdx
index is the index of the bit within the gene subjected to crossover or mutation (line 24
):
case class GeneticIndices(val chOpIdx: Int, //23 val geneOpIdx: Int) //24
The geneticIndices
method computes the relative indices of the crossover bit in the chromosomes and genes:
def geneticIndices(prob: Double): GeneticIndices = { var idx = (prob*chromosomeSize).floor.toInt //25 val chIdx = if(idx == chromosomeSize) chromosomeSize-1 else idx //25 idx = (prob*geneSize).floor.toInt val gIdx = if(idx == geneSize) geneSize-1 else idx //26 GeneticIndices(chIdx, gIdx) }
The first chIdx
indexer is the index or rank of the gene within the chromosome to be affected by the genetic operator (line 25
). The second gIdx
indexer is the relative index of the bit within the gene (line 26
).
Let's consider a chromosome composed of 2 genes with 63 bits/elements each, as illustrated in the following diagram:
The geneticIndices
method computes the following:
chIdx
index of the gene within the chromosome and the gIdx
index of the bit within the genechIdx
index (that is the second gene) to be alteredgIdx
index (that is chIdx*64 + gIdx)First, we need to define the Chromosome
class, which takes a list of genes, code
, (for genetic code) as the parameter:
val QUANT = 500 class Chromosome[T <: Gene](val code: List[T]) { var cost: Double = QUANT*(1.0 + Random.nextDouble) //27 def +- (that: Chromosome[T], indices: GeneticIndices): //28 (Chromosome[T], Chromosome[T]) def ^ (indices: GeneticIndices): Chromosome[T] //29 def /= (normalizeFactor: Double): Unit = //30 cost /= normalizeFactor def decode(implicit d: Gene=>T): List[T] = //31 code.map( d(_)) … }
The cost (or unfitness) of a chromosome is initialized as a random value between QUANT
and 2*QUANT
(line 27
). The genetic +-
crossover operator generates a pair of two offspring chromosomes (line 28
). The genetic ^
mutation operator creates a slightly modified (1 or 2 bits) clone of this chromosome (line 29
). The /=
method normalizes the cost of the chromosome (line 30
). The decode
method converts the gene to a logic predicate or rule using an implicit conversion, d
, between a gene and its subclass (line 31
).
The implementation of the crossover for a pair of chromosomes using hierarchical encoding follows two steps:
indices.chOpIdx
crossover index and then swap the remaining genes.xoverIdx
.Consider the following code:
def +- (that: Chromosome[T], indices: GeneticIndices): (Chromosome[T], Chromosome[T]) = { val xoverIdx = indices.chOpIdx //32 val xGenes = spliceGene(indices, that.code(xoverIdx)) //33 val offSprng1 = code.slice(0, xoverIdx) ::: xGenes._1 :: that.code.drop(xoverIdx+1) //34 val offSprng2 = that.code.slice(0, xoverIdx) ::: xGenes._2 :: code.drop(xoverIdx+1) (Chromosome[T](offSprng1), Chromosome[T](offSprng2)) //35 }
The crossover method computes the index xoverIdx
of the bit that defines the crossover in each parent chromosome (line 32
). The this.code(xoverIdx)
and that.code(xoverIdx)
genes are swapped and spliced by the spliceGene
method to generate a spliced gene (line 33
):
def spliceGene(indices: GeneticIndices, thatCode: T): (T,T) ={
((this.code(indices.chOpIdx) +- (thatCode,indices)),
(thatCode +- (code(indices.chOpIdx),indices)) )
}
The offspring chromosomes are gathered by collating the first xOverIdx
genes of the parent chromosome, the crossover gene, and the remaining genes of the other parent (line 34
). The method returns the pair of offspring chromosomes (line 35
).
The crossover is applied to a gene using the +-
method of the Gene
class. The exchange of bits between the this
and that
genes uses the BitSet
Java class to rearrange the bits after the permutation:
def +- (that: Gene, indices: GeneticIndices): Gene = { val clonedBits = cloneBits(bits) //36 Range(indices.geneOpIdx, bits.size).foreach(n => if( that.bits.get(n) ) clonedBits.set(n) else clonedBits.clear(n) //37 ) val valOp = decode(clonedBits) //38 new Gene(id, valOp._1, valOp._2) }
The bits of the gene are cloned (line 36
) and then spliced by exchanging their bits along with the indices.geneOpIdx
crossover point (line 37
). The cloneBits
function duplicates a bit string, which is then converted into a (target value, operator) tuple using the decode
method (line 38
). We omit these two methods because they are not critical to the understanding of the algorithm.
The mutation of the population uses the same algorithmic approach as the crossover operation.
The ^
mutation operator invokes the same operator for all the chromosomes in the population and then adds the mutated chromosomes to the existing population, so that they can compete with the original chromosomes. We use the ^
notation to define the mutation operator to remind you that the mutation is implemented by flipping one bit:
def ^ (prob: Double): Unit =
chromosomes ++= chromosomes.map(_ ^ geneticIndices(prob))
The prob
mutation parameter is used to compute the absolute index of the mutating gene, geneticIndices(prob)
.
The implementation of the ^
mutation operator on a chromosome consists of mutating the gene of the indices.chOpIdx
index (line 39
) and then updating the list of genes in the chromosome (line 40
). The method returns a new chromosome (line 41
) that will compete with the original chromosome:
def ^ (indices: GeneticIndices): Chromosome[T] = { //39 val mutated = code(indices.chOpIdx) ^ indices val xs = Range(0, code.size).map(i => if(i== indices.chOpIdx) mutated else code(i)).toList //40 Chromosome[T](xs) //41 }
Finally, the mutation operator flips (XOR) the bit at the indices.geneOpIdx
index:
def ^ (indices: GeneticIndices): Gene = { val idx = indices.geneOpIdx val clonedBits = cloneBits(bits) //42 clonedBits.flip(idx) //43 val valOp = decode(clonedBits) //44 new Gene(id, valOp._1, valOp._2) //45 }
The ^
method mutates the cloned bit string, clonedBits
, (line 42
) by flipping the bit at the indices.geneOpIdx
index (line 43
). It decodes and converts the mutated bit string by converting it into a (target value, operator) tuple (line 44
). The last step creates a new gene from the target-operator tuple (line 45
).
Let's wrap the reproduction cycle into a Reproduction
class that uses the scoring function, score
:
class Reproduction[T <: Gene](score: Chromosome[T] => Unit)
The mate
reproduction function implements the sequence or workflow of the three genetic operators: select
for the selection, +-
(xover) for the crossover, and ^
(mu) for the mutation:
def mate(population: Population[T], config: GAConfig, cycle: Int): Boolean = (population.size: @switch) match { case 0 | 1 | 2 => false //46 case _ => { rand.setSeed(rand.nextInt + System.currentTimeMillis) population.select(score, config.softLimit(cycle)) //47 population +- rand.nextDouble*config.xover //48 population ^ rand.nextDouble*config.mu //49 true } }
The mate
method returns false (that is, the reproduction cycle aborts) if the population size is less than 3 (line 46
). The chromosomes in the current population are ranked by the increasing cost. The chromosomes with the high cost or low fitness are discarded to comply with the soft limit, softLimit
, on the population growth (line 47
). The randomly generated probability is used as an input to the crossover operation on the entire remaining population (line 48
) and as an input to the mutation of the remaining population (line 49
):
The GASolver
class manages the reproduction cycles and the population of chromosomes. The solver is defined as a data transformation of the ETransform
type using an explicit configuration of the GAConfig
type, as described in the Monadic data transformation section in Chapter 2, Hello World! (line 50
).
The GASolver
class implements the GAMonitor
trait to monitor the population diversity, manage the reproduction cycle, and control the convergence of the optimizer (line 51
).
The genetic algorithm-based solver has the following three arguments:
config
: This is the configuration of the execution of the genetic algorithmscore
: This is the scoring function of a chromosometracker
: This is the optional tracking function to initialize the monitoring function of GAMonitor
The code will be as follows:
class GASolver[T <: Gene](config: GAConfig, score: Chromosome[T] => Unit, tracker: Option[Population[T] => Unit] = None) extends ETransform[GAConfig](config) //50 with GAMonitor[T] { //51 type U = Population[T] //52 type V = Population[T] //53 val monitor: Option[Population[T] => Unit] = tracker def |>(initialize: => Population[T]): Try[Population[T]] = this.|> (initialize()) //54 override def |> : PartialFunction[U, Try[V]] //55 }
This explicit data transformation has to initialize the U
type of an input element (line 52
) and the V
type of an output element (line 53
) for the prediction or optimization method, |>
. The optimizer takes an initial population as the input and generates a very small population of the fittest chromosomes from which the best solution is extracted (line 55
).
The population is generated by the |>
method ( => Population[T])
that takes the constructor of the Population
class as an argument (line 54
).
Let's briefly take a look at the GAMonitor
monitoring trait assigned to the genetic algorithm. The trait has the following two attributes:
monitor
: This is an abstract value to be initialized by classes that implement this trait (line 55
).state
: This is the current state of the execution of the genetic algorithm. The initial state of the genetic algorithm is GA_NOT_RUNNING
(line 56
).The code will be as follows:
trait GAMonitor[T <: Gene] extends Monitor { self: { def |> :PartialFunction[Population[T],Try[Population[T]]] } => //55 val monitor: Option[Population[T] => Unit] //56 var state: GAState = GA_NOT_RUNNING //57 def isReady: Boolean = state == GA_NOT_RUNNING def start: Unit = state = GA_RUNNING def isComplete(population: Population[T], remainingCycles: Int): Boolean = { … } //58 }
The state
of the genetic algorithm can only be updated in the |>
method through an instance of the GAMonitor
class. (line 55
).
Here is a subset of the possible state of the execution of the genetic algorithm:
sealed abstract class GAState(description: String) case class GA_FAILED(description: String) extends GAState(description) object GA_RUNNING extends GAState("Running")
The solver invokes the isComplete
method to test the convergence of the optimizer at each reproduction cycle (line 58
).
There are two options for estimating that the reproducing cycle is converging:
Let's consider the following implementation of the genetic algorithm solver:
override def |> : PartialFunction[U, Try[V]] = { case population: U if(population.size > 1 && isReady) => { start //59 val reproduction = Reproduction[T](score) //60 @tailrec def reproduce(population: Population[T], n:Int): Population[T] = { //61 if( !reproduction.mate(population, config, n) || isComplete(population, config.maxCycles -n) ) population else reproduce(population, n+1) } reproduce(population, 0) population.select(score, 1.0) //62 Try(population) } }
The optimizing method initializes the state of execution (line 59
) and the components of the reproduction
cycle (line 60
). The reproduction cycle (or an epoch) is implemented as a tail recursion that tests whether the last reproduction cycle has failed or whether the optimization has converged toward a solution (line 61
). Finally, the remaining fittest chromosomes are reordered by invoking the select
method of the Population
class (line 62
).