Genetic Algorithms

Suggest edits
Documentation > Advanced Concepts

Content:



The various methods available in OpenMOLE make an extensive use of genetic algorithms (GA). For instance, it is the case for the model calibration method, which is an optimization problem, or the search for output diversity with the PSE method, which boils down to a GA with a novelty incentive.
GAs can be smartly distributed on grid environments using an island scheme, and are able to deal with stochastic models.

About Calibration and GA 🔗

OpenMOLE provides advanced methods to help you calibrate your model. These methods automatically generate workflows to explore the parameter space of your model towards the "best" parameter set, according to a previously defined criterion or objective. This is commonly addressed in the literature as a calibration, or optimization, problem.
The different calibration methods in OpenMOLE use GAs to explore the parameter space of a simulation model, looking for parameter sets that will produce outputs reaching one or several given objectives. Objectives functions, also called fitness functions, compute quantities from the model outputs that have to be minimized or maximized. They are a quantification of the optimal model output you are looking for.
A common optimization problem is data fitting. In this particular case, the objective function could compute the distance between simulation results and data points, a classical example is the Squared Error function. If you want your model to reproduce several characteristics (sometimes called stylised facts), you need several objectives, each of them quantifying the similarity between your model outputs and the characteristics you want to reproduce.
To calibrate your model, you need to define:
  • the genome of your model, i.e. the parameters to be calibrated. They are the dimensions of the parameter space that will be explored by the GA. The GA will try different genomes, and keep the best one discovered yet.
  • the objectives you want to reach, expressed as variables to be minimized.
  • a termination criterion, to stop the method eventually.

Dummy Model Optimization Example 🔗

This workflow optimizes a dummy model using the generational NSGA-II multi-objective algorithm. You can replace the instances of model by your own model, and adapt the variation range of the input variables. If you are not familiar with parameter tuning using GA, you should first consult the tutorial explaining how to calibrate a NetLogo model with a GA.
  // model inputs
  val x = Val[Double]
  val y = Val[Double]
  val s = Val[String]

  // model outputs
  val o1 = Val[Double]
  val o2 = Val[Double]

  val model = ScalaTask("""
    val o1 = x
    val o2 = s match {
      case "apple" => y
      case "banana" => -y
      case "strawberry" => -2 * y
    }""") set (
      inputs += (x, y, s),
      outputs += (o1, o2)
    )

// Construction of the workflow orchestrating the genetic algorithm
// genome is the inputs prototype and their variation ranges
// objective sets the objectives to minimize
// termination is the termination criterion, here it runs for 100 generations. A time limit could be set as an
// alternative by replacing 100 by 1 hour (hour is a duration type understood by OpenMOLE).
// the parallelism specifies how many evaluation are concurrently submitted to the execution environment
val evolution =
  NSGA2Evolution(
    genome = Seq(
      x in (0.0, 1.0),
      y in (0.0, 1.0),
      s in List("apple", "banana", "strawberry")),
    objective = Seq(o1, o2),
    evaluation = model,
    parallelism = 10,
    termination = 100
  )

// Construction of the complete workflow with the execution environment, and the hook.
// A hook is attached to save the population of solutions to  workDirectory /evolution on the local machine running OpenMOLE
// Here the generated workflow will run using 4 threads of the local machine.
evolution hook (workDirectory / "evolution") on LocalEnvironment(4)
Note that the objectives are given as a sequence of model outputs variables to be minimized. So if you want to reach specific target values, like Pi and 42, you can use the delta keyword:
  // model inputs
  val x = Val[Double]
  val y = Val[Double]
  val s = Val[String]

  // model outputs
  val o1 = Val[Double]
  val o2 = Val[Double]

  val model = ScalaTask("""
    val o1 = x
    val o2 = s match {
      case "apple" => y
      case "banana" => -y
      case "strawberry" => -2 * y
    }""") set (
      inputs += (x, y, s),
      outputs += (o1, o2)
    )


NSGA2Evolution(
  genome = Seq(
    x in (0.0, 1.0),
    y in (0.0, 1.0),
    s in List("apple", "banana", "strawberry")),
  objective = Seq(o1 delta math.Pi, o2 delta 42),
  evaluation = model,
  parallelism = 10,
  termination = 100
) hook (workDirectory / "evolution")
NB: in this case the results in the saved file will be the difference between the outputs of the model and your objectives.
Obviously, maximization problems are performed by taking the opposite of variables as objectives. You may use a - keyword to minimise the opposite of o1 (i.e. maximize o1).
  // model inputs
  val x = Val[Double]
  val y = Val[Double]
  val s = Val[String]

  // model outputs
  val o1 = Val[Double]
  val o2 = Val[Double]

  val model = ScalaTask("""
    val o1 = x
    val o2 = s match {
      case "apple" => y
      case "banana" => -y
      case "strawberry" => -2 * y
    }""") set (
      inputs += (x, y, s),
      outputs += (o1, o2)
    )


val maximize = ScalaTask("o1 = -o1") set ((inputs, outputs) += (o1, o2))


NSGA2Evolution(
  genome = Seq(
    x in (0.0, 1.0),
    y in (0.0, 1.0),
    s in List("apple", "banana", "strawberry")),
  objective = Seq(-o1, o2),
  evaluation = model,
  parallelism = 10,
  termination = 100) hook (workDirectory / "evolution")

Outputs of NSGA2Evolution 🔗

As an output, the method produces a population file for each generation, in the directory provided to the hook, named with the generation number as populationN.csv. Each csv file contains a column with the generation number, the values of parameters, the median value of the objectives at each point, and in the variable evolution$samples, the number of runs of the model used for the evaluation (in the case of stochastic models).

Real world Example 🔗

This tutorial exposes how to use Genetic Algorithms to perform optimization on a NetLogo model.