salga (English)

SALGA stands for System for Automated Learning based on Genetics Algorithms, and is a program developed in Python to simulate the evolution by natural selection of individuals in a population of solutions of a problem. It is oriented both to teaching and project development in this field, having been successfully used in several real industry optimization projects.

First steps

Installation

Run salga app

  • On windows, you must first run salga.pyc with double click.
  • On MacOSX and linux, open a terminal and type “python salga.pyc”. Depending on your Linux distribución perhaps you must install “python-tk”.
  • By default the fitness will be calculated in parallel using all available cores in the computer (not available in windows).
  • Running with the -s option will force a single kernel to be used. This is necessary on MS Windows systems due to a process management problem on this system.
  • If run with the -p option the evolution will be parallelized to use all cores using the concept of ecological niches.

Definition of fitness, alphabet and type of GA

By their own nature, Genetic Algorithms require, in order to be adapted to a particular problem, that a fitness function (or quality function) is defined indicating how good a particular individual is at solving a given problem. This fitness is domain-dependent and must be previously programmed so that the system can be applied to the specific problem.

Additionally, a phenotype function can be defined to obtain the individual to be evaluated from the chromosome that describes it. On the other hand, we must select the type of genetic algorithm to be used (classic, floating or permutation) and indicate the alphabet to be used in the problem.

About the type of Genetic Algorithm

The type of Genetic Algorithm to be used can be:

  • Classic
    • The alphabet is a finite set of symbols.
    • Mutation is performed at the gene level, choosing a new alphabet value.
    • Recombination is done by cutting the parent strings at a random point and gluing them together crosswise.
  • Floating
    • The alphabet is formed with all possible floating values in an interval [a,b] defined in the alphabet field.
    • The mutation takes a random uniform value in the range defined for that gene.
    • Recombination is performed by combining the genes of the parents using the BLX-alpha operator.
  • Permutation
    • The alphabet is a finite set of symbols.
    • A chromosome is formed by a particular permutation of the alphabet, i.e., the length of the chromosome is the length of the alphabet.
    • The mutation consists of one of three operators, ramdomly selected: (1) exchanging two genes, (2) delete and insert a gene, and (3) reverse some sequence of genes.
    • The pairing is constructed by copying a substring from parent A and filling in the rest with the missing values, taking them from parent B and maintaining the partial order, but not the position.
  • Variable
    • The alphabet is a finite set of symbols.
    • The length of the chromosome is variable.
    • Mutation and recombination are similar to classic.

About the alphabet

Depending on the type of Genetic Algorithm chosen, the alphabet is specified differently:

  • In ‘classic’, the alphabet is represented by a list of symbols separated by commas and enclosed in square brackets. Example: [0, 1] ➜ binary alphabet.
  • In ‘floating’, the alphabet is specified by the pair of values defining the interval, enclosed in square brackets. Example: [0, 10] ➜ all floating values in the interval [0, 10].
  • In permutation, the alphabet is represented by a list of symbols separated by commas and enclosed in square brackets, as in the ‘classic’ mode. Example: [1, 2, 3, 4, 5] ➜ permutations of the natural numbers 1 to 5.

About fitness

The fitness definition will be done in a separate python file that will be loaded later from the exit program. Let’s look at a simple example whose goal is to find the number x such that x^2 is 48867090902500:

import math
target = 488670902500

def phenotype (chromosome): # computes the decimal number represented by the chromosome
	res = 0
	l = len(chromosome)
	for v in range(l):
		if chromosome[l-v-1] != 0:
			res += 2**v
	return res

def fitness (chromosome): # aproximate x^2 = target
	fen = phenotype(chromosome)
	return 1.0 / (1.0 + abs(target - fen**2)) # maximum value when fen**2 == target!

In this code, the phenotype function receives a chromosome and returns the decimal number obtained from the genotype, i.e., it transforms, for example, «1 0 0 1» into the decimal value 5.

The fitness function returns a higher value the closer the number represented by a chromosome is to our objective. It has also been constructed so that its maximum possible value is 1.0, which is quite convenient for graphical representation and for automatically stopping the evolutionary process when the fitness of the best individual is 1.

It is possible to force values for other parameters when performing fitness loading, such as the alphabet, mutation and mating probabilities, chromosome length and the method to be used, among others. To force parameters from the code, values must be assigned to the global variable ‘parameters’, which is a dictionary of key-value pairs, being the possible values of the keys:

  • ‘alphabet’: list of alphabet symbols.
  • ‘type’: { ‘classic’, ‘floating’, ‘permutation’, ‘variable’, ‘seed’ }
  • ‘elitism’: boolean.
  • ‘memetic’: boolean.
  • ‘norm’: boolean.
  • ‘normfactor’: float.
  • ‘pmut’: float.
  • ‘pcross’: float.
  • ‘chromsize’: int.
  • ‘popsize’: int.
  • ‘target’: float.
  • ‘seed’: list.

Example:

parameters = { 'alphabet':[0, 100], 'type':'floating', 'elitism':True, 'norm':True, 'chromsize':3, 
'pmut':0.2, 'popsize':100 }

Meaning/Use of the different program options

Controls

  1. ‘Select the ‘type of genetics to be used’ ➜ can be classic, floating or permutation. Each type has its own interpretation of the alphabet and its own pairing and mutation operators.
  2. ‘Load Fitness Function’ ➜ loads the python code that defines the fitness and, if applicable, the phenotype.
  3. ‘Reset’ ➜ re-evaluate the fitness code (load and run the last fitness file).
  4. ‘Init’ ➜ creates the initial population with random values.
  5. ‘Learn’ ➜ start evolution.
  6. ‘Pmut’ ➜ mutation probability of a gene.
  7. ‘Pcross’ ➜ probability of recombine two individual after selected..
  8. ‘Elitism’ ➜ indicates whether elitism should be used. If so, the best individual of each generation is saved for the next generation, preventing its disappearance.
  9. ‘Norm’ ➜ indicates whether to use the rank method for selection, which is interesting to prevent evolution from stagnating if the fitness values of the individuals are very similar.
  10. ‘Memetic’ ➜ indicates use of memetic phase in the ‘floating’ type.
  11. ‘Best Imp.’ ➜ indicates use of memetic phase in the ‘floating’ type.
  12. ‘Steady’ ➜ uses replacement of the worst individuals in a Steady State strategy.
  13. ‘Norm. Factor’ ➜ when using the exponential rank method, indicates how intensely the values scale with respect to the relative quality of the individuals. Values close to 1 indicate a small scaling (the probability of selection of individuals is similar), while lower values make the scaling more intense.
  14. ‘Elitism size’ ➜ number of individuals to be saved in the next generation when elitism is activated.
  15. ‘Memetic steps’ ➜ number of steps to take when using memetics.
  16. ‘Tournament’ ➜ number of individuals chosen for the tournament mode. 0 means no tournament mode.
  17. ‘Chrom. size’ ➜ chromosome length to be used.
  18. ‘Population’ ➜ population size.
  19. ‘Ω’ ➜ alphabet.
  20. ‘Target fit.’ ➜ target fitness to finalize the process. It is recommended to define fitness so that the maximum possible target is 1.
  21. ‘Stalled in’ ➜ indicates that the training should be stopped if in that number of generations the quality of the best individual has not improved. If -1 the automatic stop is deactivated.
  22. ‘Trace period’ ➜ number of generations to update the graph and the information presented.
  23. ‘Load’ ➜ load a population.
  24. ‘Save’ ➜ save current population.
  25. ‘Export’ ➜ exports the current population in a text format that can be imported by other programs.
  26. ‘Help’ ➜ open this help.
  27. ‘About’ ➜ show program credits.
  28. ‘Quit’ ➜ end program.

Information presented

  1. ‘Best fitness’ ➜ fitness of the best individual in population.
  2. ‘Mean fitness’ ➜ mean population fitness.
  3. ‘Diversity ➜ population diversity.
  4. ‘Generation’ ➜ generations in the ongoing evolution.
  5. ‘Bst’ ➜ genotype of the best individual in the current generation, i.e., best solution to the problem so far.
  6. ‘En la parte inferior’  phenotype of the best individual in the current generation.

Procedure for generating a solution

  1. Select the Genetic Algorithm to use.
  2. Los fitness function by ‘Load fitness’.
  3. Define alphabet.
  4. Set parameters (Pmut, Pemp, etc.) or use default values.
  5. Click ‘Learn’.
  6. Training can be stopped manually by clicking on the ‘Stop’ option or automatically when the best individual in the population has a fitness equal to or greater than ‘Target fit’ or when the best individual has not improved as many generations as indicated by ‘Stalled in’.

Interpretation of training values

Line colors

  1. White: mean population fitness.
  2. Green: best individual fitness.
  3. Red: population diversity.
  4. Gray: subdivisions of the error in steps of 10% of the «target fitness», i.e. the point at which the evolution will stop.