AALTO UNIVERSITY
School of Science and Technology
Faculty of Information and Natural Sciences
Department of Mathematics and Systems Analysis

Mat-2.4108 Independent research projects in applied mathematics

Comparison of Multiobjective Simulated Annealing and Multiobjective Evolutionary Algorithms in Presence of Noise

Ossi Koivistoinen
May 24, 2010

Contents

1 Introduction
  1.1 Multiobjective optimization
  1.2 Simulated annealing
  1.3 This study
2 Multiobjective optimization algorithms for noisy objective functions
  2.1 MOEA-RF
    2.1.1 Basic MOEA
    2.1.2 Optimization
    2.1.3 Noise handling
  2.2 PMOSA
    2.2.1 Evaluating candidate solutions
    2.2.2 Maintaining archive
    2.2.3 Generating new candidate solutions
    2.2.4 Parameters
3 Test setup
  3.1 Noise
  3.2 Benchmark problems
  3.3 Performance metrics
  3.4 Experiments
4 Results
  4.1 Parameters for PMOSA
  4.2 Speed of convergence
  4.3 Quality of solutions
5 Summary
Appendices
A Results of all experiments
B Implementation of the benchmark problems
  B.1 FON.m
  B.2 KUR.m
  B.3 DTLZ2B.m
C Implementation of MOEA-RF algorithm
  C.1 moearf.m
  C.2 crossover.m
  C.3 mutate.m
  C.4 tournament_selection.m
  C.5 diversity_operator.m
  C.6 find_pareto_front_possibilistic.m
  C.7 random_bit_matrix.m
D Implementation of PMOSA algorithm
  D.1 sa2ba.m
  D.2 gen.m
  D.3 pdom.m
  D.4 laprnd.m
E Implementation of metrics
  E.1 stats.m
1 Introduction

Multiobjective optimization in the presence of noise is a topic of interest in real-world optimization problems and simulations, and multiobjective evolutionary algorithms (MOEAs) are an established approach to solving multiobjective optimization problems [1]. Multiobjective simulated annealing (MOSA) algorithms [2, 3] are a less established family of algorithms for multiobjective optimization that show promising results compared to MOEAs [4, 5]. In this study the performance of a new probabilistic MOSA (PMOSA) algorithm [6] is compared to a state-of-the-art MOEA algorithm [4, 7] in a noisy environment.

1.1 Multiobjective optimization

In multiobjective optimization, instead of a single objective function, the target is to minimize m objective functions f_1(x), ..., f_m(x) simultaneously for the same set of decision variables x. A solution x is dominated if there is some other solution x̂ such that f_i(x̂) ≤ f_i(x) for all i ∈ {1, ..., m} and f_i(x̂) < f_i(x) for at least one i. In this case x̂ dominates x, denoted x̂ ≺ x. The optimal solution to the optimization problem consists of the set of non-dominated solutions, which is known as the Pareto set [1].

1.2 Simulated annealing

Simulated annealing (SA) is a compact and robust metaheuristic inspired by the physical concept of annealing in metallurgy. An SA algorithm operates on a solution candidate that is repeatedly modified in order to find the optimal solution. At first the variations in the solution candidate are large, but the movement gradually decreases during the process. This is analogous to the movement of an atom in a cooling metal alloy. The magnitude of the changes in the solution candidate is controlled by the temperature T [3]. With a correct annealing schedule the simulated annealing algorithm has the ability to escape local optima and to converge to the global optimum [8]. Recent successful applications of simulated annealing include, for example, clustering gene expression data in bioinformatics [9], performing atmospheric correction on hyperspectral images [10], and optimization tasks in time-critical online applications [11].

1.3 This study

Simulated annealing can match the performance of evolutionary algorithms in deterministic multiobjective optimization [5, 6]. In this study a version of the established multiobjective evolutionary algorithm MOEA-RF [7] is implemented and its performance is compared to a proposed probabilistic multiobjective simulated annealing (PMOSA) algorithm. In previous studies MOEA-RF has been identified as superior to other MOEAs, and it can efficiently solve multiobjective optimization problems [7]. Therefore it is a good candidate for comparison. The comparison involves three benchmark problems with normally distributed noise. Four performance metrics are utilized to compare the algorithms.

2 Multiobjective optimization algorithms for noisy objective functions

2.1 MOEA-RF

MOEA-RF is a basic multiobjective evolutionary algorithm (MOEA) augmented with improvements and optimizations that improve its performance in a noisy environment. In this study a slightly simplified MOEA-RF algorithm is implemented in MATLAB based on the description in the article by C. K. Goh and K. C. Tan [7]. The MOEA-RF algorithm is presented in table 1, and the parameters for the algorithm are presented in table 2. In some instances this algorithm is referred to simply as "EA".

2.1.1 Basic MOEA

The basic MOEA consists of steps normally found in a genetic algorithm: initialization, evaluation, selection, crossover, and mutation.
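As an illustration of this cycle, the following self-contained fragment runs the generic steps on a toy two-objective problem. It is a minimal sketch only: the toy objective, the parameter values, and the simplified operators (real-valued perturbation instead of bitwise mutation) are invented for illustration, and this is not the MOEA-RF implementation of appendix C.

% Generic MOEA cycle on a toy problem (illustrative sketch, not MOEA-RF).
obj = @(X) [sum(X.^2,2), sum((X-1).^2,2)];   % toy two-objective function
n = 2; pop = 20; gens = 50; pm = 0.1;        % invented parameter values
P = unifrnd(-2, 2, pop, n);                  % 1. initialization
for g = 1:gens
    F = obj(P);                              % 2. evaluation
    idx = randperm(pop);                     % 3. selection: random pairs,
    for k = 1:2:pop-1                        %    dominating member survives
        a = idx(k); b = idx(k+1);
        if all(F(b,:) <= F(a,:)) && any(F(b,:) < F(a,:))
            P(a,:) = P(b,:);
        end
    end
    c = randperm(pop);                       % 4. crossover: uniform swap of
    for k = 1:2:pop-1                        %    components between pairs
        m = rand(1,n) < 0.5;
        tmp = P(c(k),m); P(c(k),m) = P(c(k+1),m); P(c(k+1),m) = tmp;
    end
    M = rand(pop,n) < pm;                    % 5. mutation (simplified here to
    P(M) = P(M) + 0.1*randn(nnz(M),1);       %    a Gaussian perturbation)
end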
A genetic algorithm maintains an evolving population of solution candidates [12]. The difference between a MOEA and a single-objective evolutionary algorithm is that there is more than one objective function, and the mutual performance of solution candidates is described by their Pareto ranking. The Pareto ranking of a solution candidate s* ∈ S is defined as the number of solutions s ∈ S \ {s*} that are dominated by it.

The population consists of solution candidates that have a genotype, which is their internal representation for the evolutionary algorithm. In this implementation the genotype of a solution candidate consists of n non-negative integers g = [g_1, ..., g_n], each represented as a binary integer of B bits:

    g_i = g_B g_{B-1} \cdots g_1 g_0, \quad g_b \in \{0, 1\}    (1)

where

    g_i = \sum_{b=0}^{B} g_b 2^b.    (2)

The phenotype of a solution candidate is a vector of n decision variables in the range specific to the optimization problem:

    x = [x_1, \ldots, x_n], \quad x_i \in [x_i^{\min}, x_i^{\max}], \; \forall i \in \{1, \ldots, n\}.    (3)

A linear translation is used to convert a genotype to the corresponding phenotype and vice versa:

    x_i(g_i) = x_i^{\min} + \frac{g_i}{2^{B+1} - 1}\,(x_i^{\max} - x_i^{\min})    (4)

    g_i(x_i) = (2^{B+1} - 1)\,\frac{x_i - x_i^{\min}}{x_i^{\max} - x_i^{\min}}    (5)

The genotypes of the population are presented as a population matrix

    P_g = \begin{pmatrix} g_{1,1} & \cdots & g_{1,n} \\ \vdots & & \vdots \\ g_{S,1} & \cdots & g_{S,n} \end{pmatrix}    (6)

where S is the population size.

Initialization is performed by selecting the population P from a uniform probability distribution. An empty archive A is created.

Evaluation is done by applying the objective functions of the benchmark problems, f_obj : R^n → R^m, to the phenotypes of the decision variables to produce the objective function values.

Selection is done using tournament selection [12]. The population is distributed into random pairs such that each solution candidate participates in exactly two tournaments. The solution candidate that has the better objective function values in the tournament pair is preserved in the population.

The uniform bitwise crossover operator takes the binary representations (eq. 1) of two integers,

    x = x_B \ldots x_0    (7)
    y = y_B \ldots y_0.    (8)

For each bit b ∈ [0, B] a continuous random number p_b ∈ [0, 1] is chosen:

    p_b < 0.5:   \hat{x}_b = x_b  and  \hat{y}_b = y_b
    p_b \ge 0.5: \hat{x}_b = y_b  and  \hat{y}_b = x_b.    (9)

This gives the two integers after the crossover operation:

    \hat{x} = \hat{x}_B \ldots \hat{x}_0    (10)
    \hat{y} = \hat{y}_B \ldots \hat{y}_0.    (11)

To perform the crossover, random pairs are selected from the population so that each solution candidate participates in one crossover. The crossover is performed on a pair with crossover probability p_c.

Mutation is used to introduce new genes into the population. The basic mutation operation independently flips each bit of each genotype element of each solution candidate with mutation probability p_m.

2.1.2 Optimization

Two improvements that are used in contemporary MOEAs, elitism and a diversity operator, are first introduced to the basic MOEA. Elitism is achieved by maintaining an archive A of the best found solutions and combining members of the archive with the population on every generation. The archive has a maximum size, and when it is reached the most crowded archive members are removed using a recurrent truncation process based on niche count. The niche radius proposed in [7] (1/size_population in the normalized objective space) was found to be the best alternative. The purpose of elitism is to keep the population from escaping the Pareto front too much, and the purpose of the recurrent truncation process is to maintain maximum diversity in the archive.
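The recurrent truncation can be sketched briefly; the fragment below is a simplified restatement of the diversity operator in appendix C.5 (the example data are invented, and ties between equally crowded members are broken by the first index here, whereas the appendix version breaks them randomly).

% Recurrent truncation by niche count (simplified from appendix C.5).
PF = rand(150, 2);        % example archive in a normalized 2-D objective space
max_count = 100;          % archive capacity
niche_radius = 1/100;     % radius proposed in [7]: 1/population_size
while size(PF,1) > max_count
    D = squareform(pdist(PF));               % pairwise Euclidean distances
    D(logical(eye(size(D)))) = inf;          % a point is not its own neighbor
    niche_count = sum(D < niche_radius, 1);  % neighbors within the radius
    [worst_count, worst] = max(niche_count); % one of the most crowded members
    PF(worst,:) = [];                        % drop it and recompute the counts
end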
2.1.3 Noise handling

In addition to the basic improvements, three noise handling techniques are included in the algorithm. These are Experiential Learning Directed Perturbation (ELDP), the Gene Adaptation Selection Strategy (GASS), and the Necessity (N-) archiving model. They are well explained by Goh and Tan in [7]. In this study the same parameters are used for the noise handling techniques as were used by Goh and Tan. The Necessity-possibilistic (NP-) archiving model, which was also described in the article, was not implemented in this study.

Experiential Learning Directed Perturbation (ELDP) monitors the variations in the components of the decision variables in subsequent generations. If the variation has remained below a certain threshold level in the last two generations, the same change in the decision variable phenotype space is applied to the element instead of a random mutation. This should accelerate the convergence of the algorithm.

The Gene Adaptation Selection Strategy (GASS) is an alternative mutation algorithm that manipulates the population in the phenotype space in order to make the population show desired divergence or convergence characteristics. The minimum s_i^min, maximum s_i^max, and mean s_i^mean values for each dimension i ∈ {1, ..., n} of the phenotype space are calculated for the solution candidates in the archive, and intervals are constructed from these for the convergence model,

    A_i = s_i^{\min} - w\, s_i^{\text{mean}}    (12)
    B_i = s_i^{\max} + w\, s_i^{\text{mean}}    (13)

and for the divergence model,

    a_i = s_i^{\text{mean}} - w\, s_i^{\text{mean}}    (14)
    b_i = s_i^{\text{mean}} + w\, s_i^{\text{mean}}.    (15)

After the mutation and crossover operations, the GASS divergence or convergence model is applied to the population (a concrete sketch follows table 1). Each phenotype component of each solution candidate is re-selected with probability 1/n from the uniform probability distribution U(A_i, B_i) for the convergence model and from U(a_i, b_i) for the divergence model. The convergence model is triggered after 150 generations if 60% of the archive capacity is reached, and the divergence model if less than 60% of the archive is filled.

The Necessity (N-) archiving model adds a fuzzy threshold to the determination of the members of the Pareto set to compensate for the noise in the objective function values. Instead of Pareto dominance, N-dominance is used to identify the dominated solutions. The definition of N-dominance is given in [7]. In essence, a fuzzy margin depending on the noise level is added between the found Pareto front and the objective function values, and this margin must be exceeded for a new solution to N-dominate the old solutions.

Table 1 — MOEA-RF algorithm used in this study.

Step 1. Initialize: Select population P from a uniform probability distribution and create an empty archive A.
Step 2. Evaluate: Calculate the objective function values f_i(P).
Step 3. Selection: Perform tournament selection on the population.
Step 4. Use archive: Replace n random members of P by random members of A, if there are enough members in A.
Step 5. Crossover: Apply the uniform crossover operator on the population, P_new = crossover(P_old).
Step 6. Update archive: 1. Combine the current archive with the population, A_tmp = A_old ∪ P. 2. Apply the N-archiving model to A_tmp to find the possibly non-dominated solutions.
Step 7. Diversity operator: Limit the size of the archive to A_max by applying the recursive truncation operator on it.
Step 8. Mutate: 1. Apply the ELDP mutation algorithm to P. 2. Apply the GASS algorithm on P.
Step 9. Repeat: Repeat from step 2 until g = g_max.
Step 10. End: Return archive A.
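To make the GASS interval construction concrete, the fragment below is a simplified excerpt consistent with the GASS block in appendix C.1; the archive and population here are invented example data.

% GASS sampling intervals (cf. appendix C.1), w = 0.1 as in [7].
w = 0.1;
pf_solutions = rand(40, 5);          % example archive in decision space
P = rand(100, 5);                    % example population
lowbd  = min(pf_solutions);          % s_i_min for each dimension
uppbd  = max(pf_solutions);          % s_i_max for each dimension
meanbd = mean(pf_solutions);         % s_i_mean for each dimension
A = lowbd  - w * abs(meanbd);        % convergence model, eq. (12)
B = uppbd  + w * abs(meanbd);        % convergence model, eq. (13)
a = meanbd - w * abs(meanbd);        % divergence model, eq. (14)
b = meanbd + w * abs(meanbd);        % divergence model, eq. (15)
% Each phenotype component is re-drawn with probability 1/n; here the
% convergence model U(A_i, B_i) is applied:
mask = rand(size(P)) < 1/size(P,2);
Alow = repmat(A, size(P,1), 1); Bup = repmat(B, size(P,1), 1);
P(mask) = unifrnd(Alow(mask), Bup(mask));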
Table 2 — Parameters for the MOEA-RF algorithm.

Number of samples: L = 1
Number of generations: g_max = 500
Population size: S = 100
Archive size: A = 100
Mutation: bitwise mutation with probability p_m = 1/15, plus ELDP, plus GASS
Selection: tournament selection plus a random swap of 10 solutions from the archive to the population
Crossover: uniform crossover with probability p_c = 0.5

2.2 PMOSA

The probabilistic multiobjective simulated annealing (PMOSA) algorithm evaluated in this study is described and implemented by Mattila et al. in [6]. It is based on simulated annealing [2] and the archiving MOSA (AMOSA) algorithm [4]. Unlike previous MOSAs, this algorithm uses probabilistic ranking to select the non-dominated solutions. Stochastic ranking is also used by MOEA-RF in the form of the N- and NP-archiving models, but unlike the computationally light MOEA-RF, PMOSA uses the probability of dominance for ranking. The PMOSA algorithm is presented in table 3, and the parameters for the algorithm are presented in table 4. In some instances this algorithm is referred to simply as "SA".

2.2.1 Evaluating candidate solutions

The PMOSA algorithm processes one solution candidate at a time, called the current solution x̃. On each iteration round a new solution x́ is generated in the neighborhood of x̃ (see section 2.2.3). The objective functions f_i are then evaluated r times for both solutions. A single scalar performance measure F(x) is constructed with respect to the current archive of accepted solutions S, based on the probability of dominance P(x ≺ y), where x and y are two solutions.

The estimates for the mean f̄_i(x) and variance s_i²(x) are obtained from the r evaluations of f_i(x). It is assumed that the noise in f_i(x) is normally distributed; thus the probability P(f_i(x) < f_i(y)) is given by evaluating the cumulative distribution function of the normal distribution N(f̄_i(x) − f̄_i(y), s_i²(x) + s_i²(y)) at the origin. The probability of dominance is then given by

    P(x \prec y) = \prod_{i=1}^{m} P(f_i(x) < f_i(y)).    (16)

Using eq. 16, the performance measures for the current and the new solution are

    F_{\tilde{x}} = \sum_{x \in (S \setminus \{\tilde{x}\}) \cup \{\acute{x}\}} P(x \prec \tilde{x})    (17)

    F_{\acute{x}} = \sum_{x \in S \cup \{\tilde{x}\}} P(x \prec \acute{x}).    (18)

The probability of accepting x́ as the current solution depends on the temperature T and is calculated as

    p = \exp(-(F_{\acute{x}} - F_{\tilde{x}})/T).    (19)

On each iteration, if F_x́ < F_x̃, the new solution is accepted as the current solution, x̃ = x́. Otherwise the new solution is accepted with probability p.

2.2.2 Maintaining archive

A fixed-size archive S of the solutions with the lowest probability of being dominated is maintained during the iteration, and it is returned as the result in the end. In the beginning of the algorithm the archive is initialized to contain the first randomly generated solution candidate. As long as the maximum size N is not reached, each newly accepted current solution is added to the archive. The performance measure for a solution x́ with respect to the solutions in the archive is

    F_{\acute{x}}^{S} = \sum_{x \in S} P(x \prec \acute{x}).    (20)

After the maximum archive size is reached, the archive performance measures are calculated on each iteration round for the set S ∪ {x̃}, and the element with the highest value of eq. 20 is dropped from the archive. This operation is optimized by storing the mutual probabilities of dominance between the members of the archive. Updating the archive is the computationally most expensive part of the algorithm, and it leads to noticeable slowing of the algorithm when the archive size is large. This is justified by PMOSA's background in multiobjective simulation optimization (MOSO), in which the simulation run is computationally much more expensive than the optimization algorithm itself.
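The evaluation and acceptance logic of section 2.2.1 can be condensed into the following sketch (pdom as in appendix D.3; all numeric values are invented example estimates, and the current solution is assumed not to be a member of the archive):

% PMOSA acceptance test sketch (cf. section 2.2.1 and appendix D.1).
pdom = @(x1,s1,x2,s2) prod(normcdf(0, x1 - x2, sqrt(s1 + s2)));
T = 4;                                   % constant temperature (table 4)
Sf = rand(5,2); Sv = 0.01*ones(5,2);     % example archive means and variances
f_cur = [0.5 0.5]; v_cur = [0.01 0.01];  % current solution estimates
f_new = [0.4 0.6]; v_new = [0.01 0.01];  % new candidate estimates
F_cur = pdom(f_new, v_new, f_cur, v_cur);  % new solution's term in eq. (17)
F_new = pdom(f_cur, v_cur, f_new, v_new);  % current solution's term in eq. (18)
for i = 1:size(Sf,1)                       % archive terms of eqs. (17)-(18)
    F_cur = F_cur + pdom(Sf(i,:), Sv(i,:), f_cur, v_cur);
    F_new = F_new + pdom(Sf(i,:), Sv(i,:), f_new, v_new);
end
if F_new <= F_cur || rand <= exp(-(F_new - F_cur)/T)   % eq. (19)
    f_cur = f_new; v_cur = v_new;          % accept the candidate
end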
2.2.3 Generating new candidate solutions

In a MOSA algorithm one solution candidate is processed at a time, and it is perturbed in the decision variable space before every evaluation step:

    x_{\text{new}} = x + \Delta x.    (21)

The selection of the procedure used to generate the new candidate solutions can have a significant effect on the performance of the algorithm [6]. Three different perturbation methods were tested. In the first method, all components of the decision variable are modified by drawing the perturbations from a uniform probability distribution:

    \Delta x_i \sim \text{Uniform}(-\delta, \delta), \quad \forall i.    (22)

The second method also draws the perturbation from a uniform probability distribution, but only one randomly selected decision variable î ∈ {1, ..., n} is modified at a time:

    \Delta x_i \sim \text{Uniform}(-\delta, \delta), \; i = \hat{i}; \qquad \Delta x_i = 0, \; i \neq \hat{i}.    (23)

The third method is otherwise identical to the second, but the perturbation is drawn from the Laplace distribution:

    \Delta x_{\hat{i}} \sim \text{Laplace}(\delta).    (24)

This creates perturbations with mean 0 and variance 2δ².

2.2.4 Parameters

The parameters used in the PMOSA algorithm can have a significant effect on its performance. Therefore a set of experiments to find good parameters for PMOSA was conducted before running the actual experiment. The main parameters to select are the temperature T and the method for generating the neighboring solution of the current solution x̃. Decreasing temperatures are commonly used with SA algorithms, but according to [6] a decreasing temperature has not been shown to improve the PMOSA algorithm much; therefore the experiments were limited to constant temperatures.

Table 3 — PMOSA algorithm used in this study.

Step 1. Initialize: Generate a random initial current solution candidate.
Step 2. Update current solution: Generate a new solution candidate from the neighborhood of the current solution using eq. 24.
Step 3. Evaluate: Evaluate the objective functions on the new solution candidate L times and determine the performance measures of the new and current solution candidates.
Step 4. Selection: Select either the current solution candidate or the new solution candidate as the current solution as described in section 2.2.1. If the current solution was not updated, continue from step 2.
Step 5. Update archive: If the archive is not full, add the current solution to the archive. Otherwise compute the pairwise probabilities of dominance for the current solution candidate and each solution in the archive. Select the member of the archive that has the largest sum of probabilities of being dominated by the other solutions. If that sum is larger than the corresponding sum of the current solution, replace the archive member with the current solution (see section 2.2.2).
Step 6. Repeat: Repeat from step 2 until K = K_max.
Step 7. End: Return the archive S.

Table 4 — Parameters for the PMOSA algorithm.

Number of samples: L = 10
Number of iterations: K = 5000
Archive size: N = 100
Perturbation algorithm: eq. 24 with δ = (x_i^max − x_i^min)/10
Temperature: T = 4

3 Test setup

3.1 Noise

To evaluate the algorithms in the presence of noise, normally distributed noise with zero mean is added to the objective function values after the noiseless evaluation. The standard deviation of the noise is given as a percentage of |f_i^max|, where |f_i^max| is the maximum of the ith objective function value on the true Pareto front. The algorithms see only the objective function values with noise. The noiseless values are used only to calculate the performance metrics that indicate the efficiency of the algorithms.
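For illustration, noise injection on a FON run could look like the hypothetical snippet below, consistent with the evaluation step in appendix C.1. For FON both objectives lie in [0, 1), so |f_i^max| is taken as 1 and a 5% noise level gives a standard deviation of 0.05.

% Hypothetical 5% noise injection on FON (cf. the evaluation step in moearf.m).
noise_level = 0.05;
X = unifrnd(-2, 2, 100, 8);        % 100 random FON decision vectors
f_clean = FON(X);                  % noiseless objective values (appendix B.1)
f_noisy = f_clean + normrnd(0, noise_level, size(f_clean));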
3.2 Benchmark problems

The performance of multiobjective optimization algorithms is measured by applying them to different benchmark problems. In the literature there is a wide array of different problems with varying properties. In this study three benchmark problems are used. FON and KUR are commonly used problems in the field of multiobjective optimization, and DTLZ2B is a problem used by the authors of the PMOSA algorithm examined in this study.

DTLZ2B is a version of the DTLZ2 problem of the DTLZ family, whose Pareto front is the unit sphere. DTLZ2B is a version with a two-dimensional objective space. The non-convex Pareto front is the arc of the unit circle in the region f_1, f_2 ∈ [0, 1]. The main points of interest are the even distribution of the found solutions along the Pareto front and the algorithm's ability to find the extremum points (f_1, f_2) = (0, 1) and (f_1, f_2) = (1, 0) while avoiding false Pareto points along the axes.

DTLZ2B: Minimize (f_1, f_2)    (25)

    g(x_2, \ldots, x_5) = \sum_{i=2}^{5} \left(x_i - \tfrac{1}{2}\right)^2    (26)

    f_1(x_1, g) = \cos\!\left(\tfrac{\pi}{2} x_1\right)(1 + g)    (27)

    f_2(x_1, g) = \sin\!\left(\tfrac{\pi}{2} x_1\right)(1 + g)    (28)

    0 \le x_i < 1, \; \forall i = 1, \ldots, 5    (29)

FON is a problem by C. M. Fonseca and P. J. Fleming from 1995. FON is characterized by a non-convex Pareto front and non-linear objective functions whose values concentrate around the worst possible objective function value, (f_1, f_2) = (1, 1). Small changes in any of the components of the eight-dimensional decision variable easily lead the solution far from the Pareto front, which makes it challenging for MOEAs to maintain a stable evolving population. FON is generally difficult to solve, especially in noisy environments. FON is a versatile test problem whose Pareto front shape can be continuously modified by exponentiating the objective functions [1].

FON: Minimize (f_1, f_2)    (30)

    f_1(x_1, \ldots, x_8) = 1 - \exp\!\left[-\sum_{i=1}^{8} \left(x_i - 1/\sqrt{8}\right)^2\right]    (31)

    f_2(x_1, \ldots, x_8) = 1 - \exp\!\left[-\sum_{i=1}^{8} \left(x_i + 1/\sqrt{8}\right)^2\right]    (32)

    -2 \le x_i < 2, \; \forall i = 1, \ldots, 8    (33)

KUR is a problem used e.g. in [7]. It has a non-convex Pareto front that is disconnected both in the objective space and in the decision variable space.

KUR: Minimize (f_1, f_2)    (34)

    f_1(x_1, x_2, x_3) = \sum_{i=1}^{2} \left[-10 \exp\!\left(-0.2 \sqrt{x_i^2 + x_{i+1}^2}\right)\right]    (35)

    f_2(x_1, x_2, x_3) = \sum_{i=1}^{3} \left[|x_i|^{0.8} + 5 \sin(x_i^3)\right]    (36)

    -5 \le x_i < 5, \; \forall i = 1, 2, 3    (37)

In this study the KUR problem is normalized by scaling the objective functions so that the Pareto front fits approximately into the area of the unit square. This way the same parameters for the optimization algorithms can be used as with the other problems:

    \hat{f}_1 = f_1 / 10    (38)

    \hat{f}_2 = f_2 / 10    (39)

3.3 Performance metrics

The performance of the multiobjective optimization algorithms is measured by four different statistics in this study. The same metrics that were used by Goh and Tan are used; they are presented in [7]. All statistics are calculated in the objective space, where the true Pareto front of the test function, PF_true, is compared to the Pareto front found by the optimization algorithm, PF_known. The true Pareto fronts of the benchmark problems were solved using MOEA-RF without noise, with a large population, and with a large number of generations.

Generational distance (GD) is a proximity indicator. It is the average Euclidean distance between the members of PF_known and PF_true.

Maximum spread (MS) is a diversity indicator. It describes how far the extremum members of PF_known are from each other.

Spacing (S) is a distribution indicator. It describes how evenly the members of PF_known are distributed in the objective space in the sense of Euclidean distance.

Hypervolume ratio (HVR) is a general quality indicator. It describes how well the members of PF_known cover PF_true. Hypervolume is a common performance metric in comparative studies of MOEAs [13], and its improvement is also used as an optimization heuristic in some MOEAs.
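Written out, with d_j the Euclidean distance from the j-th of the n members of PF_known to its nearest member of PF_true (for GD), or to its nearest neighbor within PF_known (for S), the implementation in appendix E computes

\[
\mathrm{GD} = \sqrt{\frac{1}{n} \sum_{j=1}^{n} d_j^2},
\qquad
\mathrm{S} = \frac{1}{n} \sqrt{\frac{1}{n} \sum_{j=1}^{n} \left(d_j - \bar{d}\right)^2},
\]

where \bar{d} is the mean nearest-neighbor distance.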
3.4 Experiments

50000 evaluations of the objective functions were used in all experiments. With the evolutionary algorithm this translates to 500 generations with a population size of 100 and one sample of the objective function value per decision variable vector. With the simulated annealing algorithm each evaluation is repeated 10 times; thus 5000 candidate solutions are examined. This is because PMOSA has to evaluate each decision variable vector more than once to calculate the sample variance of the objective function value. MOEA-RF, on the other hand, considers repetition with the same decision variable unnecessary and uses the available number of evaluations to process more generations [7].

The three benchmark problems (DTLZ2B, FON, and KUR) are solved by the MOEA-RF and PMOSA algorithms with noise levels of 0.1%, 1%, 2%, 5%, 10%, and 20%. The level 0.1% was used instead of 0% because the PMOSA algorithm requires some variance in the data. Each combination is repeated 30 times. The algorithms are implemented and the tests are conducted with MATLAB. The multiobjective optimization functions are presented in appendices B–D. The functions were instrumented to collect the performance metrics, whose implementation is presented in appendix E.

4 Results

4.1 Parameters for PMOSA

To find a good value for the temperature, several different constant temperatures were tested. The experiments were conducted on the FON benchmark problem, which was found to be the most challenging of the problems for both MOEA-RF and PMOSA. The results are presented in figure 1, and based on them the temperature T = 4 was chosen for the actual experiment. The selection of the temperature was a compromise between the best performance at low levels of noise and acceptable performance at high levels of noise. Lower temperatures did provide better convergence with low levels of noise, but higher temperatures were able to find a decent solution with higher probability when the noise level was high. This suggests that a decreasing temperature schedule could provide the best overall performance in solving FON, but this was not tested in this study.

Different methods for generating new solutions with different values of δ were also tested. The results are presented in figure 2. Some of the tested combinations were worse than others, but there were many good combinations and the selection of the best alternative was not clear. The perturbation method that draws the perturbation for one decision variable at a time from the Laplace distribution with δ = (x_i^max − x_i^min)/10 was selected because it produced the highest mean hypervolume ratio, but the difference was not statistically significant compared to the other methods.
The third parameter that could be modified is the number of samples used to evaluate the objective function. In real-world simulation problems the evaluation of the objective function is the computationally most expensive part of the optimization. Therefore the number of evaluations is the limiting factor in the running time of the algorithm. We assume that the samples are IID and the noise is normally distributed, so when the number of samples grows large, the sample mean approaches the noiseless objective function value and the variance of the sample mean approaches zero. This means that with enough samples it is possible to solve a problem with any level of noise. This leads to a trade-off between how many times the objective function is sampled and how many different solution candidates can be processed during the algorithm. The same consideration is valid for MOEA-RF, and the number of samples, L = 10 in PMOSA vs. L = 1 in MOEA-RF, is actually one of the main differences between the algorithms. It can be observed from the examples in appendix A that the objective function values are much closer to PF_true in the PMOSA executions. The ratio between the magnitudes of the visible noise is given by the standard error of the mean, and it is √10. This is probably the main reason why PMOSA is able to find better solutions than MOEA-RF to problems with high levels of noise. Experiments altering the sample count in both MOEA-RF and PMOSA would be interesting to conduct, but they were not performed in this study.

4.2 Speed of convergence

50000 evaluations were enough for the algorithms to demonstrate their performance with the given parameters and benchmark problems. The fastest convergence in the metrics was seen with the DTLZ2B problem and the slowest with FON, for both MOEA-RF and PMOSA (figures 13–18 in appendix A). MOEA-RF was able to find approximate solutions to the problems in fewer evaluations than PMOSA, but the gap in the metrics narrows with more evaluations. It should also be noted that, unlike MOEA-RF, the PMOSA algorithm always returns the requested number of solutions, which are more or less dominated by each other. Therefore the number of found solutions is not as valid a metric in comparisons between MOEA-RF and PMOSA as it is in comparing different MOEAs.

From the figures it can be concluded that with low levels of noise MOEA-RF converges to the correct solution slightly faster than PMOSA. In FON with high levels of noise the metrics of PMOSA still seem to be improving at 50000 evaluations, so increasing the number of evaluations would probably result in better solutions. To put it another way, if the number of evaluations is limited, MOEA-RF is able to produce a result of similar quality faster.

[Figure omitted: mean HVR vs. constant temperature (0.01–1000), one curve per noise level 0.1%–20%.]
Figure 1 — HVR metric of the solutions found by PMOSA to FON with different temperatures and different levels of noise. Unlike the other experiments, here the archive size is N = 50.

[Figure omitted: mean and standard deviation of HVR for each perturbation method, at noise levels 0.1%, 2%, and 10%.]
Figure 2 — HVR metric of the solutions found by PMOSA to FON with different algorithms for generating the new solution from the current solution. "All" = all components of the decision variable were modified at the same time; "One" = one component of the decision variable was modified at a time; "Uniform" = the perturbations were drawn from the uniform distribution; "Laplace" = the perturbations were drawn from the Laplace distribution; the fraction is the size of δ in eq. 23 relative to x_i^max − x_i^min.
[Figure omitted: three panels, mean HVR, mean GD, and mean Pareto front size vs. noise level 0.1%–20%.]
Figure 3 — Mean values of the hypervolume ratio (HVR), the generational distance (GD), and the number of solutions found by the MOEA-RF (EA) and PMOSA (SA) optimization algorithms for the test problems FON, DTLZ2B, and KUR with different levels of noise, with 30 repetitions for each combination.

4.3 Quality of solutions

The performance metrics show some differences in the response to noise between MOEA-RF and PMOSA (figure 3). The hypervolume ratio (figure 4) examined together with the spacing metric (figure 5) proved to be a good way to describe the overall quality of the final solution: HVR describes how accurately the Pareto front was found, and S describes how evenly the solutions are distributed along it. As noted also by Goh and Tan, the results of MOEA-RF were slightly better with low levels of noise than in the test setup with no noise [7]. In the PMOSA algorithm this phenomenon was even stronger. With noise levels below 2% both algorithms were able to find the Pareto fronts of the benchmark problems equally well, with a small exception in the PMOSA and FON combination. The spacing of the solutions along the Pareto front was consistently better with PMOSA than with MOEA-RF. Especially in DTLZ2B the solutions found by MOEA-RF were concentrated around the middle of the Pareto front, as can be seen from figures 8 and 11 in appendix A.

KUR and DTLZ2B were both easier to solve than FON. With higher levels of noise both algorithms usually found some solutions but failed to find the whole Pareto front, as demonstrated by the examples of the PMOSA algorithm solving KUR with different levels of noise in figure 12. On the KUR problem the PMOSA algorithm outperformed MOEA-RF at all noise levels. Especially with 10% and 20% noise PMOSA was still able to find solutions to KUR and DTLZ2B with a hypervolume ratio near 1, while MOEA-RF was not. It can be concluded that with this selection of parameters PMOSA can solve problems with higher levels of noise than MOEA-RF.

FON proved to be the most difficult of the benchmark problems for both algorithms to solve. As can be seen in figure 10, the PMOSA algorithm sometimes had difficulties escaping the initial solution altogether. However, on average the results obtained with PMOSA on FON were better than the results obtained with MOEA-RF at high levels of noise. With low levels of noise MOEA-RF was more accurate. With a lower temperature PMOSA would have shown performance more similar to MOEA-RF, but a slightly higher temperature was chosen to show PMOSA's ability to handle high levels of noise.

Overall, PMOSA produced solutions of better quality than MOEA-RF at all noise levels in the KUR and DTLZ2B benchmark problems and at high noise levels in FON. MOEA-RF was better on the FON problem in less noisy environments.
[Figure omitted: six box-plot panels, one per noise level 0.1%–20%.]
Figure 4 — Hypervolume ratio metric of the solutions found by the MOEA-RF (EA) and PMOSA (SA) optimization algorithms for the test problems FON, DTLZ2B, and KUR with different levels of noise, with 30 repetitions for each combination. In this box plot the edges of the box and the line in the middle describe the upper and lower quartiles and the median of the HVR values. The lines extending from the box mark the rest of the data range, except for distant outliers, which are drawn as crosses.

[Figure omitted: six box-plot panels, one per noise level 0.1%–20%.]
Figure 5 — Spacing metric of the solutions found by the MOEA-RF (EA) and PMOSA (SA) optimization algorithms for the test problems FON, DTLZ2B, and KUR with different levels of noise, with 30 repetitions for each combination. Boxes are drawn as in figure 4.

[Figure omitted: six example fronts in (f_1, f_2) space; the panel titles show five runs at 5% noise and one at 10%.]
Figure 6 — Six examples of PMOSA solutions to FON with 5% noise. Found solutions are drawn as circles, the corresponding noiseless objective function values as crosses, and PF_true as a continuous line.

5 Summary

A reference algorithm, MOEA-RF, was implemented in this study in order to compare it with the proposed PMOSA algorithm in the presence of noise. The performance of the implemented MOEA-RF was similar to, but slightly below, the results provided by Goh and Tan in [7]. However, the MOEA-RF algorithm implemented here produced better results than the other MOEAs presented by Goh and Tan. Therefore we can conclude that the implemented MOEA-RF is a representative multiobjective evolutionary algorithm that can be used as a reference point for PMOSA.

It was observed that selecting the optimal parameters for both a MOEA and a MOSA requires some work and experimentation, and even small changes in the algorithm parameters may have a clearly observable effect on the results. The algorithms were not tuned separately for the different benchmark problems; the same parameters were used for solving all the problems.
For most experiments conducted in this study the PMOSA algorithm performed equally well as or better than the MOEA-RF algorithm. The only exception was the FON benchmark problem with low levels of noise. This was caused by the choice of the temperature parameter, which was tuned towards noisier problems.

PMOSA by Mattila et al. [6] proved to be a very promising algorithm for solving different kinds of multiobjective optimization problems, and further work on the algorithm is well justified. Introducing an adaptive sample count and a non-constant annealing schedule are likely to improve the PMOSA algorithm.

References

[1] C. M. Fonseca and P. J. Fleming, "An overview of evolutionary algorithms in multiobjective optimization," Evolutionary Computation, vol. 3, pp. 1–16, 1995.

[2] S. Kirkpatrick, C. D. Gelatt, Jr., and M. P. Vecchi, "Optimization by simulated annealing," Science, vol. 220, pp. 671–680, 1983.

[3] B. Suman and P. Kumar, "A survey of simulated annealing as a tool for single and multiobjective optimization," Journal of the Operational Research Society, vol. 57, pp. 1143–1160, October 2005.

[4] S. Bandyopadhyay, S. Saha, U. Maulik, and K. Deb, "A simulated annealing-based multiobjective optimization algorithm: AMOSA," IEEE Transactions on Evolutionary Computation, vol. 12, no. 3, pp. 269–283, 2008.

[5] K. I. Smith, R. M. Everson, J. E. Fieldsend, C. Murphy, and R. Misra, "Dominance-based multiobjective simulated annealing," IEEE Transactions on Evolutionary Computation, vol. 12, no. 3, pp. 323–342, 2008.

[6] V. Mattila, K. Virtanen, and R. P. Hämäläinen, "Scheduling periodic maintenance of a fighter aircraft fleet using a multi-objective simulation-optimization approach," tech. rep., Helsinki University of Technology, Systems Analysis Laboratory, June 2009.

[7] C. K. Goh and K. C. Tan, "An investigation on noisy environments in evolutionary multiobjective optimization," IEEE Transactions on Evolutionary Computation, vol. 11, no. 3, pp. 354–381, 2007.

[8] M. H. Alrefaei and S. Andradóttir, "A simulated annealing algorithm with constant temperature for discrete stochastic optimization," Management Science, vol. 45, no. 5, pp. 748–764, 1999.

[9] K. Bryan, P. Cunningham, and N. Bolshakova, "Application of simulated annealing to the biclustering of gene expression data," IEEE Transactions on Information Technology in Biomedicine, vol. 10, pp. 519–525, July 2006.

[10] R. Marion, R. Michel, and C. Faye, "Atmospheric correction of hyperspectral data over dark surfaces via simulated annealing," IEEE Transactions on Geoscience and Remote Sensing, vol. 44, pp. 1566–1574, June 2006.

[11] E. Aggelogiannaki and H. Sarimveis, "A simulated annealing algorithm for prioritized multiobjective optimization: implementation in an adaptive model predictive control configuration," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 37, pp. 902–915, August 2007.

[12] K. Deb, "An introduction to genetic algorithms," tech. rep., Indian Institute of Technology Kanpur, Kanpur, India, 1997.

[13] K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan, "A fast and elitist multiobjective genetic algorithm: NSGA-II," IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182–197, 2002.
A Results of all experiments

[Figure omitted: six panels, noise = 0.1%, 1%, 2%, 5%, 10%, 20%.]
Figure 7 — Examples of MOEA-RF solutions to FON with different levels of noise. Found solutions are drawn as circles, the corresponding noiseless objective function values as crosses, and PF_true as a continuous line.

[Figure omitted: six panels as in figure 7.]
Figure 8 — Examples of MOEA-RF solutions to DTLZ2B with different levels of noise. Symbols as in figure 7.

[Figure omitted: six panels as in figure 7.]
Figure 9 — Examples of MOEA-RF solutions to KUR with different levels of noise. Symbols as in figure 7. In reality PF_true is disconnected, with gaps in the places where a straight horizontal line is drawn in the pictures at f_2 = 0 and f_2 = −0.45.

[Figure omitted: six panels as in figure 7.]
Figure 10 — Examples of PMOSA solutions to FON with different levels of noise. Symbols as in figure 7.

[Figure omitted: six panels as in figure 7.]
Figure 11 — Examples of PMOSA solutions to DTLZ2B with different levels of noise. Symbols as in figure 7.

[Figure omitted: six panels as in figure 7.]
Figure 12 — Examples of PMOSA solutions to KUR with different levels of noise. Symbols as in figure 7.

[Figure omitted: four panels showing GD, MS, S, and HVR vs. number of evaluations, one curve per noise level 0.1%–20%.]
Figure 13 — Development of the performance metrics over the number of evaluations in MOEA-RF solutions to FON with different levels of noise.

[Figure omitted: four panels as in figure 13.]
Figure 14 — Development of the performance metrics over the number of evaluations in PMOSA solutions to FON with different levels of noise.
[Figure omitted: four panels as in figure 13.]
Figure 15 — Development of the performance metrics over the number of evaluations in MOEA-RF solutions to DTLZ2B with different levels of noise.

[Figure omitted: four panels as in figure 13.]
Figure 16 — Development of the performance metrics over the number of evaluations in PMOSA solutions to DTLZ2B with different levels of noise.

[Figure omitted: four panels as in figure 13.]
Figure 17 — Development of the performance metrics over the number of evaluations in MOEA-RF solutions to KUR with different levels of noise.

[Figure omitted: four panels as in figure 13.]
Figure 18 — Development of the performance metrics over the number of evaluations in PMOSA solutions to KUR with different levels of noise.

B Implementation of the benchmark problems

B.1 FON.m

function f = FON(X)
% f = FON(X)
%
% The FON test problem for multiobjective optimization.
% Row n, X(n,:), contains one set of input parameters; the
% answers (f1,f2) for inputs X(n,1:8) are in f(n,1:2).
% Input variable range [-2, 2]

f(:,1) = 1 - exp(-1 * sum((X-(1/sqrt(8))).^2,2));
f(:,2) = 1 - exp(-1 * sum((X+(1/sqrt(8))).^2,2));

B.2 KUR.m

function f = KUR(X)
% f = KUR(X)
%
% The KUR test problem for multiobjective optimization.
% Row n, X(n,:), contains one set of input parameters; the
% answers (f1,f2) for inputs X(n,1:3) are in f(n,1:2).
% The objectives are scaled by 1/10 (see section 3.2).
% Input variable range [-5, 5]

f(:,1) = (-10 * exp(-0.2 * sqrt(X(:,1).^2 + X(:,2).^2)) - ...
          10 * exp(-0.2 * sqrt(X(:,2).^2 + X(:,3).^2))) / 10;
f(:,2) = sum(abs(X).^0.8 + 5 * sin(X.^3),2) / 10;

B.3 DTLZ2B.m

function f = DTLZ2B(X)
% f = DTLZ2B(X)
%
% The DTLZ2B test problem for multiobjective optimization.
% Row n, X(n,:), contains one set of input parameters; the
% answers (f1,f2) for inputs X(n,1:5) are in f(n,1:2).
% Input variable range [0, 1]

g = sum((X(:,2:end)-0.5).^2,2);
f = zeros(size(X,1),2);
f(:,1) = cos(X(:,1) .* pi/2) .* (1+g);
f(:,2) = sin(X(:,1) .* pi/2) .* (1+g);
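As a quick sanity check of these implementations, each function maps an N-by-d matrix of decision vectors to an N-by-2 matrix of objective values (a hypothetical driver snippet, not part of the original test harness):

% Hypothetical check of the benchmark implementations.
Xf = unifrnd(-2, 2, 5, 8);  disp(FON(Xf))     % FON: d = 8, range [-2, 2]
Xk = unifrnd(-5, 5, 5, 3);  disp(KUR(Xk))     % KUR: d = 3, range [-5, 5]
Xd = unifrnd( 0, 1, 5, 5);  disp(DTLZ2B(Xd))  % DTLZ2B: d = 5, range [0, 1]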
C Implementation of MOEA-RF algorithm

Enhanced Multiobjective Evolutionary Algorithm (MOEA-RF) defined by Goh and Tan [7].

C.1 moearf.m

function [PF_KNOWN, SOLUTIONS] = moearf(TEST_FUNC, LOWER, UPPER, NOISE)
% [PF_KNOWN, SOLUTIONS] = moearf(TEST_FUNC, LOWER, UPPER, NOISE)
%
% Inputs:
% =======
% TEST_FUNC - function reference to a test function like FON
% LOWER     - Lower limits of decision variables. Length of LOWER sets the
%             number of dimensions in decision variables
% UPPER     - Upper limits of decision variables. Size must agree with LOWER
% NOISE     - standard deviation of normally distributed noise
%
% Outputs:
% ========
% PF_KNOWN  - objective function values of the found Pareto front
% SOLUTIONS - solutions corresponding to the rows of PF_KNOWN

%----------------------------------
% Helper functions
dv_range = max(UPPER) - min(LOWER);
dv_below_zero = -1 * min([0 LOWER]);
gt_bits = 15;

% Convert population matrix from phenotype to genotype
function G = p2g(P)
    G = uint32((P + dv_below_zero) ./ dv_range * (2^(gt_bits+1)-1));
end

% Convert population matrix from genotype to phenotype
function P = g2p(G)
    P = double(G) ./ (2^(gt_bits+1)-1) .* dv_range - dv_below_zero;
end

% Fix limits. Some optimizations may move phenotypes over the genotype range
function P = enforce_limits(P)
    too_low = find(P < repmat(LOWER,size(P,1),1));
    too_high = find(P > repmat(UPPER,size(P,1),1));
    low = repmat(LOWER,size(P,1),1);
    high = repmat(UPPER,size(P,1),1);
    P(too_low) = low(too_low);
    P(too_high) = high(too_high);
end

%---------------------------------
% Configuration
generations = 500;
p_len = 100;
elite_size = 100;
elite_swap = 10;
mutation_prob = 1/15;
% Noise optimizations on/off
ELDP = 1;
GASS = 1;
POSS = 1;

%---------------------------------
% 1. Initialize
population = unifrnd(repmat(LOWER, p_len, 1), repmat(UPPER, p_len, 1));
population_prev = population;
population_prev_prev = population;
pf = [];
pf_solutions = [];

%---------------------------------
% Main Loop
for N = 1:generations
    % 2. Evaluate: compute noiseless objective function values for the
    % population and add noise. Noiseless objective function values are
    % only used to calculate statistics
    perf_without_noise = TEST_FUNC(population);
    perf = perf_without_noise + normrnd(0, NOISE, size(population,1), 2);

    % 2.1 Update archive. Either use the possibilistic archiving model
    % with distance L or plain Pareto ranking
    pf = [pf; perf];
    pf_solutions = [pf_solutions; population];
    if POSS
        L = NOISE;
        pareto = find_pareto_front_possibilistic(pf, L);
    else
        pareto = find_pareto_front(pf);
    end
    pf = pf(pareto,:);
    pf_solutions = pf_solutions(pareto,:);

    % 2.2 Limit the size of the archive by using the diversity operator
    [pf, pf_solutions] = diversity_operator(pf, pf_solutions, ...
                                            elite_size, 1/p_len);

    % 3. Select: do tournament selection with randomized pairs.
    % Don't mess up the order of the population.
    ts_order = randperm(size(perf,1));
    winners = tournament_selection(perf, ts_order);
    perf = perf(winners,:);
    population = population(winners,:);

    % 3.1 Replace random members of the population with best solutions
    % from the archive
    swap = min([elite_swap size(pf_solutions,1) size(population,1)]);
    r_pop = randperm(size(population,1));
    r_pop = r_pop(1:swap);
    r_best = randperm(size(pf_solutions,1));
    r_best = r_best(1:swap);
    population(r_pop,:) = pf_solutions(r_best,:);

    % 4. Crossover
    c_order = randperm(size(population,1));
    population = crossover(population, c_order, 0.5, @p2g, @g2p);
    % 5. Mutate: either ELDP or plain bitwise
    if ELDP
        % Carry some momentum element-wise from the previous generation
        d_min = 0.0;
        d_max = 0.1;
        alpha = 1;
        population_prev_prev = population_prev;
        population_prev = population;
        delta_prev = population_prev - population_prev_prev;
        special_mutate = alpha * abs(delta_prev) > d_min & ...
                         alpha * abs(delta_prev) < d_max;
        population = mutate(population, mutation_prob, @p2g, @g2p);
        population(special_mutate) = population_prev(special_mutate) + ...
                                     alpha * delta_prev(special_mutate);
        population = enforce_limits(population);
    else
        population = mutate(population, mutation_prob, @p2g, @g2p);
    end

    if GASS && N > 150
        % Gene adaptation to either diverge or converge solutions if necessary
        c_limit = 0.6;
        d_limit = 0.6;
        prob = 1/8;
        w = 0.1;
        lowbd = min(pf_solutions);
        uppbd = max(pf_solutions);
        meanbd = mean(pf_solutions);
        values_to_change = find(unifrnd(0,1,size(population)) < prob);
        for K = 1:size(population,2)
            c_values(:,K) = unifrnd(lowbd(K) - w * abs(meanbd(K)), ...
                                    uppbd(K) + w * abs(meanbd(K)), ...
                                    size(population,1), 1);
            d_values(:,K) = unifrnd(meanbd(K) - w * abs(meanbd(K)), ...
                                    meanbd(K) + w * abs(meanbd(K)), ...
                                    size(population,1), 1);
        end
        if size(pf,1) > c_limit * elite_size
            population(values_to_change) = c_values(values_to_change);
        elseif size(pf,1) < d_limit * elite_size
            population(values_to_change) = d_values(values_to_change);
        end
        population = enforce_limits(population);
    end
end

%----------------------------
PF_KNOWN = pf;
SOLUTIONS = pf_solutions;
end
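For reference, a hypothetical single run of moearf on FON with 5% noise, matching the setup of section 3.4 (the instrumentation that records the metrics during the run is omitted here):

% Hypothetical MOEA-RF run on FON with 5% noise.
LOWER = -2 * ones(1, 8);                  % FON decision variable bounds
UPPER =  2 * ones(1, 8);
[pf_known, solutions] = moearf(@FON, LOWER, UPPER, 0.05);
plot(pf_known(:,1), pf_known(:,2), 'o')   % found front in objective space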
niche_radius); most_crowded = find(niche_count == max(niche_count)); to_remove = most_crowded(randperm(length(most_crowded))); to_remove = to_remove(1); PF(to_remove,:) = []; solutions(to_remove,:) = []; end C.6 find pareto front possibilistic.m function winners = find_pareto_front_possibilistic(perf, L) % % Necessity possiblistic pareto front finding algorithm for MOEA-RF % Returns the row indices of given objective function % values that belong to the Pareto set. % % NOTE: This function works only for two dimensional objective functions. len = size(perf,1); dominated = zeros(len,1); mask = ones(len,1); I = 1; while I < len mask(I) = 0; dominated = dominated | ((perf(:,1) - unifrnd(0,L,len,1) >= perf(I,1)) & ... (perf(:,2) - unifrnd(0,L,len,1) >= perf(I,2))) & mask; index_not_dominated = (find(not(dominated))); mask(I) = 1; I=min([index_not_dominated(index_not_dominated > I) ;len]); end winners = find(not(dominated)); C.7 random bit matrix.m function result = random_bit_matrix(s,prob) bits = uint32(rand(s(1), s(2), 15) < prob); result = zeros(s,’uint32’); for I = 1:15 result = result + bits(:,:,I).*2^(I-1); end end 38 D Implementation of PMOSA algorithm Probabilistic Multiobjective Simulated Annealing Algorithm (PMOSA) defined and originally implemented by Mattila et.al [6]. This algorithm is modified to accept different benchmark problems. D.1 sa2ba.m function [S,P,fo] = sa2ba(noise, rstate,r,M,T,gamma, TEST_FUNC, LOWER, UPPER) %Simulated annealing for test problem dtlz2 with uncertainty % % PMOSA algorithm - additive version % a maximum of gamma solutions are included in the non-dominated set %Initialization------------------------d=size(LOWER,2); % % l=LOWER; % u=UPPER; % V=[noise^2 noise^2]; % delta=(max(UPPER) - min(LOWER))/10; % % %r=10; % %T=2; % %gamma=3; % % % less than gamma fo=[]; fout=0; cout=0; %Initialization-end--------------------- Dimension of the vector of decision variables Lower bounds of decision variables Upper bounds Variances of objective function values Amount of change in generating new solutions Number of samples to evaluate objectives Temperature Insert into non-dominated set, if probability of being dominated is %Initial solution----------------------k=1; %rand(’state’,rstate); x=unifrnd(l,u); y=x; fi=TEST_FUNC(x); % % % % Random state for initial solution Initial solution Store Actual objective function values e1=normrnd(0,sqrt(V(1)),1,r); e2=normrnd(0,sqrt(V(2)),1,r); % Uncertainties f=[mean(fi(1)+e1) mean(fi(2)+e2)]; v=[var(fi(1)+e1)/r var(fi(2)+e2)/r]; % estimated ofvs % variance estimates of means fx=f; vx=v; % Store px=0; py=px; % Probability that x is dominated by % the current S % Store p=1; P=0; % Acceptance probability % Mutual probabilities of domination acc=1; ins=1; % Accept as current solution % Insert into non-dominated set S=[x fx vx]; Sn=1; % Currently selected non-dominated set % Size of S 39 %Initial solution-end------------------%Iteratation---------------------------while k<M k=k+1; %Generate new candidate solution y=gen(x,delta,l,u); %Evaluate fi=TEST_FUNC(y); e1=normrnd(0,sqrt(V(1)),1,r); e2=normrnd(0,sqrt(V(2)),1,r); f=[mean(fi(1)+e1) mean(fi(2)+e2)]; v=[var(fi(1)+e1)/r var(fi(2)+e2)/r]; I=find(ismember(S(:,1:d),x,’rows’)); % Actual objective function values % Uncertainties % estimated ofvs % variance estimates of means % Position of x in S %Determine energy of x as probability of being dominated by S and y if isempty(I) % x is not in S px=0; for i=1:size(S,1) px=px+pdom(S(i,d+1:d+2),S(i,d+3:d+4),fx,vx); end else % x is in S 
px=sum(P(I,:)); end px=px+pdom(f,v,fx,vx); % Account for y %Determine energy of y as probability of being dominated by S and x py=0; for i=1:size(S,1) py=py+pdom(S(i,d+1:d+2),S(i,d+3:d+4),f,v); end %Insertion into non-dominated set if Sn<gamma || py<max(sum(P,2)) ins=1; else ins=0; end %Remove the most dominated element if the size of S exceeds gamma if ins==1 && Sn==gamma [z,I2]=max(sum(P,2)); I3=[1:I2-1 I2+1:Sn]; S=S(I3,:); P=P(I3,I3); Sn=Sn-1; end %Update probabilities of being dominated within S if ins==1; S(end+1,:)=[y f v]; Sn=Sn+1; P(Sn,Sn)=0; %Probability that y dominates y for i=1:Sn-1 %Probability that y dominates ith member of S P(i,Sn)=pdom(S(Sn,d+1:d+2),S(Sn,d+3:d+4),S(i,d+1:d+2),S(i,d+3:d+4)); %Probability that ith member of S dominates i P(Sn,i)=pdom(S(i,d+1:d+2),S(i,d+3:d+4),S(Sn,d+1:d+2),S(Sn,d+3:d+4)); 40 end end %Acceptance of y if isempty(I) py=py+pdom(fx,vx,f,v); end %x is not in S %Account for x acc=0; if py<=px acc=1; p=1; else p=exp(-(py-px)/T); if rand<=p acc=1; end end %If accepted, set current solution, if acc==1 x=y; fx=f; vx=v; end %Command line output if cout fprintf(’k: %2d f1: %4.3f f2: %4.3f px: %4.3f py: %4.3f’ ... ’p: %4.3f acc: %1d ins: %1d #S: %2d \n’, ... k,f(1),f(2),px,py,p,acc,ins,size(S,1)); end %Save values if fout fo(end+1,:)=[px py]; end end %Iteratation end------------------------ D.2 gen.m function f=gen(x,d,l,u) % % % % Generate new solution from x l lower bounds u upper bounds Laplace distribution - one element perturbed k=unidrnd(length(x)); x(k)=x(k)+laprnd(d); f=x; %Restore feasibility if f(k)<l(k) f(k)=2*l(k)-f(k); elseif f(k)>u(k) f(k)=2*u(k)-f(k); end 41 D.3 pdom.m function f=pdom(x1,s1,x2,s2) %Probability that design 1 dominates design 2 in context of minimization %x1,x2 vector of estimates for epected objective function values %s1,s2 vector of variance estimates %Obervations of ofvs assumed normally distributed f=prod(normcdf(0,x1-x2,sqrt(s1+s2))); D.4 laprnd.m function x=laprnd(b) % Generates Laplace distributed random variates with % location parameter equal to 0 % scale parameter equal to b [m,n]=size(b); u=unifrnd(-1/2,1/2,m,n); x=-b.*sign(u).*log(1-2*abs(u)); 42 E Implementation of metrics Performance metrics defined by Goh and Tan [7]. E.1 stats.m function [GD, MS, S, HVR] = stats(pf_known, pf_true) % % Scalar performace metrics GD, MS, S, and HVR between two % sets of two vectors of two dimensional objective function values. % GD - Generational Distance klen=size(pf_known,1); tlen=size(pf_true,1); fcount=size(pf_true,2); gdsum = 0; for I = 1:klen gdsum = gdsum + min(sum( (pf_true - repmat(pf_known(I,:),tlen,1)).^2,2)); end GD = sqrt(gdsum / klen); % MS - Maximum Spread mssum = 0; for I = 1:fcount mssum = mssum + ((min(max(pf_known(:,I)), max(pf_true(:,I))) - ... max(min(pf_known(:,I)), min(pf_true(:,I)))) / ... (max(pf_true(:,I)) - min(pf_true(:,I))))^2; end MS = sqrt(mssum / fcount); % S - Spacing distances = squareform(pdist(pf_known)); distances = distances + eye(size(distances)) * 10; min_distances = min(distances); d_avg = mean(min_distances); ssum = sum((min_distances-d_avg).^2); S = sqrt(ssum / klen) / klen; % HVR - Hypervolume Ratio % Volume objective space that is dominated by pareto front. % Calculated by creating a grid of zeros that fills the objective space % and then going through the pareto fronts and setting all grid points to 1 % that are dominated by pareto points if size(pf_known,2) == 2 grid_points = combvec(min(pf_true(:,1)):0.01:max(pf_true(:,1)), ... 
min(pf_true(:,2)):0.01:max(pf_true(:,2)))’; grid_known = zeros(size(grid_points)); grid_true = zeros(size(grid_points)); for I = 1 : size(pf_known,1); grid_known(find(all([grid_points(:,1) >= pf_known(I,1), ... grid_points(:,2) >= pf_known(I,2)],2)),:) = 1; end for I = 1 : size(pf_true,1); grid_true(find(all([grid_points(:,1) >= pf_true(I,1), ... grid_points(:,2) >= pf_true(I,2)],2)),:) = 1; end HV_known = sum(sum(grid_known)); HV_true = sum(sum(grid_true)); HVR = HV_known / HV_true; else HVR = 0; end end 43