# GECCO '18- Proceedings of the Genetic and Evolutionary Computation Conference

Full Citation in the ACM Digital Library## SESSION: Theory

### A tight runtime analysis for the (μ + λ) EA

Despite significant progress in the theory of evolutionary algorithms, the theoretical understanding of true population-based evolutionary algorithms remains challenging and only few rigorous results exist. Already for the most basic problem, the determination of the asymptotic runtime of the (μ + λ) evolutionary algorithm on the simple OneMax benchmark function, only the special cases μ = 1 and λ = 1 have been solved.

In this work, we analyze this long-standing problem and show the asymptotically tight result that the runtime *T*, the number of iterations until the optimum is found, satisfies

[EQUATION]

where log^{+} *x* := max{1, log *x*} for all *x* > 0.

### A new analysis method for evolutionary optimization of dynamic and noisy objective functions

Evolutionary algorithms, being problem-independent and randomized heuristics, are generally believed to be robust to dynamic changes and noisy access to the problem instance. We propose a new method to obtain rigorous runtime results for such settings. In contrast to many previous works, our new approach mostly relies on general parameters of the dynamics or the noise models, such as the expected change of the dynamic optimum or the probability to have a dynamic change in one iteration. Consequently, we obtain bounds which are valid for large varieties of such models. Despite this generality, for almost all particular models regarded in the past our bounds are stronger than those given in previous works. As one particular result, we prove that the (1 + λ) EA can optimize the OneMax benchmark function efficiently despite a constant rate of 1-bit flip noise. For this, a logarithmic size offspring population suffices (the previous-best result required a super-linear value of λ). Our results suggest that the typical way to find the optimum in such adverse settings is not via a steady approach of the optimum, but rather via an exceptionally fast approach after waiting for a rare phase of low dynamic changes or noise.

### Runtime analysis for self-adaptive mutation rates

We propose and analyze a self-adaptive version of the (1, λ) evolutionary algorithm in which the current mutation rate is part of the individual and thus also subject to mutation. A rigorous runtime analysis on the O*O*(*n*λ/log λ + *n* log *n*). This time is asymptotically smaller than the optimization time of the classic (1, λ) EA and (1 + λ) EA for all static mutation rates and best possible among all λ-parallel mutation-based unbiased black-box algorithms.

Our result shows that self-adaptation in evolutionary computation can find complex optimal parameter settings on the fly. At the same time, it proves that a relatively complicated self-adjusting scheme for the mutation rate proposed by Doerr et al. (GECCO 2017) can be replaced by our simple endogenous scheme. Moreover, the paper contributes new tools for the analysis of the two-dimensional drift processes arising in self-adaptive EAs, including bounds on occupation probabilities in processes with non-constant drift.

### Significance-based estimation-of-distribution algorithms

*Estimation-of-distribution algorithms* (EDAs) are randomized search heuristics that maintain a stochastic model of the solution space. This model is updated from iteration to iteration based on the quality of the solutions sampled according to the model. As previous works show, this short-term perspective can lead to erratic updates of the model, in particular, to bit-frequencies approaching a random boundary value. This can lead to significant performance losses.

In order to overcome this problem, we propose a new EDA that takes into account a longer history of samples and updates its model only with respect to information which it classifies as statistically significant. We prove that this *significance-based compact genetic algorithm* (sig-cGA) optimizes the common benchmark functions O*O*(*n* log *n*) time, a result shown for no other EDA or evolutionary algorithm so far. For the recently proposed scGA - an EDA that tries to prevent erratic model updates by imposing a bias to the uniformly distributed model - we prove that it optimizes O*ρ*.

### The linear hidden subset problem for the (1 + 1) EA with scheduled and adaptive mutation rates

We study unbiased (1 + 1) evolutionary algorithms on linear functions with an unknown number *n* of bits with non-zero weight. Static algorithms achieve an optimal runtime of *O*(*n*(ln *n*)^{2+ε)}, however, it remained unclear whether more dynamic parameter policies could yield better runtime guarantees. We consider two setups: one where the mutation rate follows a fixed schedule, and one where it may be adapted depending on the history of the run. For the first setup, we give a schedule that achieves a runtime of (1±*o*(1))*βn* ln *n*, where β ≈ 3.552, which is an asymptotic improvement over the runtime of the static setup. Moreover, we show that no schedule admits a better runtime guarantee and that the optimal schedule is essentially unique. For the second setup, we show that the runtime can be further improved to (1 ± *o*(1))*en* ln *n*, which matches the performance of algorithms that know *n* in advance.

Finally, we study the related model of initial segment uncertainty with static position-dependent mutation rates, and derive asymptotically optimal lower bounds. This answers a question by Doerr, Doerr, and Kötzing.

### Medium step sizes are harmful for the compact genetic algorithm

We study the intricate dynamics of the Compact Genetic Algorithm (cGA) on O*K*, that determines how quickly decisions about promising bit values are fixed in the probabilistic model. It is known that cGA and UMDA, a related algorithm, run in expected time *O*(*n* log *n*) when the step size is just small enough [EQUATION] to avoid wrong decisions being fixed. UMDA also shows the same performance in a very different regime (equivalent to *K* = Θ(log *n*) in the cGA) with much larger steps sizes, but for very different reasons: many wrong decisions are fixed initially, but then reverted efficiently.

We show that step sizes in between these two optimal regimes are harmful as they yield larger runtimes: we prove a lower bound Ω(*K*^{1/3}*n*+*n* log *n*) for the cGA on O*K* = Ω(log^{3} *n*) the runtime increases with growing *K* before dropping again to [EQUATION] for [EQUATION]. This suggests that the expected runtime for cGA is a bimodal function in *K* with two very different optimal regions and worse performance in between.

### Analysis of noisy evolutionary optimization when sampling fails

In noisy evolutionary optimization, sampling is a common strategy to deal with noise, which evaluates the fitness of a solution multiple times (called *sample size*) independently and then uses the average to approximate the true fitness. Previous studies mainly focused on the empirical design of efficient sampling strategies, and the few theoretical analyses mainly proved the effectiveness of sampling with a fixed sample size in some situations. There are many fundamental theoretical issues to be addressed. In this paper, we first investigate the effect of sample size. By analyzing the (1+1)-EA on noisy LeadingOnes, we show that as the sample size increases, the running time can reduce from exponential to polynomial, but then return to exponential. This discloses that a proper sample size is crucial in practice. Then, we investigate what other strategies can work when sampling with any fixed sample size fails. By two illustrative examples, we prove that using parent populations can be better, and if using parent populations is also ineffective, adaptive sampling (i.e., sampling with an adaptive sample size) can work.

### Runtime analysis of randomized search heuristics for the dynamic weighted vertex cover problem

Randomized search heuristics such as evolutionary algorithms are frequently applied to dynamic combinatorial optimization problems. Within this paper, we present a dynamic model of the classic Weighted Vertex Cover problem and analyze the performances of the two well-studied algorithms Randomized Local Search and (1+1) EA adapted to it, to contribute to the theoretical understanding of evolutionary computing for problems with dynamic changes. In our investigations, we use an edge-based representation based on the dual formulation of the problem and study the expected runtimes that the two algorithms require to maintain a 2-approximate solution when the given weighted graph is modified by an edge-editing or weight-editing operation. Considering the weights on the vertices may be exponentially large with respect to the size of the graph, the step size adaption strategy is incorporated. Our results show that both algorithms can recompute 2-approximate solutions for the studied dynamic changes efficiently

### On the robustness of evolutionary algorithms to noise: refined results and an example where noise helps

We present refined results for the expected optimisation time of the (1+1) EA and the (1+λ) EA on L*p*. Previous work showed that the (1+1) EA runs in polynomial time if *p* = *O*((log *n*)/n^{2}) and needs superpolynomial time if *p* = *Ω*((log *n*)/*n*), leaving a huge gap for which no results were known. We close this gap by showing that the expected optimisation time is Θ(*n*^{2}) · exp(Θ(*pn*^{2})), allowing for the first time to locate the threshold between polynomial and superpolynomial expected times at *p* = Θ((log *n*)/*n*^{2}). Hence the (1+1) EA on L*n* can effectively deal with much higher noise than known before.

Finally, we present an example of a rugged landscape where prior noise can help to escape from local optima by blurring the landscape and allowing a hill climber to see the underlying gradient.

### Crossover can simulate bounded tree search on a fixed-parameter tractable optimization problem

We investigate the effect of crossover in the context of parameterized complexity on a well-known fixed-parameter tractable combinatorial optimization problem known as the *closest string problem.* We prove that a multi-start (μ+1) solves arbitrary length-*n* instances of closest string in 2^{O(d2+d log k}). poly(*n*) steps in expectation. Here, *k* is the number of strings in the input set, and *d* is the value of the optimal solution. This confirms that the multi-start (μ+1) runs in randomized *fixed-parameter tractable* (FPT) time with respect to the above parameterization. On the other hand, if the crossover operation is disabled, we show there exist instances that require *n*^{Ω(log(d+k)} steps in expectation. The lower bound asserts that crossover is a necessary component in the FPT running time.

### Domino convergence: why one should hill-climb on linear functions

In the theory community of evolutionary computation, linear pseudo-boolean functions are often regarded as easy problems since all of them can be optimized in expected time *O*(*n* log *n*) by simple unbiased algorithms. However, results from genetic algorithms and estimation-of-distribution algorithms indicate that these algorithms treat different linear functions differently. More precisely, an effect called "domino convergence" is described in the literature, which means that bits of large weight in the linear function are optimized earlier than bits of low weight. Hence, different linear functions may lead to rather different expected optimization times.

The present paper conducts a study of domino convergence. By rigorous runtime analyses, it is shown that domino convergence is mostly a consequence of the crossover underlying genetic algorithms and EDAs. Here a performance gap of order Ω(*n*/log *n*) between different linear functions is proved. In simple mutation-only EAs the effect of domino convergence is much less pronounced, with the typical performance gap being logarithmic in the population size. The effect disappears when population size 1 is used and the algorithm is reduced to hillclimbing. Different selection mechanisms, including cut and tournament selection are investigated and their impact on domino convergence is analyzed.