# GECCO '18: Proceedings of the Genetic and Evolutionary Computation Conference

## SESSION: Evolutionary numerical optimization

### Drift theory in continuous search spaces: expected hitting time of the (1 + 1)-ES with 1/5 success rule

This paper explores the use of the standard approach for proving runtime bounds in discrete domains, often referred to as drift analysis, in the context of optimization on a continuous domain. Using this framework we analyze the (1+1) Evolution Strategy with one-fifth success rule on the sphere function. To deal with potential functions that are not lower-bounded, we formulate novel drift theorems. We then use the theorems to prove bounds on the expected hitting time to reach a certain target fitness in finite dimension *d*. The bounds are akin to linear convergence. We then study the dependency of the different terms on *d*, proving a convergence rate dependency of Θ(1/*d*). Our results constitute the first non-asymptotic analysis for the algorithm considered as well as the first explicit application of drift analysis to a randomized search heuristic with continuous domain.
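The algorithm analyzed in this abstract can be illustrated with a minimal sketch. The step-size factors `STEP_UP` and `STEP_DOWN`, the starting point, the initial step size, and the function name `one_plus_one_es` are assumptions chosen for illustration (a common textbook variant), not the exact setting analyzed in the paper. The returned evaluation count plays the role of the hitting time for the target fitness.

```python
import math
import random

# Assumed step-size factors: sigma is stationary exactly when the success
# rate is 1/5, because (1/5) * (1/3) - (4/5) * (1/12) = 0.
STEP_UP = math.exp(1.0 / 3.0)
STEP_DOWN = math.exp(-1.0 / 12.0)


def sphere(x):
    """Sphere function: sum of squared coordinates."""
    return sum(v * v for v in x)


def one_plus_one_es(d, target, sigma=1.0, seed=0, max_evals=100_000):
    """Run a (1+1)-ES with a one-fifth success rule on the sphere function.

    Returns the final fitness, the final step size, and the number of
    offspring evaluations used (the hitting time if the target was reached).
    """
    rng = random.Random(seed)
    x = [1.0] * d  # hypothetical starting point
    fx = sphere(x)
    evals = 0
    while fx > target and evals < max_evals:
        # Sample one offspring from an isotropic Gaussian around the parent.
        y = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        fy = sphere(y)
        evals += 1
        if fy <= fx:
            # Success: accept the offspring and enlarge the step size.
            x, fx = y, fy
            sigma *= STEP_UP
        else:
            # Failure: keep the parent and shrink the step size.
            sigma *= STEP_DOWN
    return fx, sigma, evals
```

Running the sketch for several dimensions and comparing the evaluation counts gives an informal feel for how the hitting time grows with *d*, although a single run is of course no substitute for the bounds proved in the paper.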

### Analysis of evolution strategies with the optimal weighted recombination

This paper studies the performance of evolution strategies with the optimal weighted recombination on spherical problems in finite dimensions. We first discuss the different forms of functions that are used to derive the optimal recombination weights and step size, and then derive an inequality that establishes the relationship between these functions. We prove that using the expectation of random variables to derive the optimal recombination weights and step size can be disappointing in terms of the expected performance of evolution strategies. We show that using the realizations of random variables is a better choice. We generalize the results to arbitrary convex functions and establish an inequality for the normalized quality gain. We prove that evolution strategies achieve a better and more robust normalized quality gain when they use the optimal recombination weights and the optimal step size derived from the realizations of random variables rather than from their expectations.

### An empirical comparison of metamodeling strategies in noisy environments

Metamodeling plays an important role in simulation-based optimization by providing computationally inexpensive approximations for the objective and constraint functions. Metamodeling can also serve to filter noise, which is inherent in many simulation problems and can mislead optimization algorithms. In this paper, we conduct a thorough statistical comparison of four popular metamodeling methods with respect to their approximation accuracy at various levels of noise. We use six scalable benchmark problems from the optimization literature as our test suite. The problems have been chosen to represent different types of fitness landscapes, namely, bowl-shaped, valley-shaped, steep ridges and multi-modal, all of which can significantly influence the impact of noise. Each metamodeling technique is used in combination with four different noise handling techniques that are commonly employed by practitioners in the field of simulation-based optimization. The goal is to identify the *metamodeling strategy*, i.e. a combination of metamodeling and noise handling, that performs significantly better than others on the fitness landscapes under consideration. We also demonstrate how these results carry over to a simulation-based optimization problem concerning a scalable discrete event model of a simple but realistic production line.

### Learning-based topology variation in evolutionary level set topology optimization

The main goal in structural Topology Optimization is to find an optimal distribution of material within a defined design domain, under specified boundary conditions. This task is frequently solved with gradient-based methods, but for some problems, e.g. in the domain of crash Topology Optimization, analytical sensitivity information is not available. The recent Evolutionary Level Set Method (EA-LSM) uses Evolutionary Strategies and a representation based on geometric Level Set Functions to solve such problems. However, computational costs associated with Evolutionary Algorithms are relatively high and grow significantly with rising dimensionality of the optimization problem. In this paper, we propose an improved version of EA-LSM, exploiting an adaptive representation, where the number of structural components increases during the optimization. We employ a learning-based approach, where a pre-trained neural network model predicts favorable topological changes, based on the structural state of the design. The proposed algorithm converges quickly at the beginning, determining good designs in low-dimensional search spaces, and the representation is gradually extended by increasing structural complexity. The approach is evaluated on a standard minimum compliance design problem and its superiority with respect to a random adaptive method is demonstrated.

### A global information based adaptive threshold for grouping large scale optimization problems

By taking the idea of divide-and-conquer, cooperative coevolution (CC) provides a powerful architecture for large scale global optimization (LSGO) problems, but its efficiency relies heavily on the decomposition strategy. It has been shown that differential grouping (DG) performs well on decomposing LSGO problems by effectively detecting the interaction among decision variables. However, its decomposition accuracy depends strongly on the threshold. To improve the decomposition accuracy of DG, a global information based adaptive threshold setting algorithm (GIAT) is proposed in this paper. On the one hand, by reducing the sensitivities of the indicator in DG to the roundoff error and the magnitude of contribution weight of subcomponent, we propose a new indicator for two variables which is much more sensitive to their interaction. On the other hand, instead of setting the threshold only based on one pair of variables, the threshold is generated from the interaction information for all pairs of variables. By conducting experiments on two sets of LSGO benchmark functions, the correctness and robustness of this new indicator and GIAT were verified.
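To make the role of the threshold concrete, the following is a minimal sketch of the basic pairwise interaction check that underlies differential grouping. It is not the GIAT indicator proposed in the paper. The function name `interacts`, the perturbation size `delta`, and the fixed threshold `eps` are assumptions for illustration.

```python
def interacts(f, x, i, j, delta, eps):
    """Basic differential-grouping style check for variables i and j.

    The effect of perturbing x[i] is measured twice: once at x and once
    after x[j] has also been perturbed. If the two effects differ by more
    than the threshold eps, the variables are flagged as interacting.
    """
    def shifted(base, idx):
        # Return a copy of base with coordinate idx increased by delta.
        y = list(base)
        y[idx] += delta
        return y

    base = list(x)
    effect_at_x = f(shifted(base, i)) - f(base)
    moved_j = shifted(base, j)
    effect_after_j = f(shifted(moved_j, i)) - f(moved_j)
    return abs(effect_at_x - effect_after_j) > eps
```

For a hypothetical objective such as `lambda x: x[0] * x[1] + x[2] ** 2`, the check flags variables 0 and 1 as interacting and variables 0 and 2 as separable. In floating-point arithmetic the difference for separable variables is generally small but nonzero, which is why the choice of `eps` matters.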

### Inheritance-based diversity measures for explicit convergence control in evolutionary algorithms

Diversity is an important factor in evolutionary algorithms to prevent premature convergence towards a single local optimum. Various means of maintaining diversity throughout the process of evolution exist in the literature. We analyze approaches to diversity that (a) have an explicit and quantifiable influence on fitness at the individual level and (b) require no (or very little) additional domain knowledge such as domain-specific distance functions. We also introduce the concept of genealogical diversity in a broader study. We show that employing these approaches can help evolutionary algorithms for global optimization in many cases.

### Expanding variational autoencoders for learning and exploiting latent representations in search distributions

In the past, evolutionary algorithms (EAs) that use probabilistic modeling of the best solutions incorporated latent or hidden variables into the models as a more accurate way to represent the search distributions. Recently, a number of neural-network models that compute approximations of posterior (latent variable) distributions have been introduced. In this paper, we investigate the use of the variational autoencoder (VAE), a class of neural-network based generative model, for modeling and sampling search distributions as part of an estimation of distribution algorithm (EDA). We show that VAE can capture dependencies between decision variables and objectives. This feature is proven to improve the sampling capacity of model-based EAs. Furthermore, we extend the original VAE model by adding a new, fitness-approximating network component. We show that it is possible to adapt the architecture of these models and we present evidence of how to extend VAEs to better fulfill the requirements of probabilistic modeling in EAs. While our results are not yet competitive with state-of-the-art probabilistic-based optimizers, they represent a promising direction for the application of generative models within EDAs.

### Real-valued evolutionary multi-modal optimization driven by hill-valley clustering

Model-based evolutionary algorithms (EAs) adapt an underlying search model to features of the problem at hand, such as the linkage between problem variables. The performance of EAs often deteriorates as multiple modes in the fitness landscape are modelled with a unimodal search model. The number of modes is however often unknown a priori, especially in a black-box setting, which complicates adaptation of the search model. In this work, we focus on models that can adapt to the multi-modality of the fitness landscape. Specifically, we introduce Hill-Valley Clustering, a remarkably simple approach to adaptively cluster the search space in niches, such that a single mode resides in each niche. In each of the located niches, a core search algorithm is initialized to optimize that niche. Combined with an EA and a restart scheme, the resulting Hill-Valley EA (HillVallEA) is compared to current state-of-the-art niching methods on a standard benchmark suite for multi-modal optimization. Numerical results in terms of the detected number of global optima show that, in spite of its simplicity, HillVallEA is competitive within the limited budget of the benchmark suite, and shows superior performance in the long run.
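The idea behind a hill-valley style test can be sketched as follows, assuming minimization: two solutions are taken to share a niche if no sampled point on the line segment between them is worse than both endpoints. The function name `same_niche` and the default number of test points `n_test` are assumptions for illustration and may differ from the paper's exact procedure.

```python
def same_niche(f, x, y, n_test=5):
    """Hill-valley style test for minimization.

    Evaluates n_test evenly spaced interior points on the segment between
    x and y. If any of them has a worse (larger) fitness than both
    endpoints, a hill separates x and y, so they are assigned to
    different niches.
    """
    worst_endpoint = max(f(x), f(y))
    for k in range(1, n_test + 1):
        t = k / (n_test + 1)
        point = [yi + t * (xi - yi) for xi, yi in zip(x, y)]
        if f(point) > worst_endpoint:
            return False
    return True
```

On a hypothetical one-dimensional objective with minima at -1 and 1, such as `lambda x: (x[0] ** 2 - 1) ** 2`, the test separates `[-1.0]` from `[1.0]` and groups `[0.9]` with `[1.1]`. Each call costs up to `n_test` extra function evaluations, which counts against a limited evaluation budget.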

### PSA-CMA-ES: CMA-ES with population size adaptation

The population size, i.e., the number of candidate solutions generated at each iteration, is the most critical strategy parameter in the covariance matrix adaptation evolution strategy, CMA-ES, which is one of the state-of-the-art search algorithms for black-box continuous optimization. The population size is required to be larger than its default value when the objective function is well-structured multimodal and/or noisy, while we want to keep it as small as possible for optimization speed. However, strategy parameter tuning based on trial and error is, in general, prohibitively expensive in a black-box optimization scenario. This paper proposes a novel strategy to adapt the population size for CMA-ES. The population size is adapted based on the estimated accuracy of the update of the normal distribution parameters. The CMA-ES with the proposed population size adaptation mechanism, PSA-CMA-ES, is tested both on noiseless and noisy benchmark functions, and compared with existing strategies. The results revealed that the PSA-CMA-ES works well on well-structured multimodal and/or noisy functions, but causes an inefficient increase of the population size on unimodal functions. Furthermore, it is shown that the PSA-CMA-ES can tackle noise and multimodality at the same time.

### Performance improvements for evolutionary strategy-based one-class constraint synthesis

Mathematical Programming (MP) models are common in optimization of systems. Designing those models, however, is challenging for human experts facing deficiencies in domain knowledge, rigorous technical requirements for the model (e.g., linearity) or lack of experience. Evolutionary Strategy-based One-Class Constraint Synthesis (ESOCCS) is a recently proposed method for computer-aided modeling, aimed at reducing the burden on the expert by acquiring the MP constraints from historical data and letting the expert freely modify them, supplement them with an objective function, and optimize using an off-the-shelf solver. In this study, we extend ESOCCS with five improvements aimed at increasing its performance in typical problems. Three of them turn out to be beneficial in a rigorous experimental evaluation and one prevents ESOCCS from producing degenerate models.

### A novel similarity-based mutant vector generation strategy for differential evolution

The mutant vector generation strategy is an essential component of Differential Evolution (

### Adaptive threshold parameter estimation with recursive differential grouping for problem decomposition

Problem decomposition plays an essential role in the success of cooperative co-evolution (CC) when used for solving large-scale optimization problems. The recently proposed *recursive differential grouping* (RDG) method has been shown to be very efficient, especially in terms of time complexity. However, it requires an appropriate parameter setting to estimate a threshold value in order to determine if two subsets of decision variables interact or not. Furthermore, using one global threshold value may be insufficient to identify variable interactions in components with different contributions to the fitness value. Inspired by the *differential grouping 2* (DG2) method, in this paper, we adaptively estimate a threshold value based on computational round-off errors for RDG. We derive an upper bound of the round-off errors, which is shown to be sufficient when identifying variable interactions across a wide range of large-scale benchmark problems. Comprehensive numerical experimental results showed that the proposed RDG2 method achieved higher decomposition accuracy than RDG and DG2. When embedded into a CC framework and used to solve the benchmark problems, it achieved statistically equal or significantly better solution quality than RDG and DG2.

### Analysis of information geometric optimization with isotropic Gaussian distribution under finite samples

In this article, we theoretically investigate the convergence properties of the information geometric optimization (IGO) algorithm given the family of isotropic Gaussian distributions on the sphere function. Unlike previous studies, where the exact natural gradient is taken, i.e., infinitely many samples are assumed, we consider the case where the natural gradient is estimated from finite samples. We derive the rates of the expected decrease of the squared distance to the optimum and the variance parameter as functions of the learning rates, dimension, and sample size. From these rates we deduce that the rates of decrease of the squared distance to the optimum and of the variance parameter must agree for geometric convergence of the algorithm. In other words, the ratio between the squared distance to the optimum and the variance must be stable, which is observed empirically but was not derived in previous theoretical studies. We further derive the condition on the learning rates under which the rates of decrease agree and derive the stable value of the ratio. We confirm in simulation that the derived rates of decrease and the stable value of the ratio approximate the behavior of the IGO algorithm well.