# GECCO '19- Proceedings of the Genetic and Evolutionary Computation Conference

Full Citation in the ACM Digital Library## SESSION: Evolutionary combinatorial optimization and metaheuristics

### Algorithm selection using deep learning without feature extraction

We propose a novel technique for algorithm-selection which adopts a deep-learning approach, specifically a Recurrent-Neural Network with Long-Short-Term-Memory (RNN-LSTM). In contrast to the majority of work in algorithm-selection, the approach does *not need any features* to be extracted from the data but instead relies on the temporal data sequence as input. A large case-study in the domain of 1-d bin packing is undertaken in which instances can be solved by one of four heuristics. We first evolve a large set of new problem instances that each have a clear "best solver" in terms of the heuristics considered. An RNN-LSTM is trained directly using the sequence data describing each instance to predict the best-performing heuristic. Experiments conducted on small and large problem instances with item sizes generated from two different probability distributions are shown to achieve between 7% to 11% improvement over the single best solver (SBS) (i.e. the single heuristic that achieves the best performance over the instance set) and 0% to 2% lower than the virtual best solver (VBS), i.e the perfect mapping.

### Route planning for cooperative air-ground robots with fuel constraints: an approach based on CMSA

Limited payload capacity on small unmanned aerial vehicles (UAVs) results in restricted flight time. In order to increase the operational range of UAVs, recent research has focused on the use of mobile ground charging stations. The cooperative route planning for both aerial and ground vehicles (GVs) is strongly coupled due to fuel constraints of the UAV, terrain constraints of the GV and the speed differential of the two vehicles. This problem is, in general, an NP-hard combinatorial optimization problem. Existing polynomial-time solution approaches make a trade-off in solution quality for large-scale scenarios and generate solutions with large relative gaps (up to 50%) from known lower bounds. In this work, we employ a hybrid metaheuristic known as Construct, Merge, Solve & Adapt (CMSA) in order to develop a scalable and computationally efficient solution approach. We discuss results for large scale scenarios and provide a comparative analysis with the current state-of-the-art.

### On inversely proportional hypermutations with mutation potential

Artificial Immune Systems (AIS) employing hypermutations with linear static mutation potential have recently been shown to be very effective at escaping local optima of combinatorial optimisation problems at the expense of being slower during the exploitation phase compared to standard evolutionary algorithms. In this paper, we prove that considerable speed-ups in the exploitation phase may be achieved with dynamic inversely proportional mutation potentials (IPM) and argue that the potential should decrease inversely to the distance to the optimum rather than to the difference in fitness. Afterwards, we define a simple (1+1) Opt-IA that uses IPM hypermutations and ageing for realistic applications where optimal solutions are unknown. The aim of this AIS is to approximate the ideal behaviour of the inversely proportional hypermutations better and better as the search space is explored. We prove that such desired behaviour and related speed-ups occur for a well-studied bimodal benchmark function called TwoMax. Furthermore, we prove that the (1+1) Opt-IA with IPM efficiently optimises a second multimodal function, Cliff, by escaping its local optima while Opt-IA with static mutation potential cannot, thus requires exponential expected runtime in the distance between the local and global optima.

### Adaptive large neighborhood search for scheduling of mobile robots

Our work addresses the scheduling of mobile robots for transportation and processing of operations on machines in a flexible manufacturing system. Both mobile robots and automated guided vehicles (AGVs) can transport components among machines in the working space. Nevertheless, the difference is that mobile robots considered in this work can process specific value-added operations, which is not possible for AGVs. This new feature increases complexity as well as computational demands. To summarize, we need to compute a sequence of operations on machines, the robot assignments for transportation, and the robot assignments for processing. The main contribution is the proposal of an adaptive large neighborhood search algorithm with the sets of exploration and exploitation heuristics to solve the problem considering makespan minimization. Experimental evaluation is presented on the existing benchmarks. The quality of our solutions is compared to a heuristic based on genetic algorithm and mixed-integer programming proposed recently. The comparison shows that our approach can achieve comparable results in real time which is in order of magnitude faster than the earlier heuristic.^{1}

### Fast re-optimization via structural diversity

When a problem instance is perturbed by a small modification, one would hope to find a good solution for the new instance by building on a known good solution for the previous one. Via a rigorous mathematical analysis, we show that evolutionary algorithms, despite usually being robust problem solvers, can have unexpected difficulties to solve such re-optimization problems. When started with a random Hamming neighbor of the optimum, the (1+1) evolutionary algorithm takes Ω(*n*^{2}) time to optimize the LeadingOnes benchmark function, which is the same asymptotic optimization time when started in a randomly chosen solution. There is hence no significant advantage from re-optimizing a structurally good solution.

We then propose a way to overcome such difficulties. As our mathematical analysis reveals, the reason for this undesired behavior is that during the optimization structurally good solutions can easily be replaced by structurally worse solutions of equal or better fitness. We propose a simple diversity mechanism that prevents this behavior, thereby reducing the re-optimization time for LeadingOnes to *O*(*γδn*), where *γ* is the population size used by the diversity mechanism and *δ ≤ γ* the Hamming distance of the new optimum from the previous solution. We show similarly fast re-optimization times for the optimization of linear functions with changing constraints and for the minimum spanning tree problem.

### When resampling to cope with noise, use median, not mean

Due to their randomized nature, many nature-inspired heuristics are robust to some level of noise in the fitness evaluations. A common strategy to increase the tolerance to noise is to re-evaluate the fitness of a solution candidate several times and to then work with the average of the sampled fitness values. In this work, we propose to use the median instead of the mean. Besides being invariant to rescalings of the fitness, the median in many situations turns out to be much more robust than the mean. We show that when the noisy fitness is ϵ-concentrated, then a logarithmic number of samples suffice to discover the undisturbed fitness (via the median of the samples) with high probability. This gives a simple metaheuristic approach to transform a randomized optimization heuristics into one that is robust to this type of noise and that has a runtime higher than the original one only by a logarithmic factor. We show further that ϵ-concentrated noise occurs frequently in standard situations. We also provide lower bounds showing that in two such situations, even with larger numbers of samples, the average-resample strategy cannot efficiently optimize the problem in polynomial time.

### A generic approach to districting with diameter or center-based objectives

Districting is the problem of grouping basic geographic units into clusters called districts. Districts are typically required to be geometrically compact, contiguous, and evenly balanced with respect to attributes of the basic units. Though most applications share these core criteria, domain-specific constraints and objective functions are common. This leads to a fragmented literature with different approaches for similar domains and hinders comparisons between these approaches. In this paper we study a unified heuristic approach that can handle three of the most common objective functions concerning compactness: diameter, p-center and p-median, as well as a variable number of balancing attributes. We propose a multistart method which iteratively constructs greedy randomized solutions followed by an improvement phase which alternates between optimizing compactness and satisfying balancing constraints through a series of tabu searches. Experiments show that the proposed method is competitive when compared to approaches in the literature which are specific to each objective, improving known upper bounds in some cases.

### A memetic algorithm approach to network alignment: mapping the classification of mental disorders of DSM-IV with ICD-10

Given two graphs modelling related, but possibly distinct, networks, the alignment of the networks can help identify significant structures and substructures which may relate to the functional purpose of the network components. The Network Alignment Problem is the NP-hard computational formalisation of this goal and is a useful technique in a variety of data mining and knowledge discovery domains. In this paper we develop a memetic algorithm to solve the Network Alignment Problem and demonstrate the effectiveness of the approach on a series of biological networks against the existing state of the art alignment tools. We also demonstrate the use of network alignment as a clustering and classification tool on two mental health disorder diagnostic databases.

### Characterising the rankings produced by combinatorial optimisation problems and finding their intersections

The aim of this paper is to introduce the concept of intersection between combinatorial optimisation problems. We take into account that most algorithms, in their machinery, do not consider the exact objective function values of the solutions, but only a comparison between them. In this sense, if the solutions of an instance of a combinatorial optimisation problem are sorted into their objective function values, we can see the instances as (partial) rankings of the solutions of the search space. Working with specific problems, particularly, the linear ordering problem and the symmetric and asymmetric traveling salesman problem, we show that they can not generate the whole set of (partial) rankings of the solutions of the search space, but just a subset. First, we characterise the set of (partial) rankings each problem can generate. Secondly, we study the intersections between these problems: those rankings which can be generated by both the linear ordering problem and the symmetric/asymmetric traveling salesman problem, respectively. The fact of finding large intersections between problems can be useful in order to transfer heuristics from one problem to another, or to define heuristics that can be useful for more than one problem.

### Predict or screen your expensive assay: DoE vs. surrogates in experimental combinatorial optimization

Statistics-based Design of Experiments (DoE) methodologies are considered the gold standard in existing laboratory equipment for *screening* predefined experimental assays. Thus far, very little is formally known about their effectiveness in light of global optimization, particularly when compared to Evolutionary Algorithms (EAs). The current study, which was ignited by *evolution-in-the-loop* of functional protein expression, aims to conduct such a comparison with a focus on Combinatorial Optimization considering a dozen of decision variables, a search-space cardinality of several million combinations, under a budget of a couple of thousands of function evaluations. Due to the limited budget of evaluations, we argue that surrogate-assisted search methods become relevant for this application domain. To this end, we study a specific so-called Categorical Evolution Strategy (CatES). We systematically compare its performance, with and without being assisted by state-of-the-art surrogates, to DoE-based initialization of the search (exploiting the budget partially or entirely). Our empirical findings on noise-free benchmarks show that the surrogate-based approach is superior, as it significantly outperforms the DoE techniques and the CatES alone on the majority of the problems subject to the budget constraint. We conclude by projecting the strengths and weaknesses of EAs *versus* DoE, when run either directly or surrogate-aided.

### A characterisation of S-box fitness landscapes in cryptography

Substitution Boxes (S-boxes) are nonlinear objects often used in the design of cryptographic algorithms. The design of high quality S-boxes is an interesting problem that attracts a lot of attention. Many attempts have been made in recent years to use heuristics to design S-boxes, but the results were often far from the previously known best obtained ones. Unfortunately, most of the effort went into exploring different algorithms and fitness functions while little attention has been given to the understanding why this problem is so difficult for heuristics. In this paper, we conduct a fitness landscape analysis to better understand why this problem can be difficult. Among other, we find that almost each initial starting point has its own local optimum, even though the networks are highly interconnected.

### An improved merge search algorithm for the constrained pit problem in open-pit mining

Conventional mixed-integer programming (MIP) solvers can struggle with many large-scale combinatorial problems, as they contain too many variables and constraints. Meta-heuristics can be applied to reduce the size of these problems by removing or aggregating variables or constraints. Merge search algorithms achieve this by generating populations of solutions, either by heuristic construction [4], or by finding neighbours to an initial solution [12]. This paper presents a merge search algorithm that improves the population generation heuristic in [12] and utilises a variable grouping heuristic that exploits the common information across a population to aggregate groups of variables in order to create a reduced subproblem. The algorithm is tested on some well known benchmarks for a complex problem called the constrained pit (CPIT) problem and it is compared to results produced by a merge search algorithm previously used on the same problem and the results published on the *minelib* [9] website.

### Walsh functions as surrogate model for pseudo-boolean optimization problems

Surrogate-modeling is about formulating quick-to-evaluate mathematical models, to approximate black-box and time-consuming computations or simulation tasks. Although such models are well-established to solve continuous optimization problems, very few investigations regard the optimization of combinatorial structures. These structures deal for instance with binary variables, allowing each compound in the representation of a solution to be activated or not. Still, this field of research is experiencing a sudden renewed interest, bringing to the community fresh algorithmic ideas for growing these particular surrogate models. This article proposes the first surrogate-assisted optimization algorithm (WSaO) based on the mathematical foundations of discrete Walsh functions, combined with the powerful grey-box optimization techniques in order to solve pseudo-boolean optimization problems. We conduct our experiments on a benchmark of combinatorial structures and demonstrate the accuracy, and the optimization efficiency of the proposed model. We finally highlight how Walsh surrogates may outperform the state-of-the-art surrogate models for pseudo-boolean functions.

### Fitness comparison by statistical testing in construction of SAT-based guess-and-determine cryptographic attacks

Algebraic cryptanalysis studies breaking ciphers by solving algebraic equations. Some of the promising approaches use SAT solvers for this purpose. Although the corresponding satisfiability problems are hard, their difficulty can often be lowered by choosing a set of variables to brute force over, and by solving each of the corresponding reduced problems using a SAT solver, which is called the guess-and-determine attack. In many successful cipher breaking attempts this set was chosen analytically, however, the nature of the problem makes evolutionary computation a good choice.

We investigate one particular method for constructing guess-and-determine attacks based on evolutionary algorithms. This method estimates the fitness of a particular guessed bit set by Monte-Carlo simulations. We show that using statistical tests within the comparator of fitness values, which can be used to reduce the necessary number of samples, together with a dynamic strategy for the upper limit on the number of samples, speeds up the attack by a factor of 1.5 to 4.3 even on a distributed cluster.

### Application of CMSA to the minimum capacitated dominating set problem

This work deals with the so-called minimum capacitated dominating set (CAPMDS) problem, which is an NP-Hard combinatorial optimization problem in graphs. In this paper we describe the application of a recently introduced hybrid algorithm known as Construct, Merge, Solve & Adapt (CMSA) to this problem. Moreover, we evaluate the performance of a standalone ILP solver. The results show that both CMSA and the ILP solver outperform current state-of-the-art algorithms from the literature. Moreover, in contrast to the ILP solver, the performance of CMSA does not degrade for the largest problem instances. The experimental evaluation is based on a benchmark dataset containing two different graph topologies and considering graphs with variable and uniform node capacities.

### Investigation of the traveling thief problem

The Traveling Thief Problem (TTP) is a relatively new benchmark problem created to study problems which consist of interdependent subproblems. In this paper we investigate what the fitness landscape characteristics are of some smaller instances of the TTP with commonly used local search operators in the context of metaheuristics that uses local search. The local search operators include: 2-opt, Insertion, Bitflip and Exchange and metaheuristics include: multi-start local search, iterated local search and genetic local search. Fitness landscape analysis shows among other things that TTP instances contain a lot of local optima but their distance to the global optimum is correlated with its fitness. Local optima networks with respect to an iterated local search reveals that TTP has a multi-funnel structure. Other experiments show that a steady state genetic algorithm with edge assembly crossover outperforms multi-start local search, iterated local search and genetic algorithms with different tour crossovers. At last we performed a comparative study using the genetic algorithm with edge assembly crossover on relatively larger instances of the commonly used benchmark suite. As a result we found new best solutions to almost all studied instances.

### Evolutionary algorithms for the chance-constrained knapsack problem

Evolutionary algorithms have been widely used for a range of stochastic optimization problems. In most studies, the goal is to optimize the expected quality of the solution. Motivated by real-world problems where constraint violations have extremely disruptive effects, we consider a variant of the knapsack problem where the profit is maximized under the constraint that the knapsack capacity bound is violated with a small probability of at most *α.* This problem is known as chance-constrained knapsack problem and chance-constrained optimization problems have so far gained little attention in the evolutionary computation literature. We show how to use popular deviation inequalities such as Chebyshev's inequality and Chernoff bounds as part of the solution evaluation when tackling these problems by evolutionary algorithms and compare the effectiveness of our algorithms on a wide range of chance-constrained knapsack instances.

### A two-stage genetic programming hyper-heuristic approach with feature selection for dynamic flexible job shop scheduling

Dynamic flexible job shop scheduling (DFJSS) is an important and a challenging combinatorial optimisation problem. Genetic programming hyper-heuristic (GPHH) has been widely used for automatically evolving the routing and sequencing rules for DFJSS. The terminal set is the key to the success of GPHH. There are a wide range of features in DFJSS that reflect different characteristics of the job shop state. However, the importance of a feature can vary from one scenario to another, and some features may be redundant or irrelevant under the considered scenario. Feature selection is a promising strategy to remove the unimportant features and reduce the search space of GPHH. However, no work has considered feature selection in GPHH for DFJSS so far. In addition, it is necessary to do feature selection for the two terminal sets simultaneously. In this paper, we propose a new two-stage GPHH approach with feature selection for evolving routing and sequencing rules for DFJSS. The experimental studies show that the best solutions achieved by the proposed approach are better than that of the baseline method in most scenarios. Furthermore, the rules evolved by the proposed approach involve a smaller number of unique features, which are easier to interpret.