# GECCO '21: Proceedings of the Genetic and Evolutionary Computation Conference

Full Citation in the ACM Digital Library## SESSION: Evolutionary combinatorial optimization and metaheuristics

### Evolutionary diversity optimization and the minimum spanning tree problem

In the area of evolutionary computation the calculation of diverse sets of high-quality solutions to a given optimization problem has gained momentum in recent years under the term evolutionary diversity optimization. Theoretical insights into the working principles of baseline evolutionary algorithms for diversity optimization are still rare. In this paper we study the well-known Minimum Spanning Tree problem (MST) in the context of diversity optimization where population diversity is measured by the sum of pairwise edge overlaps. Theoretical results provide insights into the fitness landscape of the MST diversity optimization problem pointing out that even for a population of μ = 2 fitness plateaus (of constant length) can be reached, but nevertheless diverse sets can be calculated in polynomial time. We supplement our theoretical results with a series of experiments for the unconstrained and constraint case where all solutions need to fulfill a minimal quality threshold. Our results show that a simple (μ + 1)-EA can effectively compute a diversified population of spanning trees of high quality.

### Real-like MAX-SAT instances and the landscape structure across the phase transition

In contrast with random uniform instances, industrial SAT instances of large size are solvable today by state-of-the-art algorithms. It is believed that this is the consequence of the non-random structure of the distribution of variables into clauses. In order to produce benchmark instances resembling those of real-world formulas with a given structure, generative models have been proposed. In this paper we study the MAX-3SAT problem with model-generated instances having a power-law distribution. Specifically, we target the regions in which computational difficulty undergoes an easy/hard phase transition as a function of clause density and of the power-law exponent. Our approach makes use of a sampling technique to build a graph model (a local optima network) in which nodes are local optima and directed edges are transitions between optima basins. The objective is to relate the structure of the instance fitness landscape with problem difficulty through the transition. We succeed in associating the transition with straightforward network metrics, thus providing a novel and original fitness landscape view of the computational features of the power-law model and its phase transition.

### A graph coloring based parallel hill climber for large-scale NK-landscapes

Efficient hill climbers are at the heart of the latest gray-box optimization techniques,
where some structural information about the optimization problem is available. Focusing
on NK-landscapes as a challenging class of *k*-bounded pseudo-boolean functions, we propose a quality-enhanced and a time-accelerated
hill climber. Our investigations are based on the idea of performing several simultaneous
moves at each iteration of the search process. This is enabled using graph coloring
to structure the interacting variables of an NK-landscape, and to identify subsets
of possibly improving independent moves in the Hamming distance 1 neighborhood. Besides
being extremely competitive with respect to the state-of-the-art first- and best-
ascent serial variants, our initial design exposes a natural degree of parallelism
allowing us to convert our serial algorithm into a parallel hill climber without further
design efforts. As such, we also provide a multi-threaded implementation using up
to 10 shared-memory CPU-cores. Using a range of large-scale random NK-landscapes with
up to 10^{6} variables, we provide evidence on the efficiency and effectiveness of the proposed
hill climber and we highlight the strength of our parallel design in attaining substantial
acceleration factors for the largest experimented functions.

### Unbalanced mallows models for optimizing expensive black-box permutation problems

Expensive black-box combinatorial optimization problems arise in practice when the objective function is evaluated by means of a simulator or a real-world experiment. Since each fitness evaluation is expensive in terms of time or resources, the number of possible evaluations is typically several orders of magnitude smaller than in non-expensive problems. Classical optimization methods are not useful in this scenario. In this paper, we propose and analyze UMM, an estimation-of-distribution (EDA) algorithm based on a Mallows probabilistic model and unbalanced rank aggregation (uBorda). Experimental results on black-box versions of LOP and PFSP show that UMM outperforms the solutions obtained by CEGO, a Bayesian optimization algorithm for combinatorial optimization. Nevertheless, a slight modification to CEGO, based on the different interpretations for rankings and orderings, significantly improves its performance, thus producing solutions that are slightly better than those of UMM and dramatically better than the original version. Another benefit of UMM is that its computational complexity increases linearly with both the number of function evaluations and the permutation size, which results in computation times an order of magnitude shorter than CEGO, making it specially useful when both computation time and number of evaluations are limited.

### On the design and anytime performance of indicator-based branch and bound for multi-objective combinatorial optimization

In this article, we propose an indicator-based branch and bound (I-BB) approach for multi-objective combinatorial optimization that uses a best-first search strategy. In particular, assuming maximizing objectives, the next node to be processed is chosen with respect to the quality of its upper bound. This quality is given by a binary quality indicator, such as the binary hypervolume or the ε-indicator, with respect to the archive of solutions maintained by the branch and bound algorithm. Although the I-BB will eventually identify the efficient set, we are particularly interested in analyzing its anytime behavior as a heuristic. Our experimental results, conducted on a multi-objective knapsack problem with 2, 3, 5, and 7 objectives, indicate that the I-BB can often outperform the naive depth-first and breadth-first search strategies, both in terms of runtime and anytime performance. The improvement is especially significant when the branching order for the decision variables is random, which suggests that the I-BB is particularly relevant when more favorable (problem-dependent) branching orders are not available.

### Generating hard inventory routing problem instances using evolutionary algorithms

This paper studies a method of generating hard instances of the Inventory Routing Problem (IRP) using evolutionary algorithms. The difficulty of the generated instances is measured by the time used by the CPLEX solver to solve a given instance. In order to ensure that feasible problem instances are generated, the Infeasibility Driven Evolutionary Algorithm (IDEA) is used along with seeding of the population with perturbed copies of an instance taken from a well-known set of benchmark instances. In the experiments a significant increase in problem solving time was observed with respect to IRP instances proposed in the literature. Generated IRP instances do not show any unusual properties that would make them unrealistic and they are made available as benchmarks for optimization algorithms. In the paper an attempt was made to compare generated problem instances and to draw conclusions regarding the structure of these instances, however, this has proven to be a difficult task.

### An efficient implementation of iterative partial transcription for the traveling salesman problem

Iterative Partial Transcription (IPT) is an important recombination operator for Traveling
Salesman Problem (TSP), and is a key component in the LKH inexact solver for the TSP.
IPT shares many characteristics with Partition Crossover for the TSP. However, the
standard implementation of IPT searches for all common subchains of sizes from 4 to
*n*/2 between two parent tours and thus has a time complexity of *O*(*n*^{2}) in the worst case. In this paper, an efficient implementation of IPT is proposed
that uses a special data structure called the Extended Edge Table (EET) to find the
recombination components. We prove that the proposed technique is *approximately equivalent* to IPT and has a time complexity of *O*(*n*). The performance of the two implementations of IPT is compared based on some random
and benchmark TSP instances to establish the efficiency of the proposed algorithm.

### Diversifying greedy sampling and evolutionary diversity optimisation for constrained monotone submodular functions

Submodular functions allow to model many real-world optimisation problems. This paper introduces approaches for computing diverse sets of high quality solutions for submodular optimisation problems with uniform and knapsack constraints. We first present diversifying greedy sampling approaches and analyse them with respect to the diversity measured by entropy and the approximation quality of the obtained solutions. Afterwards, we introduce an evolutionary diversity optimisation (EDO) approach to further improve diversity of the set of solutions. We carry out experimental investigations on popular submodular benchmark problems and analyse trade-offs in terms of solution quality and diversity of the resulting solution sets.

### The tiebreaking space of constructive heuristics for the permutation flowshop minimizing makespan

There has been intensive research on tiebreakers for insertion-based constructive heuristics in the permutation flow shop scheduling problem when the objective is to minimize the makespan. In this paper we analyze the space of all possible tie-breaking rules for the most common insertion orders, and evaluate the efficacy of existing tiebreakers based on this analysis. We find that an optimal tie breaker would produce results that are about 1% better than current best tie breakers. We propose a constructive heuristic based on a truncated cyclic best-first search in the space of all tie breakers and show that it closely approximates the solutions that such an optimal tie breaker can achieve.

### Local search pivoting rules and the landscape global structure

In local search algorithms, the pivoting rule determines which neighboring solution to select and thus strongly influences the behavior of the algorithm and its capacity to sample good-quality local optima. The classical pivoting rules are first and best improvement, with alternative rules such as worst improvement and maximum expansion recently studied on hill-climbing algorithms. This article conducts a thorough empirical comparison of five pivoting rules (best, first, worst, approximated worst and maximum expansion) on two benchmark combinatorial problems, NK landscapes and the unconstrained binary quadratic problem (UBQP), with varied sizes and ruggedness. We present both a performance analysis of the alternative pivoting rules within an iterated local search (ILS) framework and a fitness landscape analysis and visualization using local optima networks. Our results reveal that the performance of the pivoting rules within an ILS framework may differ from their performance as single climbers and that worst improvement and maximum expansion can outperform classical pivoting rules.

### Two-stage multi-objective genetic programming with archive for uncertain capacitated arc routing problem

Genetic Programming Hyper-Heuristic (GPHH) is a promising technique to automatically evolve effective routing policies to handle the uncertain environment in the Uncertain Capacitated Arc Routing Problem (UCARP). Previous studies mainly focus on the effectiveness of the evolved routing policies, but the size is ignored. This paper aims to develop new GPHH methods to optimise the effectiveness and the size simultaneously. There are two challenges. First, it is much easier for GP to generate small but ineffective individuals than effective ones, thus the search can be easily stuck with small but ineffective individuals. Second, the effectiveness evaluation in GPHH is stochastic, making it challenging to identify and retain effective individuals. To address these issues, we develop a Two-Stage Multi-Objective GP algorithm with Archive (TSNSGPII-a). The two-stage framework addresses the bias towards the size. The external archive stores potentially effective individuals that may be lost during the evolution, and reuses them to generate offspring. The experimental results show that TSNSGPII-a can obtain significantly better routing policies than the existing state-of-the-art approaches in terms of both effectiveness and size. If selecting the most effective routing policy from the Pareto front, TSNSGPII-a can obtain significantly smaller routing policies with statistically comparable or significantly better effectiveness.

### Genetic algorithm niching by (Quasi-)infinite memory

Genetic Algorithms (GA) explore the search space of an objective function via mutation and recombination. Crucial for the search dynamics is the maintenance of diversity in the population.

Inspired by tabu search we add a mechanism to an elitist GA's selection step to ensure diversification by excluding already visited areas of the search space. To this end, we use Bloom filters as a probabilistic data structure to store a (quasi-)infinite history.

We discuss how this approach fits into the niching idea for GAs, finds an analogy in generational correlation via epi-genetics in biology, and how the approach can be regarded as a co-evolutionary edge case of previous niching techniques.

Furthermore, we apply this new technique to an NP-hard combinatorial optimization problem (ground states of Ising spin glasses). We find by large-scale hyperparameter scans that our elitist GA with quasi-infinite memory consistently outperforms its respective standard GA variant without a Bloom filter.