# GECCO '18- Proceedings of the Genetic and Evolutionary Computation Conference Companion

Full Citation in the ACM Digital Library## WORKSHOP SESSION: Workshop papers: workshop genetic improvement 2018

### Dynamic fitness functions for genetic improvement in compilers and interpreters

When attempting to improve the non-functional requirements of software, specifically run-time performance of code, an important requirement is to preserve the correctness of the optimized code. Additionally when attempting to integrate Genetic Improvement into a compiler or interpreter, the large search spaces resulting from the amount of operators and operands a language provides needs to be dealt with. This publication explores dynamic fitness functions as a foundation for a use in Genetic Improvement to optimize programs. An approach of using a test suite to verify code correctness in the Truffle Framework [19, 20] and Graal Compiler [11] is presented. Two types of fitness functions are explored, which split the test suite according to their complexity and attempt to generate correct solutions with a growing set of increasingly complex tests. One of them increases the amount of tests sequentially over several iterations. The parallel fitness function attempts to split a test suite and to re-combine the results with increasingly large suites. The results show that these functions only marginally improve the fitness landscape on their own, but show that more partially correct solutions can be found with dynamic fitness functions. In the future, our approach may be improved by implementing specific crossover and mutator operations to accompany the dynamic fitness functions.

### Novelty search for software improvement of a SLAM system

Genetic Improvement (GI) performs a search at the level of source code to find the best variant of a baseline system that improves non-functional properties while maintaining functionality with noticeable results in several domains. There a many aspects of this general approach that are currently being explored. In particular, this work deals to the way in which the search is guided to efficiently explore the search space of possible software versions in which GI operates. The proposal is to integrate Novelty Search (NS) within the GISMOE GI framework to improve KinectFusion, which is a vision-based Simultaneous Localization and Mapping (SLAM) system that is used for augmented reality, autonomous vehicle navigation, and many other real-world applications. This is one of a small set of works that have successfully combined NS with a GP system, and the first time that it has been used for software improvement. To achieve this, we propose a new behaviour descriptor for SLAM algorithms, based on state-of-the-art benchmarking and present results that show that NS can produce significant improvement gains in a GI setting, when considering execution time and trajectory estimation as the main performance criteria.

### Assessing single-objective performance convergence and time complexity for refactoring detection

The automatic detection of refactoring recommendations has been tackled in prior optimization studies involving bad code smells, semantic coherence and importance of classes; however, such studies informally addressed formalisms to standardize and replicate refactoring models. We propose to assess the refactoring detection by means of performance convergence and time complexity. Since the reported approaches are difficult to reproduce, we employ an Artificial Refactoring Generation (ARGen) as a formal and naive computational solution for the Refactoring Detection Problem. ARGen is able to detect massive refactoring's sets in feasible areas of the search space. We used a refactoring formalization to adapt search techniques (Hill Climbing, Simulated Annealing and Hybrid Adaptive Evolutionary Algorithm) that assess the *performance* and *complexity* on three open software systems. Combinatorial techniques are limited in solving the Refactoring Detection Problem due to the relevance of developers' criteria (human factor) when designing reconstructions. Without performance convergence and time complexity analysis, a software empirical analysis that utilizes search techniques is incomplete.

### Towards modular large-scale darwinian software improvement

This paper proposes to explore a software engineer-assisted method for evolutionarily improving large-scale software systems. A framework is outlined for selecting and evolving specific components of such systems, while avoiding treating the complete software as a single independent individual in the population, thereby forgoing the high costs of that approach.

### Synthesizing customized network protocols using genetic programming

Given the advances in areas such as home automation, agricultural networks, smart cities, designers often need to design protocols that utilize the features of that network while dealing with its limitations. Utilizing standardized protocols for such networks may not be appropriate as they may not address limitations of the network such as heterogeneous nodes or limited capability of some nodes. While designing a customized protocol for that network would be desirable, it is extremely time-consuming unless we can automate the development of the required protocol. In this paper, we present NetSynth, a GP based mechanism to develop customized routing protocol for the given network. NetSynth lets us conveniently describe a network using an XML file, and it synthesizes a routing protocol that suits the input network by considering the characteristics specific to the given network. We show how NetSynth creates protocols that perform comparably to best-known protocols for the case where we want to broadcast a set of messages to all nodes in a grid. We also show how NetSynth helps us design protocols that provide a tradeoff between throughput and energy.

### Genetic configuration sampling: learning a sampling strategy for fault detection of configurable systems

A highly-configurable system provides many configuration options to diversify application scenarios. The combination of these configuration options results in a large search space of configurations. This makes the detection of configuration-related faults extremely hard. Since it is infeasible to exhaust every configuration, several methods are proposed to sample a subset of all configurations to detect hidden faults. Configuration sampling can be viewed as a process of repeating a pre-defined sampling action to the whole search space, such as the *one-enabled* or *pair-wise* strategy.

In this paper, we propose genetic configuration sampling, a new method of learning a sampling strategy for configuration-related faults. Genetic configuration sampling encodes a sequence of sampling actions as a chromosome in the genetic algorithm. Given a set of known configuration-related faults, genetic configuration sampling evolves the sequence of sampling actions and applies the learnt sequence to new configuration data. A pilot study on three highly-configurable systems shows that genetic configuration sampling performs well among nine sampling strategies in comparison.

## WORKSHOP SESSION: Workshop papers: workshop genetic and evolutionary computation in defense, security and risk management

### A genetic algorithm for dynamic controller placement in software defined networking

The Software Defined Networking paradigm has enabled dynamic configuration and control of large networks. Although the division of the control and data planes on networks has lead to dynamic reconfigurability of large networks, finding the minimal and optimal set of controllers that can adapt to the changes in the network has proven to be a challenging problem. Recent research tends to favor small solution sets with a focus on either propagation latency or controller load distribution, and struggles to find large balanced solution sets. In this paper, we propose a multi-objective genetic algorithm based approach to the controller placement problem that minimizes inter-controller latency, load distribution and the number of controllers with fitness sharing. We demonstrate that the proposed approach provides diverse and adaptive solutions to real network architectures such as the United States backbone and Japanese backbone networks. We further discuss the relevance and application of a diversity focused genetic algorithm for a moving target defense security model.

### Evolution of network enumeration strategies in emulated computer networks

Successful attacks on computer networks today do not often owe their victory to directly overcoming strong security measures set up by the defender. Rather, most attacks succeed because the number of possible vulnerabilities are too large for humans to fully protect without making a mistake. Regardless of the security elsewhere, a skilled attacker can exploit a single vulnerability in a defensive system and negate the benefits of those security measures. This paper presents an evolutionary framework for evolving attacker agents in a real, emulated network environment using genetic programming, as a foundation for coevolutionary systems which can automatically discover and mitigate network security flaws. We examine network enumeration, an initial network reconnaissance step, through our framework and present results demonstrating its success, indicating a broader applicability to further cyber-security tasks.

### Adversarial co-evolution of attack and defense in a segmented computer network environment

In computer security, guidance is slim on how to prioritize or configure the many available defensive measures, when guidance is available at all. We show how a competitive co-evolutionary algorithm framework can identify defensive configurations that are effective against a range of attackers. We consider *network segmentation*, a widely recommended defensive strategy, deployed against the threat of serial network security attacks that delay the mission of the network's operator. We employ a simulation model to investigate the effectiveness over time of different defensive strategies against different attack strategies. For a set of four network topologies, we generate strong availability attack patterns that were not identified *a priori.* Then, by combining the simulation with a co-evolutionary algorithm to explore the adversaries' action spaces, we identify effective configurations that minimize mission delay when facing the attacks. The novel application of co-evolutionary computation to enterprise network security represents a step toward course-of-action determination that is robust to responses by intelligent adversaries.^{1}

### Real-time strategy game micro for tactical training simulations

Complex, realistic scenarios in training simulations can benefit from good control of large numbers of simulation entities. However, training simulations typically focus on simulation physics and graphics over the intelligence required to control large numbers of entities. Real-Time Strategy games, on the other hand, have evolved to make tradeoffs between the AI needed and human interaction required to control hundreds of entities in complex tactical skirmishes. Borrowing from work in real-time strategy games, this paper attacks the problem of controlling groups of heterogenous entities in training simulations by using a genetic algorithm to evolve control algorithm parameters that maximize damage done and minimize damage received during skirmishes in a real-time strategy game-like simulation. Results show the emergence of complex, coordinated behavior among groups of simulated entities. Evolved behavior quality seems to be relatively independent of the underlying physics model but depends on the initial dispositions of entities in the simulation. We can get over this dependence and evolve more robust high performance behaviors by evaluating fitness in several different scenarios with different initial dispositions. We believe these preliminary results indicate the viability of our approach for generating robust, high performance behaviors for controlling swarms of entities in training simulations to enable more complex, realistic training scenarios.

### Machine learning: based detection of water contamination in water distribution systems

Accurate detection of natural or intentional contamination events in water distribution pipes is critical to drinking water safety. Efficient early warning systems that can detect contamination events require detection algorithms that can accurately predict the occurrence of such events. This paper presents the development of adaptive neuro-fuzzy inference system (ANFIS) models for detecting the safety condition of water in pipe networks when concentrations of water quality variables in the pipes exceed their maximum thresholds. The event detection is based on time series data composed of pH, turbidity, color and bacteria count measured at the effluent of a drinking water utility and nine different locations of sensors in the distribution network in the city of Ålesund, Norway. The proposed ANFIS models correctly detected between 92% and 96% of the safety condition of the water in the pipe network, with approximately 1% false alarm rate during the testing stage. The models also achieved high rates of specificity and precision, with very high correlations between the predictions and actual conditions of the water in the pipes. The accuracy of the models achieved in this study suggests that the models can be useful in real time contamination event detection in the pipe networks.

### Using evolutionary dynamic optimization for monitor selection in highly dynamic communication infrastructures

In this paper, we address the problem of applying evolutionary dynamic optimization of network monitoring to highly dynamic communication network infrastructures.

One major challenge of modern communication networks is the increasing volatility due to, e.g., changing availability of nodes and links, load of paths, or attacks. While optimization of those dynamic networks has been an important application area since decades, new developments in the area of network function virtualization and software defined network facilitate a completely new level of automated dynamic network optimization. Especially in mobile networks, changes can be observed to appear swiftly. Thus, using population-based heuristics becomes challenging as reevaluation of all candidate solutions may become time-wise impossible and operations need to rely on possibly obsolete fitness values.

Here, an established method has been applied to solve the dynamic monitor selection problem on multiple real-world problem instances using a different simulated level of change. Statistically significant results of the proposed method have been compared to the performance of a *best-of-multiple selection local search (EMS LS)* heuristic. As the results show, optimization reaches results of high quality even under difficult circumstances.

### Automated design of network security metrics

Many abstract security measurements are based on characteristics of a graph that represents the network. These are typically simple and quick to compute but are often of little practical use in making real-world predictions. Practical network security is often measured using simulation or real-world exercises. These approaches better represent realistic outcomes but can be costly and time-consuming. This work aims to combine the strengths of these two approaches, developing efficient heuristics that accurately predict attack success. Hyper-heuristic machine learning techniques, trained on network attack simulation training data, are used to produce novel graph-based security metrics. These low-cost metrics serve as an approximation for simulation when measuring network security in real time. The approach is tested and verified using a simulation based on activity from an actual large enterprise network. The results demonstrate the potential of using hyper-heuristic techniques to rapidly evolve and react to emerging cybersecurity threats.

### Genetic algorithms for role mining in critical infrastructure data spaces

In the paper, a Role Mining problem, which is the cornerstone for creating Role-Based Access Control (RBAC) systems, is transferred to the domain of data spaces. RBAC is the most widespread model of access control in different multi-user information systems, including critical infrastructures. The data spaces is the perspective concept of creating information storage systems, which transforms the concept of databases, integrating in one system the information resources from other systems, and allows us to control their security on a centralized basis. The paper considers a mathematical statement of the RBAC design problem for data spaces and offers the approaches to its solving based on genetic algorithms. The proposed approaches consider requirements of compliance with role-based security policies in case of combining all users' sets and all permissions' sets in the data space. The paper considers main decisions on creation and enhancement of genetic algorithms which implementation increases their operational speed. The experimental assessment of the offered genetic algorithms shows their high performance.

## WORKSHOP SESSION: Workshop papers: workshop real-world applications of continuous and mixed-integer optimization

### Well placement optimization for carbon dioxide capture and storage via CMA-ES with mixed integer support

Carbon dioxide Capture and Storage (CCS) is a viable technique for reducing CO_{2} emitted to the atmosphere. A simulation based optimization of well placement is a promising solution to geologic CO_{2} storage (GCS), which is a part of CCS. Covariance matrix adaptation evolution strategy (CMA-ES) is considered to apply for well placement problem because it is a state-of-the-art black-box continuous optimization algorithm. However, insufficient search by the algorithm is anticipated since well placement problem often forms a mixed integer programming problem. In this paper, we investigate the use of variants of CMA-ES to the optimization of well placement and injection schedule as a mixed integer programming problem. First we investigate the effect of each algorithmic component to treat integer variables on mixed integer programming test functions. Then, some promising variants are applied to the well placement and injection scheduling problem for a CCS project. We observed that the CMA-ES with step-size lower bound behaved robust and found better solutions than the variants without the bound, independently of initial search points. We bring up some issues of current optimization framework including the mixed integer support in the CMA-ES and the formulation of the GCS optimization problem.

### On vehicle surrogate learning with genetic programming ensembles

Learning surrogates for product design and optimization is potential to capitalize on competitive market segments. In this paper we propose an approach to learn surrogates of product performance from historical clusters by using ensembles of Genetic Programming. By using computational experiments involving more than 500 surrogate learning instances and 27,858 observations of vehicle models collected over the last thirty years shows (1) the feasibility to learn function surrogates as symbolic ensembles at different levels of granularity of the hierarchical vehicle clustering, (2) the direct relationship of the predictive ability of the learned surrogates in both seen (training) and unseen (testing) scenarios as a function of the number of cluster instances, and (3) the attractive predictive ability of relatively smaller ensemble of trees in unseen scenarios. We believe our approach introduces the building blocks to further advance on studies regarding data-driven product design and market segmentation.

## WORKSHOP SESSION: Workshop papers: workshop decomposition techniques in evolutionary optimization

### A historical interdependency based differential grouping algorithm for large scale global optimization

Cooperative co-evolution (CC) is a powerful evolutionary computation framework for solving large scale global optimization (LSGO) problems via the strategy of "divide-and-conquer", but its efficiency highly relies on the decomposition result. Existing decomposition algorithms either cannot obtain correct decomposition results or require a large number of fitness evaluations (FEs). To alleviate these limitations, this paper proposes a new decomposition algorithm named historical interdependency based differential grouping (HIDG). HIDG detects interdependency from the perspective of vectors. By utilizing historical interdependency information, it develops a novel criterion which can directly deduce the interdependencies among some vectors without consuming extra FEs. Coupled with an existing vector-based decomposition framework, HIDG further significantly reduces the total number of FEs for decomposition. Experiments on two sets of LSGO benchmark functions verified the effectiveness and efficiency of HIDG.

### A cooperative co-evolutionary algorithm for large-scale multi-objective optimization problems

A wide range of real-world problems are multi-objective optimization problems (MOPs). Multi-objective evolutionary algorithms (MOEAs) have been proposed to solve MOPs, but the search process deteriorates with the increase of MOPs' dimension of decision variables. In order to solve the problem, firstly, the decision variables are divided into different groups by adopting a fast interdependency identification algorithm; secondly, a novel cooperative co-evolutionary algorithm is used to solve MOPs. Experiment results on large-scale problems show that the proposed algorithm is effective.

### Selfish vs. global behavior promotion in car controller evolution

We consider collective tasks to be solved by simple agents synthesized automatically by means of neuroevolution. We investigate whether driving neuroevolution by promoting a form of selfish behavior, i.e., by optimizing a fitness index that synthesizes the behavior of each agent independent of any other agent, may also result in optimizing global, system-wide properties. We focus on a specific and challenging task, i.e., evolutionary synthesis of agent as car controller for a road traffic scenario. Based on an extensive simulation-based analysis, our results indicate that even by optimizing the behavior of each single agent, the resulting system-wide performance is comparable to the performance resulting from optimizing the behavior of the system as a whole. Furthermore, agents evolved with a fitness promoting selfish behavior appear to lead to a system that is globally more robust with respect to the presence of unskilled agents.

### Decomposition-based multiobjective particle swarm optimization for change detection in SAR images

Owing to the immunity to illumination and atmospheric conditions, synthetic aperture radar (SAR) images have been the main source of data for environmental monitoring. However, it is a challenging task for change detection because of the influence of speckle noise. In this paper, we propose an unsupervised multiobjective particle swarm optimization approach based on decomposition for change detection in SAR images. For the change detection task, it can be modeled as a multiobjective optimization problem (MOP), which consists of two contradictory objectives, namely, retaining image change details and removing the speckle noise. We optimize this MOP by using particle swarm optimization, which decomposes it into a set of subproblems by assigning different weights to these two objectives, thus obtaining the optimal trade-off between them. To accurately identify changed regions, the strategy of majority voting is employed to assemble partial good solutions to generate the final change detection result. The impressive experimental results on both real data sets have demonstrated the effectiveness and superiority of the proposed method.

## WORKSHOP SESSION: Workshop papers: workshop black box discrete optimization benchmarking

### Parameterization of state-of-the-art performance indicators: a robustness study based on inexact TSP solvers

Performance comparisons of optimization algorithms are heavily influenced by the underlying indicator(s). In this paper we investigate commonly used performance indicators for single-objective stochastic solvers, such as the *Penalized Average Runtime (e.g., PAR10)* or the *Expected Running Time (ERT)*, based on exemplary benchmark performances of state-of-the-art inexact TSP solvers. Thereby, we introduce a methodology for analyzing the effects of (usually heuristically set) indicator parametrizations - such as the penalty factor and the method used for aggregating across multiple runs - w.r.t. the robustness of the considered optimization algorithms.

### Discrete real-world problems in a black-box optimization benchmark

Combinatorial optimization problems come in a wide variety of types but five common problem components can be identified. This categorization can aid the selection of interesting and diverse set of problems for inclusion in the combinatorial black-box problem benchmark. We suggest two real-world problems for inclusion into the benchmark. One is a transport-lot building problem and the other one is the clustered generalized quadratic assignment problem. We look into designing an interface for discrete black-box problems that can accommodate problems belonging to all of the described categories as well real-world problems that often feature multiple problem components. We describe three different interfaces for black-box problems, the first using a general encoding for all types of problems the second one using specialized encodings per problem type and the last one describes problems in terms of the available operators. We compare the strengths and weaknesses of the three designs.

### Compiling a benchmarking test-suite for combinatorial black-box optimization: a position paper

This contribution focuses on the challenge of formulating a set of benchmark problems and/or a test-suite for Combinatorial Optimization problems when treated as black-box global optimization problems. We discuss the involved dilemmas and possible obstacles of such a compilation. To this end, we formulate a list of design questions that need to be answered as a first step in this compilation process. We articulate our perspective on these questions by proposing a rough classification of relevant problem classes, answering the posed questions, and suggesting a preliminary set of problems. While this position paper addresses the Evolutionary Computation community, it intends to offer an open-minded Computer Science perspective - by considering the broad definition of Combinatorial Optimization and by accounting for equivalent threads within Operations Research and Mathematical Programming communities. At the same time, this work capitalizes on prior art in algorithms' benchmarking, including the authors' own experience with the continuous BBOB benchmark problem set, as well as a number of discrete black-box optimization challenges frequently encountered in practice.

### A generic problem instance generator for discrete optimization problems

Measuring the performance of an optimization algorithm involves benchmark instances of related problems. In the area of discrete optimization, most well-known problems are covered by a large variety of problem instances already. However, while exploring the area of lesser-known optimization problems there is usually not a sufficient amount or variety of such benchmark instances available. The reasons for this lack of data vary from privacy or confidentiality issues to technical difficulties that prevent the collection of such data. This results in the inability to evaluate new optimization algorithms on these problems. Ideally, the necessary data for a variety of problem instances can be created randomly in advance to measure the performance of a set of algorithms. Random problem instance generators exist for specific problems already, however, generic tools that are applicable to multiple optimization problems are rare and usually restricted to a smaller subset of problems. We propose a generic problem instance generator for discrete optimization problems, which is easy to configure, and simple to expand to cover a broad variety of optimization problems. We show the capabilities of our tool by creating exemplary configurations for the TSP, Max-SAT and a real-world load allocation problem to generate random problem instances.

###
Difficult features of combinatorial optimization problems and the tunable *w-model* benchmark problem for simulating them

The first event of the Black-Box Discrete Optimization Benchmarking (BB-DOB) workshop series aims to establish a set of example problems for benchmarking black-box optimization algorithms for discrete or combinatorial domains. In this paper, we *1)* discuss important features that should be embodied by these benchmark functions and *2)* present the *W-Model* problem which exhibits them. The *W-Model* follows a layered approach, where each layer can either be omitted or introduce a different characteristic feature such as neutrality via redundancy, ruggedness and deceptiveness, epistasis, and multi-objectivity, in a tunable way. The model problem is defined over bit string representations, which allows for extracting some of its layers and stacking them on top of existing problems that use this representation, such as OneMax, the Maximum Satisfiability or the Set Covering tasks, and the NK landscape. The ruggedness and deceptiveness layer can be stacked on top of any problem with integer-valued objectives. We put the *W-Model* into the context of related model problems targeting ruggedness, neutrality, and epistasis. We then present the results of a series of experiments to further substantiate the utility of the *W-Model* and to give an idea about suitable configurations of it that could be included in the BB-DOB benchmark suite.

### A black-box discrete optimization benchmarking (BB-DOB) pipeline survey: taxonomy, evaluation, and ranking

This paper provides a taxonomical identification survey of classes in discrete optimization challenges that can be found in the literature including a proposed pipeline for benchmarking, inspired by previous computational optimization competitions. Thereby, a Black-Box Discrete Optimization Benchmarking (BB-DOB) perspective is presented for the [email protected] Workshop. It is motivated why certain classes together with their properties (like deception and separability or toy problem label) should be included in the perspective. Moreover, guidelines on how to select significant instances within these classes, the design of experiments setup, performance measures, and presentation methods and formats are discussed.

## WORKSHOP SESSION: Workshop papers: workshop evolutionary algorithms for problems with uncertainty

### Robust multi-modal optimisation

Robust and multi-modal optimisation are two important topics that have received significant attention from the evolutionary computation community over the past few years. However, the two topics have usually been investigated independently and there is a lack of work that explores the important intersection between them. This is because there are real-world problems where both formulations are appropriate in combination. For instance, multiple 'good' solutions may be sought which are distinct in design space for an engineering problem - where error between the computational model queried during optimisation and the real engineering environment is believed to exist (a common justification for multi-modal optimisation) - but also engineering tolerances may mean a realised design might not exactly match the inputted specification (a robust optimisation problem). This paper conducts a preliminary examination of such intersections and identifies issues that need to be addressed for further advancement in this new area. The paper presents initial benchmark problems and examines the performance of combined state-of-the-art methods from both fields on these problems.

### A framework for high-dimensional robust evolutionary multi-objective optimization

This paper proposes a framework for solving high-dimensional robust multi-objective optimization problems. A decision variable classification-based framework is developed to search for robust Pareto-optimal solutions. The decision variables are classified as highly and weakly robustness-related variables based on their contributions to the robustness of candidate solutions. In the case study, an order scheduling problem in the apparel industry is investigated via the proposed framework. The experimental results reveal that the performance of robust evolutionary optimization can be greatly improved via analyzing the properties of decision variables and then decomposing the high-dimensional robust multi-objective optimization problem.

### Exploration of the effect of uncertainty in homogeneous and heterogeneous multi-agent societies with regard to their average characteristics

In electrical engineering, the deviation from average values of a signal is viewed as noise to the useful measurement. In human societies, however, the diversity of the exhibited characteristics are a sign of individuality and personal worth. We investigate the effect of uncertainty variables in the environment on multi-agent societies (MAS) and the consequences of the deviation, from the average features of the modeled agents. We show the performance of heterogeneous MAS of agents in comparison to morphologically identical homogeneous systems, preserving the same average physical and sensory abilities for the system as a whole, in a dynamic environment. We are employing a form of the predator-prey pursuit problem in attempt to measure the different performance of homogeneous MAS with average parameters and its heterogeneous counterpart. The effects of uncertainty in our work is investigated from the viewpoint of (i) employing a limited number of initial situations to evolve the team of predator agents, (ii) generality to unforeseen initial situations, and (iii) robustness to perception noise. Key statistics are the efficiency of evolution of the successful behavior of predator agents, effectiveness of their behavior and its degradation because of newly introduced situation or noise. Preliminary results indicate that a heterogeneous system can be at least as good as its homogeneous average equivalent, in solution quality at the expense of the runtime of evolution.

## WORKSHOP SESSION: Workshop papers: workshop intelligent operations management in the energy sector

### Multiobjective evolutionary polygonal approximation for identifying crude oil reservoirs

This work formalizes a multi-objective evolutionary approach for the segmentation issue according to Piecewise Linear Representation. It consists in the approximation of a given digital curve by a set of linear models minimizing the representation error and the number of such models. This solution allows the final user to decide from the best array of best found solutions considering the different objectives jointly. The proposed approach eliminates the difficult *a-priori* parameter choices in order to satisfy the user restrictions (the solution choice is performed *a-posteriori*, from the obtained array of solutions) and allows the algorithm to be run a single time (since the whole Pareto front is obtained with a single run and different solutions may be chosen at different times from that Pareto front in order to satisfy different requirements). This solution will be applied to Petroleum Industry in particular to the problem of identifying resources from extraction areas in order to optimize their operational costs and production capacity.

### Towards bundling minimal trees in polygonal maps

Minimal trees in polygonal maps aim at minimizing the connectivity in a network while avoiding obstacle collision. Being closely related to the Steiner Tree Problem, yet with a different scope, minimal trees aim at connecting origin-destination pairs, given in a bipartite network, to allow the joint transport of information, goods, resources and people. In this paper, we propose a method to tackle the bundling problem of minimal trees in modular bipartite networks by using a two-layer optimization based on Differential Evolution with a convex representation of coordinates. Our computational experiments in polygonal domains considering both convex and non-convex geometry show the feasibility and the efficiency of the proposed approach.

### Crude oil refinery scheduling: addressing a real-world multiobjective problem through genetic programming and dominance-based approaches

This study presents the crude oil scheduling problem with four objectives divided in two different levels of importance. It comes from a real refinery where the scheduling starts on the offloading of ships, encompasses terminal and refinery tanks, a crude pipeline, and finishes on the output streams of the crude distillation units. We propose a new approach for the Quantum-Inspired Grammar-based Linear Genetic Programming (QIGLGP) evolutionary algorithm to handle the multiple objectives of the problem using the non-dominance concept. The modifications are concentrated on the population updating and sorting steps of QIGLGP. We tackle difference of importance among the objectives using the principle of violation of constraints. The problem constraints define if an instruction will or not be executed but do not affect the violation equation of the objectives. The individuals which have objective values under a pre-defined upper limit are better ranked. Results from five scenarios showed that the proposed model was able to significantly increase the percentage of runs with acceptable solutions, achieving success ratio of 100% in 3 cases and over 70% in 2 other ones. They also show that the Pareto front of these accepted runs contains a set of non-dominated solutions that could be analyzed by the decision maker for his *a posteriori* decision.

## WORKSHOP SESSION: Workshop papers: workshop evolutionary computation in health care and nursing system

### Framework for planning the training sessions in triathlon

In recent years, planning sport training sessions with computational intelligence have been studied by many authors. Most of the algorithms were used for proposing basic and advanced training plans for athletes. In a nutshell, most of the solutions focused on the individual sports disciplines, such as, for example, cycling and running. To the knowledge of the authors, few efforts were invested into planning sports training sessions in triathlon. Triathlon is considered as a popular multi-disciplinary sport consisting of three different sport disciplines. Therefore, planning the triathlon training sessions is much harder than the planning in individual sport disciplines. In this paper, we propose an initial framework for planning triathlon training sessions using Particle Swarm Optimization. Preliminary results are also shown.

### Development of an evaluation system for upper limb function using AR technology

This paper develops a prototype system for evaluating upper limb function that combines Internet of Things (IoT) and Augmented Reality (AR) technology. IoT builds the network of patients, test environment and doctor's surgery from which the system gathers and exchanges data such as the speeds and locations of hand movement. With the help of AR technology, the real-world test environment is overlaid with the information (e.g. the instructions from doctors, the progress of evaluation) gathered from the IoT. Compared to the conventional system, the detailed information of hand movement supports further assessments and the instructions generated in the AR scene for patients relieve the burden of doctors. The control experiments were conducted to explore the effects of the object size, the existence of obstacles and the hand dominance on the upper limb function using the developed system. The results verified the validity of the developed system.

### CATARO: a robot that tells caregivers a patient's current non-critical condition indirectly

Currently, a system is being developed that gathers information about a dementia patient's condition and sends it to a smartphone. Through this system, caregivers in a nursing facility could always check on their patients' current condition and act accordingly. However, they are generally too busy to constantly check their smartphones for "non-critical" information about patients. This research proposes putting a small robot in front of a patient to speak to the patient in a professional manner in reaction to his/her current condition. Caregivers nearby could hear the robot and then determine the patient's condition. When the caregiver is free, he/she could approach the patient and start talking to him/her in the same manner as the robot just did^{1}.

### Sustainable sensor network architecture for monitoring human activities

In this paper, focusing on a sensor network for monitoring human activities, the suitable architecture is discussed to obtain sustain-ability. After briefly showing a general architecture in present, the problems happening with long time usage are described. Then, our ideas are suggested to solve these problems.

### Can evolutionary computing be applied to dementia care?

In dementia care, caregivers need to respond adequately and in a timely manner to their recipients' needs, which is a challenging problem. As it is difficult to measure the effect of responses with strongly evidenced methods, computational simulation may be a useful means of evaluating the results. This article introduces conceivable cases that may be candidates for simulation.

### Envy based fairness in hedonic games

Hedonic games are coalition formation games where agents have hedonic preferences for coalition structures. The main focus of hedonic games has been on notion of stability. In this paper, however, we consider envy based fairness in hedonic games. We investigate emptiness of envy-free coalition structures and summarize the relationship with core stability.

### Classifier generalization for comprehensive classifiers subsumption in XCS

We proposed XCS-VRc^{3} that can extract useful rules (classifiers) from data and verify its effectiveness. The difficulty of mining real world data is that not only the type of the input state but also the number of instances varies. Although conventional method XCS-VRc is able to extract classifiers, the generalization of classifiers was insufficient and lack of human readability. The proposed XCS-VRc^{3} incorporating "generalization mechanism by comprehensive classifier subsumption" to solves this problem. Specifically, (1) All classifiers of the matching set subsume other classifiers, (2) Abolition of the inappropriate classifier deletion introduced by XCS-VRc (3) Preferentially select classifier with small variance of output in genetic algorithm. To verify the effectiveness of XCS-VRc^{3}, we applied on care plan planning problem in a nursing home (in this case, identifying daytime behavior contributing to increase the ratio of deep sleep time). Comparing the association rules obtained by Apriori, and classifiers obtained by XCS-VRc^{3}, the followings was found. First, abolishing the inappropriate classifier deletion and comprehensively subsuming promotes various degrees of generalization. Second, parent selection mechanism can obtain classifiers with small output variance. Finally, XCS-VRc^{3} is able to extract a small number classifiers equivalent to large number of rules found in Apriori.

## WORKSHOP SESSION: Workshop papers: workshop evolutionary algorithms for big data and massively complex problems

### Multi-objective feature selection for EEG classification with multi-level parallelism on heterogeneous CPU-GPU clusters

The present trend in the development of computer architectures that offer improvements in both performance and energy efficiency has provided clusters with interconnected nodes including multiple multi-core microprocessors and accelerators. In these so-called heterogeneous computers, the applications can take advantage of different parallelism levels according to the characteristics of the architectures in the platform. Thus, the applications should be properly programmed to reach good efficiencies, not only with respect to the achieved speedups but also taking into account the issues related to energy consumption. In this paper we provide a multi-objective evolutionary algorithm for feature selection in electroencephalogram (EEG) classification, which can take advantage of parallelism at multiple levels: among the CPU-GPU nodes interconnected in the cluster (through message-passing), and inside these nodes (through shared-memory thread-level parallelism in the CPU cores, and data-level parallelism and thread-level parallelism in the GPU). The procedure has been experimentally evaluated in performance and energy consumption and shows statistically significant benefits for feature selection: speedups of up to 73 requiring only a 6% of the energy consumed by the sequential code.

### Mapping evolutionary algorithms to a reactive, stateless architecture: using a modern concurrent language

Genetic algorithms (GA) [8] are currently one of the most widely used meta-heuristics to solve engineering problems. Furthermore, parallel genetic algorithms (pGAs) are useful to find solutions of complex optimizations problems in adequate times [16]; in particular, problems with complex fitness. Some authors [1] state that using pGAs improves the quality of solutions in terms of the number of evaluations needed to find one. This reason, together with the improvement in evaluation time brought by the simultaneous running in several nodes, have made parallel and distributed evolutionary algorithms a popular methodology.

## WORKSHOP SESSION: Student workshop papers: workshop students

### A comparison of semantic-based initialization methods for genetic programming

During the initialization step, a genetic programming (GP) system traditionally creates a population of completely random programs to populate the initial population. These programs almost always perform poorly in terms of their total error---some might not even output the correct data type. In this paper, we present new methods for initialization that attempt to generate programs that are somewhat relevant to the problem being solved and/or increase the initial diversity (both error and behavioral diversity) of the population prior to the GP run. By seeding the population---and thereby eliminating worthless programs and increasing the initial diversity of the population---we hope to improve the performance of the GP system. Here, we present two novel techniques for initialization (Lexicase Seeding and Pareto Seeding) and compare them to a previous method (Enforced Diverse Populations) and traditional, non-seeded initialization. Surprisingly, we found that none of the initialization methods result in significant differences in problem-solving performance or population diversity across five program synthesis benchmark problems. We attribute this lack of difference to our use of lexicase selection, which seems to rapidly converge on similar populations regardless of initialization method.

### A multi-objective optimization design framework for ensemble generation

Machine learning algorithms have found to be useful for the solution of complex engineering problems. However, due to problem's characteristics, such as class imbalance, classical methods may not be formidable. The authors believe that the application of multi-objective optimization design can improve the results of machine learning algorithms on such scenarios. Thus, this paper proposes a novel methodology for the creation of ensembles of classifiers. To do so, a multi-objective optimization design approach composed of two steps is used. The first step focus on generating a set of diverse classifiers, while the second step focus on the selection of such classifiers as ensemble members. The proposed method is tested on a real-world competition data set, using both decision trees and logistic regression classifiers. Results show that the ensembles created with such technique outperform the best ensemble members.

### Runtime analysis of a population-based evolutionary algorithm with auxiliary objectives selected by reinforcement learning

We propose the method of selection of auxiliary objectives (2 + 2*λ*)-EA+RL which is the population-based modification of the EA+RL method. We analyse the efficiency of this method on the problem X*0*(*n*^{2}) fitness evaluations in expectation, while the initial EA+RL, which is not population-based, yields at least Ω (*n*^{k−1}) fitness evaluations, where *k* is the plateau size. We also prove that in the case of presence of an obstructive auxiliary objective the expected runtime increases only by a constant factor.

### Towards a more general many-objective evolutionary optimizer using multi-indicator density estimation

Recently, it was shown that Many-Objective Evolutionary Algorithms (MaOEAs) that employ a set of convex weight vectors are overspecialized in solving certain benchmark problems. This over-specialization is due to a high correlation between the Pareto fronts of the test problems and the simplex formed by the weight vectors. In furtherance of avoiding this issue, we propose a novel steady-state MaOEA that does not require weight vectors and adaptively chooses between two density estimators: one based on the IGD^{+} indicador that strengthens convergence to the Pareto front and another one, based on the s-energy indicator, which improves the diversity of the solutions. This approach, called sIGD^{+}-MOEA, is compared with respect to NSGA-III, MOEA/D, IGD^{+}-EMOA and MOMBI2 (which are MaOEAs that employ convex weight vectors) on the test suites WFG and WFG^{-1}, using the hypervolume indicator. Experimental results show that sIGD^{+}-MOEA is a promising alternative that can solve many-objective optimization problems whose Pareto fronts present different geometries.

### Analysis of evolutionary multi-tasking as an island model

Recently, an idea of evolutionary multi-tasking has been proposed and applied to various types of optimization problems. The basic idea of evolutionary multi-tasking is to simultaneously solve multiple optimization problems (i.e., tasks) in a cooperative manner by a single run of an evolutionary algorithm. For this purpose, each individual in a population has its own task. This means that a population of individuals can be viewed as being divided into multiple sub-populations. The number of sub-populations is the same as the number of tasks to be solved. In this paper, first we explain that a multi-factorial evolutionary algorithm (MFEA), which is a representative algorithm of evolutionary multi-tasking, can be viewed as a special island model. MFEA has the following two features: (i) Crossover is performed not only within an island but also between islands, and (ii) no migration is performed between islands. Information of individuals in one island is transferred to another island through inter-island crossover. Next, we propose a simple implementation of evolutionary multi-tasking in the framework of the standard island model. Then, we compare our island model with MFEA through computational experiments. Promising results are obtained by our implementation of evolutionary multi-tasking.

### Incorporation of a decision space diversity maintenance mechanism into MOEA/D for multi-modal multi-objective optimization

A Multi-objective optimization problem with several different Pareto optimal solution sets is defined as a multi-modal multi-objective optimization problem. Finding all the Pareto optimal solution sets for this type of problem can provide more options for the decision maker, which is important in some real-world situations. The Multi-objective evolutionary algorithm based on decomposition (MOEA/D) has been proved to perform well in various multi-objective problems but it does not perform well in finding all the Pareto optimal solution sets for multi-modal multi-objective optimization problems. In this paper, a MOEA/D variant is proposed to solve these problems. K solutions are assigned to each weight vector in the MOEA/D variant and the solutions are evaluated by not only the scalarizing function values but also the minimum distance from other solutions with the same weight vector and the average distance from the neighboring solutions in the same weight vector grid. Experimental results show that the MOEA/D variant performs much better than the original MOEA/D on the multi-modal distance minimization problems.

### From fitness landscape analysis to designing evolutionary algorithms: the case study in automatic generation of function block applications

Search-based software engineering, a discipline that often requires finding optimal solutions, can be a viable source for problems that bridge theory and practice of evolutionary computation. In this research we consider one such problem: generation of data connections in a distributed control application designed according to the IEC 61499 industry standard.

We perform the analysis of the fitness landscape of this problem and find why exactly the simplistic (1 + 1) evolutionary algorithm is slower than expected when finding an optimal solution to this problem. To counteract, we develop a population-based algorithm that explicitly maximises diversity among the individuals in the population. We show that this measure indeed helps to improve the running times.

### Weight vector grid with new archive update mechanism for multi-objective optimization

Currently, most of the decomposition-based multi-objective evolutionary algorithms (MOEA) are based on a number of prespecified weight vectors. However, when the shape of the Pareto front is inconsistent with the distribution of weight vectors, only a small number of non-dominated solutions can be obtained inside the Pareto front. Moreover, if an external archive with a dominance-based update mechanism is used to overcome this difficulty, a large computational time is needed which is often unpractical. In this paper, we propose a new archive update mechanism with a new archive structure. A large weight vector grid is used to update the archive by using a scalarizing function. The proposed archive update mechanism can be applied to any MOEA with an external archive. We examine the effectiveness of the proposed mechanism on MOEA/D. Our experimental results show that MOEA/D with the proposed new archive update mechanism is able to find more solutions inside the Pareto front compared to MOEA/D without the archive. In addition, it needs less computational time compared to MOEA/D with the dominance-based archive update mechanism.

### Improved efficiency of MOPSO with adaptive inertia weight and dynamic search space

In this paper, a new Multi-Objective Particle Swarm optimization algorithm (MOPSO) with adaptive inertia weight and dynamic search space is introduced for multi-objective optimization. The objective of the study is to investigate an efficient MOPSO to deal with large-scale optimization and multi-modal problems. The new adaptive inertia weight strategy allows the inertia weight to keep varying throughout the algorithm process, which helps the algorithm to escape from local optima. The dynamic search space design can avoid decision variables from continuously taking their extreme values, and therefore enhances the searching efficiency. The performance of the proposed algorithm was compared with three popular multi-objective algorithms in solving seven benchmark test functions. Results show that the new algorithm can produce reasonably good approximations of the Pareto front, while performing with a budget of 10,000 fitness function evaluations.

### Specialization and elitism in lexicase and tournament selection

Prior work has demonstrated that genetic programming systems often maintain higher levels of population diversity when using lexicase selection than when using other parent selection methods, and that the use of lexicase selection improves problem-solving performance in many circumstances. It has been suggested that it is not only the maintenance of diversity that is responsible for the performance of lexicase selection, but more specifically the production and maintenance of "specialists" that matters, where specialists are defined to be individuals with the lowest error, relative to the rest of the population, on a small number of training cases regardless of total error. Here we provide results of experiments that uphold this suggestion by tracking the numbers of specialists selected by lexicase selection and by tournament selection in a genetic programming system solving software synthesis problems. Our results also show that lexicase selection selects parents with poor total error far more frequently than tournament selection, even near the ends of successful runs, suggesting that such selections are integral to the improved problem-solving performance of lexicase selection.

### Diploidy for evolving neural networks

Genetic algorithms and artificial neural networks are two widely-used techniques that can be combined with each other to produce evolved neural networks. Some research has also looked at the use of diploidy in genetic algorithms for possible advantages over the haploid genetic representation usually used, most notably in the form of better adaptation to changing environments. This paper proposes a diploid representation for evolving neural networks, used in an agent-based simulation. Two versions of the diploid representation were tested with a haploid version in one static and two changing environments. All three genetic types performed differently in different environments.

### Embedded feature selection using probabilistic model-based optimization

In machine learning, feature selection is a commonly used technique for improving the predictive performance and interpretability of a trained model. Feature selection techniques are classified into three approaches: the filter, wrapper, and embedded approaches. The embedded approach performs the feature selection process during the model training and achieves a good balance between performance and computational cost in general. In the paper, we propose a novel embedded feature selection method using probabilistic model-based evolutionary optimization. We introduce the multivariate Bernoulli distribution, which determines the selection of features, and we optimize its parameters during the training. The distribution parameter update rule is the same as that of the population-based incremental learning (PBIL), but we simultaneously update the parameters of the machine learning model using an ordinary gradient descent method. This method can be easily implemented into non-linear models, such as neural networks. Moreover, we incorporate the penalty term into the objective function to control the number of selected feature. We apply the proposed method with the neural network model to the feature selection of three classification problems. The proposed method achieves competitive performance and reasonable computational cost compared with conventional feature selection methods.

### Using a one-class compound classifier to detect in-vehicle network attacks

The Controller Area Network (CAN) in vehicles provides serial communication between electronic control units that manage engine, transmission, steering and braking. Researchers have recently demonstrated the vulnerability of the network to cyber-attacks which can manipulate the operation of the vehicle and compromise its safety. Some proposals for CAN intrusion detection systems, that identify attacks by detecting packet anomalies, have drawn on one-class classification, whereby the system builds a decision surface based on a large number of normal instances. The one-class approach is discussed in this paper, together with initial results and observations from implementing a classifier new to this field. The Compound Classier has been used in image processing and medical analysis, and holds advantages that could be relevant to CAN intrusion detection.