GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference
SESSION: Genetic programming
Multi-tree genetic programming for feature construction-based domain adaptation in symbolic regression with incomplete data
Transfer learning has rapidly gained popularity in tasks with limited available data. While traditional learning limits the learning process to the knowledge available in a specific (target) domain, transfer learning can reuse knowledge extracted from a different (source) domain to help learning in the target domain. This is especially important when knowledge in the target domain is scarce. Since data incompleteness is a serious cause of knowledge shortage in real-world learning tasks, it can often be addressed using transfer learning. One way to achieve this is feature construction-based domain adaptation. However, although Genetic Programming is considered a powerful feature construction approach, it has not been fully utilized for domain adaptation. In this work, a multi-tree genetic programming method is proposed for feature construction-based domain adaptation. The main idea is to construct a transformation from the source feature space to the target feature space that maps the source domain close to the target domain. This method is applied to symbolic regression with missing values. The experimental work shows the encouraging potential of the proposed approach on real-world tasks under different transfer learning scenarios.
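The central mechanism described above, a multi-tree individual whose j-th tree constructs the j-th target-space feature from source-domain data, can be sketched as follows. The callable-tree interface and the moment-based distance are hypothetical stand-ins; the paper's actual fitness measure is not given in the abstract.

```python
import numpy as np

def transform(trees, X_source):
    """Multi-tree individual: tree j constructs target-space feature j
    from the source-domain samples (illustrative callable interface)."""
    return np.column_stack([t(X_source) for t in trees])

def domain_distance(X_mapped, X_target):
    """Toy fitness: first/second-moment mismatch between mapped source data
    and target data, a simple stand-in for a real distribution distance."""
    mean_gap = np.abs(X_mapped.mean(axis=0) - X_target.mean(axis=0)).sum()
    std_gap = np.abs(X_mapped.std(axis=0) - X_target.std(axis=0)).sum()
    return float(mean_gap + std_gap)
```

An evolved individual that maps the source distribution onto the target distribution would drive this distance toward zero.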
In traditional regression analysis, detailed examination of the residuals provides an important way of validating model quality. However, residual analysis has not been utilised in genetic programming based symbolic regression. This work aims to fill this gap and proposes a new evaluation criterion: minimising the correlation between the residuals of regression models and the independent variables. Built on the maximal information coefficient, a recent association detection measure that provides an accurate estimate of correlation, the new evaluation measure is expected to enhance the generalisation of genetic programming by driving the evolutionary process towards models that are free of unnecessary complexity and less likely to learn from noise in the data. The experimental results show that, compared with standard genetic programming, which selects models based on training error only, and two state-of-the-art multiobjective genetic programming methods with mechanisms to prefer models with adequate structures, our new multiobjective genetic programming method, which minimises both the residual-variable correlation and the training error, consistently generalises better and evolves simpler models.
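The two-objective idea can be sketched as below. Note that the paper uses the maximal information coefficient; this sketch substitutes absolute Pearson correlation purely for self-containment, and the model interface is assumed:

```python
import numpy as np

def residual_dependence(residuals, X):
    """Largest absolute correlation between the residuals and any input
    variable. (Pearson stand-in for the maximal information coefficient.)"""
    if residuals.std() == 0:
        return 0.0  # perfect fit: residuals carry no signal
    scores = []
    for j in range(X.shape[1]):
        c = np.corrcoef(residuals, X[:, j])[0, 1]
        scores.append(abs(c) if np.isfinite(c) else 0.0)
    return max(scores)

def two_objective_fitness(model, X, y):
    """Objectives to minimise: training error and residual-variable dependence."""
    residuals = y - model(X)
    mse = float(np.mean(residuals ** 2))
    return mse, residual_dependence(residuals, X)
```

A model that omits a relevant variable leaves that variable's signal in the residuals, so the second objective penalises it even when the training error is low.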
Graph representations promise several desirable properties for Genetic Programming (GP): multiple-output programs, natural representations of code reuse and, in many cases, an innate mechanism for neutral drift. Each graph GP technique provides a program representation, genetic operators and an overarching evolutionary algorithm. This makes it difficult to identify the individual causes of empirical differences, both between these methods and in comparison to traditional GP. In this work, we empirically study the behavior of Cartesian Genetic Programming (CGP), Linear Genetic Programming (LGP), Evolving Graphs by Graph Programming (EGGP) and traditional GP. By fixing some aspects of the configurations, we study the performance of each graph GP method and GP in combination with three different EAs: generational, steady-state and (1 + λ). In general, we find that the best choice of representation, genetic operator and evolutionary algorithm depends on the problem domain. Further, we find that graph GP methods, particularly in combination with the (1 + λ) EA, are significantly better on digital circuit synthesis tasks.
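For readers unfamiliar with the third EA in that comparison, a minimal (1 + λ) loop looks like the following sketch on a toy maximisation problem; the `init`, `mutate` and `fitness` callables are placeholders for the respective graph GP components:

```python
import random

def one_plus_lambda(init, mutate, fitness, lam=4, generations=100, rng=random):
    """(1 + lambda) EA: the parent is replaced only when the best of lambda
    offspring is at least as fit; accepting equal fitness permits the
    neutral drift that graph representations are said to benefit from."""
    parent = init()
    f_parent = fitness(parent)
    for _ in range(generations):
        offspring = [mutate(parent, rng) for _ in range(lam)]
        best = max(offspring, key=fitness)
        f_best = fitness(best)
        if f_best >= f_parent:
            parent, f_parent = best, f_best
    return parent, f_parent
```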
Semantically-oriented mutation operator in cartesian genetic programming for evolutionary circuit design
Despite many successful applications, Cartesian Genetic Programming (CGP) suffers from limited scalability, especially when used for evolutionary circuit design. Considering the multiplier design problem, for example, the 5×5-bit multiplier represents the most complex circuit evolved from a randomly generated initial population. The efficiency of CGP depends heavily on the performance of the point mutation operator; this operator, however, is purely stochastic. This contrasts with recent developments in Genetic Programming (GP), where advanced informed approaches such as semantic-aware operators are incorporated to improve the search space exploration capability of GP. In this paper, we propose a semantically-oriented mutation operator (SOMO) suitable for the evolutionary design of combinational circuits. SOMO uses semantics to determine the best value for each mutated gene. Compared to common CGP and its variants, as well as recent versions of Semantic GP, the proposed method converges substantially faster on common Boolean benchmarks while keeping the phenotype size relatively small. The successfully evolved instances presented in this paper include a 10-bit parity circuit, a 10+10-bit adder and a 5×5-bit multiplier. The most complex circuits were evolved in less than one hour with a single-thread implementation running on a common CPU.
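The idea of "semantics determine the best value for a mutated gene" can be illustrated on a drastically simplified linear circuit genome (not the paper's actual CGP encoding): node semantics are bit vectors over all test cases, and the mutation exhaustively scores each candidate function for one gene against the target truth table.

```python
import numpy as np

FUNCS = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "XOR":  lambda a, b: a ^ b,
    "NAND": lambda a, b: ~(a & b),
}

def evaluate(genome, inputs):
    """Feed-forward evaluation; each node's semantics is a boolean vector
    over all test cases. The last node is the circuit output."""
    values = list(inputs)
    for func, i1, i2 in genome:
        values.append(FUNCS[func](values[i1], values[i2]))
    return values[-1]

def somo_function_gene(genome, node_idx, inputs, target):
    """Semantically-oriented choice: try every function for one gene and
    keep the one whose circuit output is closest to the target semantics."""
    best_func, best_err = None, None
    _, i1, i2 = genome[node_idx]
    for f in FUNCS:
        cand = list(genome)
        cand[node_idx] = (f, i1, i2)
        err = int(np.sum(evaluate(cand, inputs) != target))
        if best_err is None or err < best_err:
            best_func, best_err = f, err
    genome[node_idx] = (best_func, i1, i2)
    return best_err
```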
Tangled Program Graphs (TPG) is a genetic programming framework that has shown promise on challenging reinforcement learning problems with discrete action spaces. The approach has recently been extended to incorporate temporal memory mechanisms that enable operation in environments with partial observability at multiple timescales. Here we propose a highly-modular memory structure that manages the temporal properties of a task and enables operation on problems with continuous action spaces. This significantly broadens the scope of real-world applications for TPGs, from continuous-action reinforcement learning to time series forecasting. We begin by testing the new algorithm on a suite of symbolic regression benchmarks. Next, we evaluate the method on three challenging time series forecasting problems. Results generally match the quality of state-of-the-art solutions in both domains. In the case of time series prediction, we show that temporal memory eliminates the need to pre-specify a fixed-size sliding window of previous values, or autoregressive state, which is used by all compared methods. This is significant because it implies that no prior model for a time series is necessary, and the forecaster may adapt more easily if the properties of a series change significantly over time.
In symbolic regression, the search for analytic models is typically driven purely by the prediction error observed on the training data samples. However, when the data samples do not sufficiently cover the input space, the prediction error does not provide sufficient guidance toward desired models. Standard symbolic regression techniques then yield models that are partially incorrect, for instance, in terms of their steady-state characteristics or local behavior. If these properties were already considered during the search process, more accurate and relevant models could be produced. We propose a multi-objective symbolic regression approach that is driven by both the training data and the prior knowledge of the properties the desired model should manifest. The properties given in the form of formal constraints are internally represented by a set of discrete data samples on which candidate models are exactly checked. The proposed approach was experimentally evaluated on three test problems with results clearly demonstrating its capability to evolve realistic models that fit the training data well while complying with the prior knowledge of the desired model characteristics at the same time. It outperforms standard symbolic regression by several orders of magnitude in terms of the mean squared deviation from a reference model.
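The "constraints as discrete data samples" idea can be sketched as below for one example property, monotonicity on an interval; the interval, sample count and the bi-objective pairing are illustrative choices, not the paper's exact setup:

```python
import numpy as np

def constraint_violation(model, lo=0.0, hi=10.0, n=50):
    """Prior knowledge, e.g. 'f is non-decreasing on [lo, hi]', encoded as a
    discrete sample grid on which every candidate model is checked exactly.
    Returns the total amount by which consecutive samples decrease."""
    xs = np.linspace(lo, hi, n)
    ys = model(xs)
    return float(np.sum(np.maximum(ys[:-1] - ys[1:], 0.0)))

def bi_objective(model, X_train, y_train):
    """Objectives to minimise: training error and constraint violation."""
    err = float(np.mean((model(X_train) - y_train) ** 2))
    return err, constraint_violation(model)
```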
Society has come to rely on algorithms like classifiers for important decision making, giving rise to the need for ethical guarantees such as fairness. Fairness is typically defined by asking that some statistic of a classifier be approximately equal over protected groups within a population. In this paper, current approaches to fairness are discussed and used to motivate algorithmic proposals that incorporate fairness into genetic programming for classification. We propose two ideas. The first is to incorporate a fairness objective into multi-objective optimization. The second is to adapt lexicase selection to define cases dynamically over intersections of protected groups. We describe why lexicase selection is well suited to pressure models to perform well across the potentially infinitely many subgroups over which fairness is desired. We use a recent genetic programming approach to construct models on four datasets for which fairness constraints are necessary, and empirically compare performance to prior methods utilizing game-theoretic solutions. Methods are assessed based on their ability to generate trade-offs of subgroup fairness and accuracy that are Pareto optimal. The results show that genetic programming methods in general, and random search in particular, are well suited to this task.
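The standard lexicase selection loop referenced above works as follows; here each "case" would be an individual's error on one subgroup defined by an intersection of protected attributes (the error-matrix interface is an assumption for illustration):

```python
import random

def lexicase_select(population, case_errors, rng=random):
    """Lexicase selection. case_errors[i][c] is the error of individual i on
    case c; cases are considered in random order, and only individuals that
    are elite on every case seen so far survive the filtering."""
    candidates = list(range(len(population)))
    cases = list(range(len(case_errors[0])))
    rng.shuffle(cases)
    for c in cases:
        best = min(case_errors[i][c] for i in candidates)
        candidates = [i for i in candidates if case_errors[i][c] == best]
        if len(candidates) == 1:
            break
    return population[rng.choice(candidates)]
```

Because the case order is reshuffled for every selection event, different subgroups take priority at different times, which is what pressures models to perform well across many subgroups.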
Machine Learning (ML) has now become an important and ubiquitous tool in science and engineering, with successful applications in many real-world domains. However, there are still areas in need of improvement, and problems that are still considered difficult with off-the-shelf methods. One such problem is Multi Target Regression (MTR), where the target variable is a multidimensional tuple instead of a scalar value. In this work, we propose a more difficult variant of this problem which we call Unlabeled MTR (uMTR), where the structure of the target space is not given as part of the training data. This version of the problem lies at the intersection of MTR and clustering, an unexplored problem type. Moreover, this work proposes a solution method for uMTR, a hybrid algorithm based on Genetic Programming and RANdom SAmple Consensus (RANSAC). Using a set of benchmark problems, we are able to show that this approach can effectively solve the uMTR problem.
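The RANSAC component of the hybrid named above is a standard consensus loop; the following is a generic single-line sketch (the paper combines it with GP models, whose details the abstract does not give):

```python
import numpy as np

def ransac_line(x, y, trials=100, tol=0.5, rng=None):
    """Plain RANSAC for a 1-D line: fit on a random minimal sample of two
    points, count inliers within tol, and keep the largest consensus set."""
    rng = rng or np.random.default_rng(0)
    best_model, best_inliers = None, -1
    for _ in range(trials):
        i, j = rng.choice(len(x), size=2, replace=False)
        if x[i] == x[j]:
            continue  # degenerate sample, no unique line
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        inliers = int(np.sum(np.abs(y - (slope * x + intercept)) < tol))
        if inliers > best_inliers:
            best_model, best_inliers = (slope, intercept), inliers
    return best_model, best_inliers
```

The appeal for uMTR is that the consensus mechanism tolerates samples whose target structure is unknown or mislabeled, treating them as outliers.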
Tasks related to Natural Language Processing (NLP) have recently been the focus of a large research endeavor by the machine learning community. The increased interest in this area is mainly due to the success of deep learning methods. Genetic Programming (GP), however, has not been under the spotlight with respect to NLP tasks. Here, we propose a first proof-of-concept that combines GP with the well-established NLP tool word2vec for the next-word prediction task. The main idea is that, once words have been moved into a vector space, traditional GP operators can successfully work on vectors, thus producing meaningful words as the output. To assess the suitability of this approach, we perform an experimental evaluation on a set of existing newspaper headlines. Individuals resulting from this (pre-)training phase can be employed as the initial population in other NLP tasks, like sentence generation, which will be the focus of future investigations, possibly employing adversarial co-evolutionary approaches.
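The final step of the pipeline described above, mapping a GP-produced vector back to a word, is typically done with a nearest-neighbour lookup in embedding space; the toy embedding table below is a hypothetical stand-in for a trained word2vec model:

```python
import numpy as np

def nearest_word(vec, embeddings):
    """Return the vocabulary word whose embedding has the highest cosine
    similarity to the GP-produced vector."""
    best, best_sim = None, -2.0
    for word, wv in embeddings.items():
        denom = np.linalg.norm(vec) * np.linalg.norm(wv) + 1e-12
        sim = float(np.dot(vec, wv) / denom)
        if sim > best_sim:
            best, best_sim = word, sim
    return best
```

With this decoder in place, any vector-valued GP expression over embedded words (sums, averages, learned transformations) yields an actual word as output.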
In recent years the field of genetic programming has made significant advances towards automatic programming. Research and development of contemporary program synthesis methods, such as PushGP and Grammar Guided Genetic Programming, can produce programs that solve problems typically assigned in introductory academic settings. These problems focus on a narrow, predetermined set of simple data structures, basic control flow patterns, and primitive, non-overlapping data types (without, for example, inheritance or composite types). Few, if any, genetic programming methods for program synthesis have convincingly demonstrated the capability of synthesizing programs that use arbitrary data types, data structures, and specifications that are drawn from existing codebases. In this paper, we introduce Code Building Genetic Programming (CBGP) as a framework within which this can be done, by leveraging programming language features such as reflection and first-class specifications. CBGP produces a computational graph that can be executed or translated into source code of a host language. To demonstrate the novel capabilities of CBGP, we present results on new benchmarks that use non-primitive, polymorphic data types as well as some standard program synthesis benchmarks.
Genetic Programming for Symbolic Regression is often prone to overfitting the training data, resulting in poor generalization on unseen data. To address this issue, much research has been devoted to regularization via controlling model complexity. However, due to the unstructured tree-based representation of individuals, model complexity cannot be computed directly and must instead be approximated. This paper proposes a novel representation called Adaptive Weighted Splines, which enables explicit control over the complexity of individuals via splines. The experimental results confirm that this new representation is significantly better than the tree-based representation at avoiding overfitting and generalizing on unseen data, demonstrating notably better and far more consistent generalization on all the benchmark problems. Further analysis shows that in most cases the new Genetic Programming method also outperforms classical regression techniques such as Linear Regression, Support Vector Regression, K-Nearest Neighbour and Decision Tree Regression, and performs competitively with the state-of-the-art ensemble regression methods Random Forests and Gradient Boosting.
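The key point, that a spline representation makes complexity explicit, can be illustrated with a deliberately simplified piecewise-linear version; the fixed knot grid, the roughness penalty and the weighting scheme are all assumptions for illustration, not the paper's actual Adaptive Weighted Splines formulation:

```python
import numpy as np

def spline_model(weights, knots):
    """Individual = a weight vector over a fixed knot grid; the phenotype is
    a piecewise-linear curve (a simplified stand-in for smoothing splines)."""
    return lambda x: np.interp(x, knots, weights)

def regularised_fitness(weights, knots, x, y, lam=0.1):
    """Training error plus an explicit complexity term: the roughness of the
    weight vector (sum of squared second differences)."""
    pred = np.interp(x, knots, weights)
    mse = float(np.mean((pred - y) ** 2))
    roughness = float(np.sum(np.diff(weights, n=2) ** 2))
    return mse + lam * roughness
```

Unlike a tree, whose complexity must be approximated from its structure, the roughness term here is computed directly from the genotype.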
For many linear and nonlinear systems that arise from the discretization of partial differential equations the construction of an efficient multigrid solver is a challenging task. Here we present a novel approach for the optimization of geometric multigrid methods that is based on evolutionary computation, a generic program optimization technique inspired by the principle of natural evolution. A multigrid solver is represented as a tree of mathematical expressions which we generate based on a tailored grammar. The quality of each solver is evaluated in terms of convergence and compute performance using automated local Fourier analysis (LFA) and roofline performance modeling, respectively. Based on these objectives a multi-objective optimization is performed using grammar-guided genetic programming with a non-dominated sorting based selection. To evaluate the model-based prediction and to target concrete applications, scalable implementations of an evolved solver can be automatically generated with the ExaStencils framework. We demonstrate the effectiveness of our approach by constructing multigrid solvers for a linear elastic boundary value problem that are competitive with common V- and W-cycles.
Genetic Improvement (GI) focuses on the development of evolutionary methods to automate software engineering tasks such as performance improvement or software bug removal. Concerning the latter, one of the earliest and most well-known methods in this area is Genetic Program Repair (GenProg), a variant of Genetic Programming (GP). However, most GI systems encounter problems that derive from the fact that they operate directly at the source code level. These problems include highly neutral fitness landscapes and loss of diversity during the search, both undesirable in search and optimization tasks. This paper explores the use of Novelty Search (NS) with GenProg, since NS can allow a search process to overcome these types of issues. While NS has been combined with GP before, and recently used with other GI systems, it had not been applied to automatic bug repair until this work. An extensive experimental evaluation shows that GenProg with NS outperforms the original algorithm in some cases.
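The Novelty Search mechanism referenced above replaces (or supplements) fitness with a novelty score, the sparseness of an individual's behaviour relative to those seen so far; a minimal sketch, with the behaviour descriptor assumed to be a numeric tuple:

```python
def novelty(descriptor, archive, k=3):
    """Novelty = mean Euclidean distance to the k nearest behaviour
    descriptors in the archive; high values mean unexplored behaviour,
    which counteracts neutral plateaus and diversity loss."""
    if not archive:
        return float("inf")  # first individual is maximally novel
    dists = sorted(
        sum((a - b) ** 2 for a, b in zip(descriptor, other)) ** 0.5
        for other in archive
    )
    k = min(k, len(dists))
    return sum(dists[:k]) / k
```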
We present a new method, Synthesis through Unification Genetic Programming (STUN GP), which synthesizes provably correct programs using a divide-and-conquer approach. The method first splits the input space in a discovery phase that uses Counterexample-Driven Genetic Programming (CDGP) to identify a set of programs that are provably correct under as-yet-unknown unification constraints. STUN GP then computes these constraints by using CDGP to synthesize predicates that strictly map inputs to programs whose output is correct on them.
This method builds on previous work applying Genetic Programming (GP) to Syntax-Guided Synthesis (SyGuS) problems, which seek to synthesize programs adhering to a formal specification rather than a fixed set of input-output examples. We show that our method is more scalable than previous CDGP variants, solving several benchmarks from the SyGuS Competition that CDGP cannot solve. STUN GP significantly narrows the gap between GP and state-of-the-art SyGuS solvers and further demonstrates Genetic Programming's potential for program synthesis.
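The unification step can be pictured as an if-then-else combinator: a synthesized predicate routes each input to the sub-program that is correct on it. In the sketch below, exhaustive checking over a finite domain stands in for the solver-based verification oracle, and the max-of-two example is purely illustrative:

```python
from itertools import product

def unify(progs, predicate):
    """Combine two sub-programs under a guard: the predicate routes each
    input to the program whose output is correct on it."""
    return lambda *args: progs[0](*args) if predicate(*args) else progs[1](*args)

def verify(program, spec, domain):
    """Exhaustive check over a finite 2-argument domain, a stand-in for the
    formal verification used by counterexample-driven GP."""
    return all(program(*xs) == spec(*xs) for xs in product(domain, repeat=2))
```

For the max-of-two specification, the sub-programs `x` and `y` are each correct on half the input space, and the predicate `x >= y` is exactly the unification constraint that makes the combined program correct everywhere.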
DAE-GP: denoising autoencoder LSTM networks as probabilistic models in estimation of distribution genetic programming
Estimation of distribution genetic programming (EDA-GP) algorithms are metaheuristics in which sampling new solutions from a learned probabilistic model replaces the standard mutation and recombination operators of genetic programming (GP). This paper presents DAE-GP, a new EDA-GP that uses denoising autoencoder long short-term memory networks (DAE-LSTMs) as its probabilistic model. DAE-LSTMs are artificial neural networks that first learn the properties of a parent population by mapping promising candidate solutions to a latent space and reconstructing them from that space. The trained model is then used to sample new offspring solutions. We show on a generalization of the royal tree problem that DAE-GP outperforms standard GP and that the performance difference increases with problem complexity. Furthermore, DAE-GP creates offspring with higher fitness from the learned model than standard GP does. We believe that the key reason for DAE-GP's strong performance is that, unlike previous EDA-GP models, we impose no assumptions about the relationships between learned variables; instead, DAE-GP flexibly identifies and models the relevant dependencies of promising candidate solutions.
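The denoising-based sampling loop can be sketched without the network itself: parents are corrupted at the token level, and the trained model's reconstruction of a corrupted parent becomes an offspring. The corruption scheme and the `reconstruct` interface below are assumptions for illustration; a real DAE-GP would plug a trained DAE-LSTM into `reconstruct`.

```python
import random

def corrupt(tokens, vocab, p=0.2, rng=random):
    """Denoising input: each token of a promising parent is independently
    replaced with a random vocabulary token with probability p."""
    return [rng.choice(vocab) if rng.random() < p else t for t in tokens]

def sample_offspring(parents, reconstruct, vocab, rng=random):
    """New candidates = reconstructions of corrupted parents; `reconstruct`
    stands in for the trained DAE-LSTM's decode step."""
    return [reconstruct(corrupt(p, vocab, rng=rng)) for p in parents]
```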