Lee Spector

    Lexicase parent selection filters the population by considering one random training case at a time, eliminating any individuals with errors for the current case that are worse than the best error in the selection pool, until a single individual remains. This process often stops before considering all training cases, meaning that it will ignore the error values on any cases that were not yet considered. Lexicase selection can therefore select specialist individuals that have poor errors on some training cases, if they have great errors on others and those errors come near the start of the random list of cases used for the parent selection event in question. We hypothesize here that selecting these specialists, which may have poor total error, plays an important role in lexicase selection's observed performance advantages over error-aggregating parent selection methods such as tournament selection, which select specialists much less frequently. We conduct experiments examining thi...
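    The selection procedure described above can be made concrete with a short sketch. The following Python fragment is an illustrative implementation, not code from the paper; it assumes errors are stored as a matrix in which errors[i][j] is the error of individual i on training case j, with lower values better.

        import random

        def lexicase_select(population, errors):
            """Select one parent with lexicase selection (illustrative sketch)."""
            candidates = list(range(len(population)))
            cases = list(range(len(errors[0])))
            random.shuffle(cases)                  # fresh random case order for each selection event
            for case in cases:
                best = min(errors[i][case] for i in candidates)
                candidates = [i for i in candidates if errors[i][case] == best]
                if len(candidates) == 1:           # often stops before all cases are considered
                    break
            return population[random.choice(candidates)]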
    Lexicase selection is a parent selection method that considers training cases individually, rather than in aggregate, when performing parent selection. Whereas previous work has demonstrated the ability of lexicase selection to solve difficult problems in program synthesis and symbolic regression, the central goal of this paper is to develop the theoretical underpinnings that explain its performance. To this end, we derive an analytical formula that gives the expected probabilities of selection under lexicase selection, given a population and its behavior. In addition, we expand upon the relation of lexicase selection to many-objective optimization methods to describe the behavior of lexicase selection, which is to select individuals on the boundaries of Pareto fronts in high-dimensional space. We show analytically why lexicase selection performs more poorly for certain sizes of population and training cases, and show why it has been shown to perform more poorly in continuous error ...
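    The analytical formula itself is developed in the paper; as a point of comparison, expected selection probabilities can also be approximated empirically by simulating many independent selection events. The sketch below is such a Monte Carlo approximation (an illustration, not the paper's derivation), using the same errors[i][j] matrix convention as in the earlier sketch.

        import random
        from collections import Counter

        def estimated_selection_probabilities(errors, trials=100000):
            """Approximate lexicase selection probabilities by simulation."""
            n, m = len(errors), len(errors[0])
            counts = Counter()
            for _ in range(trials):
                candidates = list(range(n))
                for case in random.sample(range(m), m):   # random case ordering
                    best = min(errors[i][case] for i in candidates)
                    candidates = [i for i in candidates if errors[i][case] == best]
                    if len(candidates) == 1:
                        break
                counts[random.choice(candidates)] += 1
            return {i: counts[i] / trials for i in range(n)}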
    We explore tradeoffs between classical communication and entanglement-generating powers of unitary 2-qubit gates. The exploration is aided by a computational search technique called genetic programming.
    Cooperation in evolving populations of agents has been explained as arising from kin selection, reciprocity during repeated interactions, and indirect reciprocity through agent reputations. All of these mechanisms require significant agent capabilities, but recent research using computational models has shown that arbitrary markers called "tags" can be used to achieve significant levels of cooperation even in the absence of memory, repeated interactions or knowledge of kin. This is important because it helps to explain the evolution of cooperation in organisms with limited cognitive capabilities, and also because it may help us to engineer cooperative behaviors in multi-agent systems. The computational models used in previous studies, however, have typically been constrained such that cooperation is the only viable strategy for gaining an evolutionary advantage. Here we show that tag-mediated recognition can lead to significant levels of cooperation in a less constrained artificial l...
    This paper discusses the role of culture in the evolution of cognitive systems. We define “culture” as any information transmitted between individuals and between generations by nongenetic means. Experiments are presented that use genetic programming systems that include special mechanisms for cultural transmission of information. These systems evolve computer programs that perform cognitive tasks including mathematical function mapping and action selection in a virtual world. The data show that the presence of culture-supporting mechanisms can have a clear beneficial impact on the evolvability of correct programs. The implications that these results may have for cognitive science are briefly discussed.
    We report how breve, a simulation environment with rich 3d graphics, was used to discover significant patterns in the dynamics of a system that evolves controllers for swarms of goal-directed agents. These patterns were discovered via visualization in the sense that we had not considered their relevance or thought to look for them initially, but they became obvious upon visually observing the behavior of the system. In this paper we briefly describe breve and the system of evolving swarms that we implemented within it. We then describe two discovered properties of the evolutionary dynamics of the system: transitions to/from genetic drift regimes and the emergence of collective or multicellular organization. We comment more generally on the utility of 3d visualization for the discovery of biologically significant phenomena and briefly describe our ongoing work in this area. Pointers are provided to on-line resources including source code and animations that demonstrate several of the...
    This paper discusses the evolution of diversifying reproduction. We measured the average difference between mothers and their children, the number of species, and the degree of adaptation in evolving populations of endogenously diversifying digital organisms using the Pushpop system. The data show that the number of species in adaptive populations is higher than in nonadaptive populations, while the variance in the differences between mothers and their children is less for adaptive populations than for non-adaptive populations. In other words, in adaptive populations the species were more numerous and the diversification processes were more reliable.
    Genetic programming can be used to automatically discover algorithms for quantum computers that are more efficient than any classical computer algorithms for the same problems. In this paper we exhibit the first evolved better-than-classical quantum algorithm, for Deutsch’s “early promise” problem. We also demonstrate a technique for evolving scalable quantum gate arrays and discuss other issues in the application of genetic programming to quantum computation and vice versa. 1. Quantum Computing Quantum computers are computational devices that use atomic-scale objects, for example 2-state particles, to store and manipulate information (Steane, 1997; for an elementary on-line tutorial see Braunstein, 1995; for an introduction for the general reader see Milburn, 1997). The physics of these devices allows them to do things that common digital (henceforth “classical”) computers cannot. Although quantum computers and classical computers appear to be bound by the same limits of Turing comp...
    The growth of program size during evolution (code "bloat") is a well-documented and well-studied problem in genetic programming. This paper examines the use of "size fair" genetic operators to combat code bloat in the PushGP genetic programming system. Size fair operators are compared to naive operators and to operators that use "node selection" as described by Koza. The effects of the operator choices are assessed in runs on symbolic regression, parity and multiplexor problems (2,700 runs in total). The results show that the size fair operators control bloat well while producing unusually parsimonious solutions. The computational effort required to find a solution using size fair operators is about equal to, or slightly better than, the effort required using the comparison operators.
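    One common formulation of a size-fair operator, due to Langdon, replaces a randomly chosen subtree with a new random subtree whose size is drawn so that program size is roughly preserved on average. The sketch below illustrates that general idea on simple nested-list program trees; the representation, primitive set, and size range are assumptions for illustration and are not taken from the paper.

        import random

        FUNCTIONS = ["+", "-", "*"]      # illustrative primitives
        TERMINALS = ["x", "1.0"]

        def size(tree):
            return 1 if not isinstance(tree, list) else sum(size(t) for t in tree)

        def random_tree(target):
            """Grow a random tree with roughly `target` nodes."""
            if target <= 2:
                return random.choice(TERMINALS)
            left = random.randint(1, target - 2)
            return [random.choice(FUNCTIONS), random_tree(left), random_tree(target - 1 - left)]

        def size_fair_mutate(tree):
            """Replace a random subtree with one of similar expected size."""
            paths = []
            def collect(t, path):
                paths.append(path)
                if isinstance(t, list):
                    for i, child in enumerate(t[1:], start=1):
                        collect(child, path + [i])
            collect(tree, [])
            path = random.choice(paths)
            node = tree
            for i in path:
                node = node[i]
            replacement = random_tree(random.randint(1, 2 * size(node) + 1))
            if not path:                     # the whole tree was chosen
                return replacement
            parent = tree
            for i in path[:-1]:
                parent = parent[i]
            parent[path[-1]] = replacement
            return tree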
    This paper shows how ontogenetic programming, an enhancement to the genetic programming methodology, allows for the automatic generation of adaptive programs. Programs produced by ontogenetic programming may include calls to self-modification operators. By permitting runtime program self-modification, these operators allow evolved programs to further adapt to their environments. In this paper the ontogenetic programming methodology is described and two examples of its use are presented, one for binary sequence prediction and the other for action selection in a virtual world. In both cases the inclusion of self-modification operators has a clear positive impact on the ability of genetic programming to produce successful programs.
    This work applies PushGP (a multi-type, automatically modularizing genetic programming system) to the 3D Opera problem (a cooperation and navigation multi-agent task involving the movement of a 3D swarm of agents through a constrained exit point). Within this framework we explore the effect of adding task-specific data types to the GP system. In particular, we extend the native types of PushGP to include 3D vectors, and we compare the results with and without this extension to each other and to human-programmed agent controllers.
    Social foraging shows unexpected features such as the existence of a group size threshold to accomplish a successful hunt. Above this threshold, additional individuals do not increase the probability of capturing the prey. Recent direct observations of wolves (Canis lupus) in Yellowstone Park show that the group size threshold when hunting its most formidable prey, bison (Bison bison), is nearly three times greater than when hunting elk (Cervus elaphus), a prey that is considerably less challenging to capture than bison. These observations provide empirical support to a computational particle model of group hunting which was previously shown to be effective in explaining why hunting success peaks at apparently small pack sizes when hunting elk. The model is based on considering two critical distances between wolves and prey: the minimal safe distance at which wolves stand from the prey, and the avoidance distance at which wolves move away from each other when they approach the prey....
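    A minimal two-dimensional particle sketch of the kind of model described, with the two critical distances as parameters, is shown below. The update rule, parameter values, and use of NumPy are illustrative assumptions; the calibrated model in the paper differs in detail.

        import numpy as np

        D_SAFE = 2.0     # minimal safe distance a wolf keeps from the prey (assumed value)
        D_AVOID = 1.0    # distance below which wolves move away from each other (assumed value)
        STEP = 0.05

        def step(wolves, prey):
            """Advance wolf positions one time step in the two-distance model."""
            updated = wolves.copy()
            for i, w in enumerate(wolves):
                to_prey = prey - w
                dist = np.linalg.norm(to_prey)
                # move toward the prey until the safe distance is reached, retreat if closer
                force = to_prey / max(dist, 1e-9) * (1.0 if dist > D_SAFE else -1.0)
                # move away from packmates that are too close
                for j, other in enumerate(wolves):
                    if i != j:
                        sep = w - other
                        d = np.linalg.norm(sep)
                        if 0 < d < D_AVOID:
                            force += sep / d
                updated[i] = w + STEP * force
            return updated

        # five wolves converging on a stationary prey at the origin
        wolves = np.random.uniform(-10, 10, size=(5, 2))
        prey = np.zeros(2)
        for _ in range(200):
            wolves = step(wolves, prey)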
    Koza has previously shown that the power of a genetic programming system can often be enhanced by allowing for the simultaneous evolution of a main program and a collection of automatically defined functions (ADFs). In this paper I show how related techniques can be used to simultaneously evolve a collection of automatically defined macros (ADMs). I show how ADMs can be used to produce new control structures during the evolution of a program, and I present data showing that ADMs sometimes provide a greater benefit than do ADFs. I discuss the characteristics of problems that may benefit most from the use of ADMs, or from architectures that include both ADFs and ADMs, and I discuss directions for further research.
    Autoconstructive evolution is the idea of evolving programs through self-creation. This is an alternative to the hand-coded variation operators utilized in traditional genetic programming (GP) and the deliberately limited implementations of meta-GP. In the latter case strategies generally involve adapting the variation operators which are then used in accordance with traditional GP. On the other hand, autoconstruction offers the ability to adapt algorithmic reproductive mechanisms specific to individuals in the evolving population. We study multiple methods of compositional autoconstruction, a form of autoconstruction based on function composition. While much of the previous work on autoconstruction has investigated traditional GP problems, we investigate the effect of autoconstructive evolution on two problems: Order, which models order-sensitive program semantics, and Majority, which models the evolutionary acquisition of semantic components. In doing so we show that compositiona...
    HiGP is a new high-performance genetic programming system. This system combines techniques from string-based genetic algorithms, S-expression-based genetic programming systems, and high-performance parallel computing. The result is a fast, flexible, and easily portable genetic programming engine with a clear and efficient parallel implementation. HiGP manipulates and produces linear programs for a stack-based virtual machine, rather than the tree-structured S-expressions used in traditional genetic programming. In this paper we describe the HiGP virtual machine and genetic programming algorithms. We demonstrate the system's performance on a symbolic regression problem and show that HiGP can solve this problem with substantially less computational effort than can a traditional genetic programming system. We also show that HiGP's time performance is significantly better than that of a well-written S-expression-based system, also written in C. We further show that our parallel ...
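    To make the linear, stack-based representation concrete, the toy interpreter below evaluates a flat program on a single stack, in the general style of such systems; the instruction set and underflow behavior are illustrative and are not HiGP's actual instruction set.

        def run(program, x):
            """Evaluate a linear stack-based program on input x (toy interpreter)."""
            stack = []
            for op in program:
                if op == "x":
                    stack.append(x)
                elif isinstance(op, (int, float)):
                    stack.append(op)
                elif len(stack) >= 2:                     # binary ops are no-ops on underflow
                    b, a = stack.pop(), stack.pop()
                    if op == "+":
                        stack.append(a + b)
                    elif op == "-":
                        stack.append(a - b)
                    elif op == "*":
                        stack.append(a * b)
                    elif op == "/":
                        stack.append(a / b if b != 0 else 1.0)
            return stack[-1] if stack else 0.0

        # x*x + 2 in postfix form for the stack machine
        print(run(["x", "x", "*", 2, "+"], 3.0))          # prints 11.0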
    In genetic programming, an evolutionary method for producing computer programs that solve specified computational problems, parent selection is ordinarily based on aggregate measures of performance across an entire training set. Lexicase selection, by contrast, selects on the basis of performance on random sequences of training cases; this has been shown to enhance problem-solving power in many circumstances. Lexicase selection can also be seen as better reflecting biological evolution, by modeling sequences of challenges that organisms face over their lifetimes. Recent work has demonstrated that the advantages of lexicase selection can be amplified by down-sampling, meaning that only a random subsample of the training cases is used each generation. This can be seen as modeling the fact that individual organisms encounter only subsets of the possible environments and that environments change over time. Here we provide the most extensive benchmarking of down-sampled lexicase selectio...
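    As an illustration of the scheme, the sketch below draws a random subsample of the training cases and then runs ordinary lexicase selection on that subsample only. In practice the subsample is typically drawn once per generation and shared by all of that generation's selection events; here it is drawn inside the function for brevity, and the errors[i][j] convention from the earlier sketches is assumed.

        import random

        def downsampled_lexicase_select(population, errors, sample_rate=0.1):
            """Lexicase selection restricted to a random subsample of cases."""
            n_cases = len(errors[0])
            sample = random.sample(range(n_cases), max(1, int(sample_rate * n_cases)))
            random.shuffle(sample)
            candidates = list(range(len(population)))
            for case in sample:
                best = min(errors[i][case] for i in candidates)
                candidates = [i for i in candidates if errors[i][case] == best]
                if len(candidates) == 1:
                    break
            return population[random.choice(candidates)]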
    Analytical results from AI planning research provide the motivation for this experimental study of ordering relationships in human planning. We examine timings of humans performing specific tasks from the AI planning literature and present evidence that normal human planners, like “state of the art” AI planning systems, use partial-order plan representations. We also describe ongoing experiments that are designed to shed light on the plan representations used by children and by adults with planning deficits due to brain damage. Several points of interest for collaboration between AI scientists and neuropsychologists are noted, as are impacts that we feel this research may have on future work in AI planning.
    The performance of a genetic programming system depends in part on the composition of the collection of elements out of which programs can be constructed, and on the relative probability of different instructions and constants being chosen for inclusion in randomly generated programs or for introduction by mutation. In this paper we develop a method for the transfer learning of instruction sets across different software synthesis problems. These instruction sets outperform unlearned instruction sets on a range of problems.
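    One simple way to realize this kind of transfer, sketched below, is to estimate how often each instruction appears in programs that solved source problems and to use the smoothed frequencies as the instruction-choice distribution for new problems. This is an illustrative stand-in, not the method developed in the paper, and the instruction names are hypothetical.

        from collections import Counter

        def learned_instruction_weights(solution_programs, smoothing=1.0):
            """Estimate instruction-choice weights from source-problem solutions."""
            counts = Counter(op for program in solution_programs for op in program)
            total = sum(counts[op] + smoothing for op in counts)
            return {op: (counts[op] + smoothing) / total for op in counts}

        # hypothetical solutions to source problems, as flat instruction lists
        weights = learned_instruction_weights([
            ["integer_add", "integer_dup", "exec_if"],
            ["integer_add", "string_concat"],
        ])
        # `weights` could then bias random program generation and mutation
        # when attacking a new target problem.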
    In many genetic programming systems, the program variation and execution processes operate on different program representations. The representations on which variation operates are referred to as genomes. Unconstrained linear genome representations can provide a variety of advantages, including reduced complexity of program generation, variation, simplification and serialization operations. The Plush genome representation, which uses epigenetic markers on linear genomes to express nonlinear structures, has supported the production of state-of-the-art results in program synthesis with the PushGP genetic programming system. Here we present a new, simpler, non-epigenetic alternative to Plush, called Plushy, that appears to maintain all of the advantages of Plush while providing additional benefits. These results illustrate the virtues of unconstrained linear genome representations more generally, and may be transferable to genetic programming systems that target different languages for evolved programs.
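    The essential idea of the Plushy representation is that the genome is a flat sequence of instructions plus "close" tokens, with each instruction opening a fixed number of code blocks that are ended by "close" tokens (or implicitly at the end of the genome). The sketch below is a simplified illustration of this kind of translation, not the exact Plushy implementation, and the opening counts in OPENS are an assumed subset.

        # how many code blocks each instruction opens (illustrative subset)
        OPENS = {"exec_if": 2, "exec_dup": 1, "integer_add": 0, "integer_sub": 0}

        def plushy_to_push(genome):
            """Translate a flat Plushy genome into a nested Push-style program."""
            genome = list(genome)

            def read_block():
                block = []
                while genome:
                    gene = genome.pop(0)
                    if gene == "close":          # ends the innermost open block
                        return block
                    block.append(gene)
                    for _ in range(OPENS.get(gene, 0)):
                        block.append(read_block())
                return block                     # end of genome closes the block implicitly

            program = []
            while genome:
                gene = genome.pop(0)
                if gene == "close":              # a close with nothing open is ignored
                    continue
                program.append(gene)
                for _ in range(OPENS.get(gene, 0)):
                    program.append(read_block())
            return program

        # example: (exec_if (integer_add) (integer_sub) integer_add)
        print(plushy_to_push(["exec_if", "integer_add", "close",
                              "integer_sub", "close", "integer_add"]))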
    Genetic Programming has advanced the state of the art in the field of software synthesis. However, it has still not been able to produce some of the more complex programs routinely written by humans. One of the heuristics human programmers use to build complex software is the organization of code into reusable modules. Ever since the introduction of the concept of Automatically Defined Functions (ADFs) by John Koza in the 1990s, the genetic programming community has also expressed the need to evolve modular programs, but despite this interest and several subsequent innovations, the goal of evolving large-scale software built on reusable modules has not yet been achieved. In this chapter, we first discuss two modularity metrics—Reuse and Repetition—and describe the procedure for calculating them from program code and corresponding execution traces. We then introduce the concept of design features, which can be used alongside error measures to guide evolution. We also demonstrate the use of modularity design features in parent selection.
    The Obscure Features Hypothesis (OFH) for innovation states that a two-step process undergirds almost all innovative solutions: (1) notice an infrequently observed or new (i.e., obscure) feature of the problem and (2) construct an interaction involving the obscure feature that produces the desired effects to solve the problem. The OFH leads to a systematic derivation of innovation-enhancing techniques by engaging in two tasks. First, we developed a 32-category system of the types of features possessable by a physical object or material. This Feature Type Taxonomy (FTT) provides a panoramic view of the space of features and assists in searches for the obscure ones. Second, we are articulating the many cognitive reasons that obscure features are overlooked and are developing countering techniques for each known reason. We present the implications and techniques of the OFH, as well as indicate how software can assist innovators in the effective use of these innovation-enhancing techniq...

    And 173 more