Abstract Side-effects are widely believed to impede program comprehension and have a detrimental effect upon software maintenance. This paper introduces an algorithm for side-effect removal which splits the side-effects into their pure expression meaning and their state-changing meaning. Symbolic execution is used to determine the expression meaning, while transformation is used to place the state-changing part in a suitable location in a transformed version of the program.
Abstract This paper introduces the concept of test suite latency. The more latent a test suite, the more it is possible to repeatedly select subsets that achieve a test goal (such as coverage) without re-applying test cases. Where a test case is re-applied it cannot reveal new information. The more a test suite is forced to re-apply already applied test cases in order to achieve the test goal, the more it has become 'worn out'. Test suite latency is the flipside of wear out; the more latent a test suite, the less prone it is to wear out.
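The idea of repeatedly selecting fresh subsets can be made concrete with a minimal sketch. It assumes the test goal is statement coverage, uses a simple greedy selection, and counts how many disjoint covering subsets can be drawn before tests would have to be re-applied. The function names and the counting measure are illustrative assumptions, not the paper's definition of latency.

```python
def greedy_cover(tests, goal):
    """Greedily pick tests from `tests` until `goal` is covered.

    `tests` maps a test name to the set of items it covers.
    Returns the chosen test names, or None if the goal cannot be met.
    """
    remaining = set(goal)
    pool = dict(tests)  # local copy so the caller's dict is untouched
    chosen = []
    while remaining:
        # Pick the test covering the most still-uncovered items.
        best = max(pool, key=lambda name: len(pool[name] & remaining), default=None)
        if best is None or not (pool[best] & remaining):
            return None  # no unused test makes progress
        chosen.append(best)
        remaining -= pool.pop(best)
    return chosen


def count_disjoint_covers(tests):
    """Count how many times the goal can be met without re-applying a test.

    Assumption: the goal is everything the whole suite covers. Greedy
    selection is a heuristic, so the count is a lower bound.
    """
    goal = set().union(*tests.values()) if tests else set()
    unused = dict(tests)
    rounds = 0
    while goal:
        chosen = greedy_cover(unused, goal)
        if chosen is None:
            break  # the suite is "worn out" for this goal
        for name in chosen:
            del unused[name]
        rounds += 1
    return rounds


# Hypothetical suite: test name -> covered statements.
suite = {"t1": {1, 2}, "t2": {3}, "t3": {1, 2, 3}, "t4": {1}, "t5": {2, 3}}
latent_rounds = count_disjoint_covers(suite)
single_round = count_disjoint_covers({"a": {1, 2}})
```

Under this reading, a suite that yields more disjoint covering subsets is more latent, and one that yields only a single subset wears out after one use.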
It is said 90% of faults that survive manufacturer's testing procedures are complex. That is, the corresponding bug fix contains multiple changes. Higher order mutation testing is used to study defect interactions and their impact on software testing for fault finding. We adopt a multi-objective Pareto optimal approach using Monte Carlo sampling, genetic algorithms and genetic programming to search for higher order mutants which are both hard-to-kill and realistic.
SUMMARY Service-oriented architecture (SOA) is gaining momentum as an emerging distributed system architecture for business-to-business collaborations. This momentum can be observed in both industry and academic research. SOA presents new challenges and opportunities for testing and verification, leading to an upsurge in research.
Abstract Requirements engineering for multiple customers, each of whom has competing and often conflicting priorities, raises issues of negotiation, mediation and conflict resolution. This paper uses a multi-objective optimisation approach to support investigation of the trade-offs in various notions of fairness between multiple customers. Results are presented to validate the approach using two real-world data sets and also using data sets created specifically to stress test the approach.
Abstract. The omnipresence of software graphs as useful intermediate representations means that the identification of near-match subgraphs (Error-Correcting Subgraph Isomorphism) has diverse and widespread applications in software engineering, such as querying, clone detection and model checking. Each software engineering subarea has developed specific tailored approaches to subgraph isomorphism, thereby reducing comparability and generality, and potentially yielding sub-optimal results.
ABSTRACT There has recently been a great deal of interest in search-based test data generation, with many local and global search algorithms being proposed. However, to date, there has been no investigation of the relationship between the size of the input domain (the search space) and performance of the search-based algorithms. Static analysis can be used to remove irrelevant variables for a given test data generation problem, thereby reducing the search space size.
Abstract This paper addresses the question: "How can animated visualisation be used to express interesting properties of static analysis?" The particular focus is upon static dependence analysis, but the approach adopted in the paper is applicable to other forms of static analysis. The challenge is twofold. First, there is the inherent difficulty of using animation, which is inherently dynamic, as a representation of static analysis, which is not. The paper shows one way in which this apparent contradiction can be overcome.
A program schema defines a class of programs, all of which have identical statement structures, but whose expressions may differ. We prove that given any two structured schemas which are conservative, linear and free, it is decidable whether they are equivalent.
Abstract Optimising programs for non-functional properties such as speed, size, throughput, power consumption and bandwidth can be demanding; pity the poor programmer who is asked to cater for them all at once! We set out an alternate vision for a new kind of software development environment inspired by recent results from Search Based Software Engineering (SBSE).
Abstract This paper presents techniques to integrate boundary overlap into concept assignment using Plausible Reasoning. Heuristic search techniques such as Hill climbing and Genetic Algorithms are investigated. A new fitness measure appropriate for overlapping concept assignment is introduced. The new algorithms are compared to randomly generated results and the Genetic Algorithm is shown to be the best of the proposed search algorithms in terms of the quality of concept binding, as measured by the fitness function.
ABSTRACT Companies such as Google tend to develop products from one continually evolving core of code. Software is neither shipped, nor released in the traditional sense. It is simply made available, with dramatically compressed release cycles regression testing. This large scale rapid release environment creates challenges for the application of regression test optimisation techniques.
This paper presents an approach to Search Based Software Project Management based on Cooperative Co-evolution. Our approach aims to optimize both developers' team staffing and work package scheduling through cooperative co-evolution to achieve early overall completion time. To evaluate our approach, we conducted an empirical study, using data from four real-world software projects. Results indicate that the Co-evolutionary approach significantly outperforms a single population evolutionary algorithm.
Abstract A dependence cluster is a set of program statements, all of which are mutually inter-dependent. This article reports a large scale empirical study of dependence clusters in C program source code. The study reveals that large dependence clusters are surprisingly commonplace. Most of the 45 programs studied have clusters of dependence that consume more than 10% of the whole program. Some even have clusters consuming 80% or more.
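The definition of a dependence cluster can be illustrated with a small sketch. It assumes dependence is given as a directed graph over statement identifiers and treats a cluster as a maximal set of statements that can all reach one another. The graph, the function names and the reachability-based detection are assumptions for illustration; the abstract does not describe how the study itself identifies clusters.

```python
def reachable(graph, start):
    """Return all nodes reachable from `start` by following one or more edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


def dependence_clusters(graph):
    """Group statements into maximal sets of mutually dependent statements.

    `graph` maps a statement id to the statements it depends on.
    A statement with no mutual dependence forms a cluster of size one.
    """
    nodes = set(graph) | {s for succs in graph.values() for s in succs}
    reach = {n: reachable(graph, n) for n in nodes}
    clusters = []
    assigned = set()
    for n in sorted(nodes):
        if n in assigned:
            continue
        # m belongs with n when each can reach the other.
        cluster = {n} | {m for m in reach[n] if n in reach[m]}
        assigned |= cluster
        clusters.append(cluster)
    return clusters


def largest_cluster_share(graph):
    """Fraction of all statements that fall in the largest cluster."""
    clusters = dependence_clusters(graph)
    total = sum(len(c) for c in clusters)
    return max(len(c) for c in clusters) / total if total else 0.0


# Hypothetical dependence graph: statements 1, 2 and 3 form a cycle.
deps = {1: [2], 2: [3], 3: [1], 4: [1], 5: []}
clusters = dependence_clusters(deps)
share = largest_cluster_share(deps)
```

In this toy graph the largest cluster holds three of the five statements, which corresponds to the kind of percentage figure the abstract reports per program.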
Abstract Pair-wise comparison has been successfully utilised in order to prioritise test cases by exploiting the rich, valuable and unique knowledge of the tester. However, the prohibitively large cost of the pair-wise comparison method prevents it from being applied to large test suites. In this paper, we introduce a cluster-based test case prioritisation technique. By clustering test cases, based on their dynamic runtime behaviour, we can reduce the required number of pair-wise comparisons significantly.
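A short calculation shows why clustering can cut the comparison cost. The sketch assumes a two-level scheme in which the tester compares test cases only within each cluster and then compares the clusters with one another. That scheme and the function names are assumptions for illustration; the abstract states only that the number of comparisons is reduced.

```python
def pairwise(n):
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def clustered_comparisons(cluster_sizes):
    """Comparisons needed under the assumed two-level scheme.

    Tests are compared only inside their own cluster, and the clusters
    themselves are then compared pair-wise.
    """
    within = sum(pairwise(size) for size in cluster_sizes)
    between = pairwise(len(cluster_sizes))
    return within + between


# Hypothetical suite of 100 tests split into 10 clusters of 10.
flat_cost = pairwise(100)
clustered_cost = clustered_comparisons([10] * 10)
```

For this hypothetical suite, the flat approach needs 4950 comparisons while the clustered one needs 495, a tenfold reduction under the stated assumptions.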
Abstract Generating realistic test data is a major problem for software testers. Realistic test data generation for certain input types is hard to automate and therefore laborious. We propose a novel automated solution to test data generation that exploits existing web services as sources of realistic test data. Our approach is capable of generating realistic test data and also generating data based on tester-specified constraints.
Abstract A slice is constructed by deleting statements from a program whilst preserving some projection of its semantics. Since Mark Weiser introduced program slicing in 1979, a wide variety of slicing paradigms have been proposed, each of which is based upon a new formulation of the slicing criterion, capturing the semantic projection to be preserved during the process of command deletion. This paper surveys these slicing criteria, attempting to establish a set of parameters which combine to form a slicing criterion.
Abstract Test case prioritisation techniques aim to maximise the chance of fault detection as early in testing as possible. This is most commonly achieved by prioritising the tests according to a surrogate measure that is thought to correspond to fault detection capabilities, such as code coverage. However, once the prioritised test suite indeed detects a fault, the original prioritisation may become obsolete.
Abstract This paper introduces an approach to web application regression testing, based upon repair of user session data. The approach is entirely automated. It consists of a white box examination of the structure of the changed web application to detect changes and a set of techniques to map these detected changes onto repair actions. The paper reports the results of experiments that explore both the performance and effectiveness of the approach.
Abstract Like other engineering disciplines, software engineering is typically concerned with near optimal solutions or those which fall within a specified applicable tolerance. More recently, search-based techniques have started to find application in software engineering problem domains. This area of search-based software engineering has its origins in work on search-based testing, which began in the mid 1990s. Already, search-based solutions have been applied to software engineering problems right through the development life cycle.
Papers by Mark Harman