Contract Driven Development = Test Driven Development - Writing Test-Cases

Andreas Leitner, Ilinca Ciupa, Manuel Oriol, Bertrand Meyer
Chair of Software Engineering
ETH Zurich, Switzerland
firstname.lastname@inf.ethz.ch

Arno Fiva
Chair of Software Engineering
ETH Zurich, Switzerland
fivaa@student.ethz.ch

ABSTRACT

Although unit tests are recognized as an important tool in software development, programmers prefer to write code, rather than unit tests. Despite the emergence of tools like JUnit which automate part of the process, unit testing remains a time-consuming, resource-intensive, and not particularly appealing activity.

This paper introduces a new method, called Contract-Driven Development, that takes the task of writing unit tests off the developers' shoulders, while still taking advantage of the developers' knowledge of the intended semantics and structure of the code. This methodology exploits actions that programmers perform anyway as part of the normal process of writing code, by extracting test cases from failure-producing runs of the system that the programmers trigger. The approach is based on the presence of contracts in code, which act as the oracle of the test cases. The test cases are extracted completely automatically, run in the background when the code evolves, and can easily be maintained over versions. The tool implementing this methodology is called Cdd and is available both in binary and in source form.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Copyright 200X ACM X-XXXXX-XX-X/XX/XX ...$5.00.

1. INTRODUCTION

Unit tests are an important instrument in software engineering. This is a generally recognized fact, but it does not change the cumbersome, time-consuming, and boring nature of the process of writing meaningful unit tests.

Consequently, researchers have studied ways to reduce this burden on the developer, while maintaining or improving the quality of the testing process. One possible solution is automated testing, illustrated in tools like Agitar One [4], Parasoft's Jtest [1], AutoTest [12], Korat [5], TestEra [11], DSD-Crasher [6], Eclat [19], Symstra [23], DART [8]. Such tools bring a great advantage by the degree of automation that they provide, but they do face several problems:

• Automated testing strategies cannot make up for the insights that a human tester has into the semantics of the software under test and the relationships between the different units.

• In the absence of explicit, executable specification tools cannot distinguish between meaningful and meaningless input data.

• The quality of the generated tests can be estimated by different means and measures (such as code coverage, mutation testing, number of bugs found, time to first bug, proportion of the fault-revealing tests out of total generated tests, etc.), but the exact measures to use or combination thereof depends on the characteristics of the project under test and is very hard, if not impossible, for a tool to determine automatically.

This paper shows how to reduce the burden of writing test cases without interfering with the programmer, while still leveraging the insights that he has into the semantics, structure, and possible weak spots of the software. The starting point of this work is the observation that developers actually create and run test cases even if they do not create comprehensive test suites. A developer typically adds some features to a program and then runs the program in such a way that the new feature is used. By placing assertions along the way and by watching the output of the program in general, he checks whether the program works correctly. A developer triggers the execution of a newly added feature via one or both of the following actions:

• Providing the right input.

• Changing parts of the program to force a certain path to be taken.

Each execution of the program in such a way tests certain aspects of the program. During the lifecycle of a program many such implicit test cases are created and run. One test case evolves into the next, often being an only slightly different variant of its predecessor. These implicit test cases are created by humans and do not face the problems mentioned above that automated synthesizers face. Such test cases have a very serious drawback however: they are implicit; usually they exist only for one or very few runs and they cannot be kept for later automatic re-execution because:

• If the developer had to provide inputs, the test case cannot be rerun without the developer providing the inputs again. This requires manual intervention and the developer needs to remember the exact inputs.
• If the developer changed parts of the program to force a certain path to be taken, then this change is unlikely to persist as it limits the generality of the application. Usually such a change is undone or altered yet again to create the next implicit test case.

Such test cases are easy to run and create and, since they are not permanent, don't need any maintenance. These are the most likely reasons why developers write and use them so often.

This paper presents a method that captures these implicit test cases (including their oracle) and makes them explicit and persistent. The captured test cases do not require user input, are stable with regards to system evolution, and are efficiently minimized (and hence their execution time is improved).

The Cdd tool¹ is an implementation of this method. Cdd targets Eiffel code, because Eiffel natively supports contracts and real world source code equipped with contracts is broadly available. Cdd is available both in binary and source, and was integrated into the EiffelStudio development environment. Cdd observes program executions and, when a failure occurs, Cdd detects the last uninfected state and takes a snapshot of this state. This snapshot is then recreated and serves as the starting point of the extracted test case. Cdd chooses the time at which it takes this snapshot so that it is early enough for the state to not be infected but also late enough to reduce execution time. Also, in order to make the test case more robust with regards to system change, the snapshot does not include that part of the state that is irrelevant for reproducing the failure.

¹ http://eiffelsoftware.origo.ethz.ch/index.php/CddBranch/

The rest of this paper is organized as follows. The next section motivates the approach. Section 3 gives the intuition behind how Cdd works in practice with the help of a use case. Section 4 explains the main abstractions behind Cdd, namely traces, failures, and test cases. Section 5 presents the implementation, Section 6 outlines future work, and Section 7 concludes.

2. MOTIVATION

Writing unit tests during the development of software systems brings obvious benefits. Although developers are aware of these benefits, they still write only very minimal unit tests. In order to find the reasons for this, we conducted a small-scale study of Computer Science students at ETH Zurich, asking them various questions about their unit-test-writing habits. Their degree of experience in writing software varied widely, from 6 months to 9 years. So did the number of software projects that they had worked on until then, from 1 to 5. 45% of the students said that they never wrote a unit test case before the implementation and 36% do so only very seldom. After implementation, only 18% of the students always write unit tests; 54% do so very often. We also asked the students to rank (on a scale from 1 to 5, where 1 represents total disagreement and 5 total agreement) the causes that prevent them from writing unit tests. The causes and associated average ranks were:

• Writing unit tests takes too much time: 4.4

• It takes too much effort to maintain the unit tests: 3.6

• Writing unit tests is too much effort for the provided benefits: 3.2

• It takes too much time to run the unit tests: 2.1

• Unit tests are not useful: 1.8

The results of our study show that the time and effort involved in writing and maintaining unit tests are the most frequent causes of the developers' dislike of unit testing, as also indicated by other studies. Various research groups have tried to tackle these problems from different standpoints. Many approaches try to take the burden off the developers' and testers' shoulders by promoting test automation as a solution. Some fully or almost fully automated testing tools are already available; among them are DSD-Crasher [6], DART [8], FindBugs [9], AutoTest [12], Jartege [14], Eclat [19], Symstra [23], JTest [1], Agitator [4], and Java PathFinder used to generate test inputs [22]. All these tools have the potential of being a great support to developers and testers, but they lack the insights into the semantics of the tested applications that a human has and hence are very likely to miss bugs that a human would find easily. This paper presents an approach which aims at filling this gap that automated testing leaves open: while our tool also does not get in the way of the developer, it leverages actions that the developers perform anyway as part of the process of writing software.

Other approaches also rely on this principle of least interference in established development processes. This is the case for the Agitator [4] tool (currently called AgitarOne). While this tool achieves a high degree of automation, it also allows the developer to improve the testing process by interacting with the tool in a way which only slightly affects his workflow.

Yet other research directions try to shift this burden of writing tests onto other activities. For instance, several research groups have investigated test generation from test specifications [3], [17] and from formal software specifications [13]. By introducing parameterized unit tests, Tillmann et al. [21] simplify the problem of automated input data synthesis and allow the developer to write more expressive oracles.

Several recent publications discuss the relationship between Design by Contract and Agile methods: Ostroff et al. [18] highlight the complementary nature of the techniques and Feldmann [7] shows the interplay of contracts and refactorings.

The relationship between contracts and test driven development is of particular interest: developing with contracts has the advantage of making use of a practical, lightweight, and executable form of specification that is able to express more about the intended semantics of the program than a finite number of test cases can. However, test cases are automatically executable; hence, they can be used for instant and continuous verification. Programs equipped with contracts do not have this property, because they lack concrete instances satisfying the preconditions of the methods under test. While the preconditions implicitly specify all valid inputs, automatically finding actual instances that fulfill these preconditions requires a constraint solver. Solving the preconditions present in real software systems is beyond the capabilities of today's constraint solvers. The automated test case extraction described in this paper is able to provide this missing part and hence unify the advantages of
contracts (thorough specification) and test cases (concrete and automatically verifiable).

With the contract driven development method, the developer is freed from the task of writing explicit test cases, but required to provide contracts instead. Contracts do provide benefits besides the ability to extract test cases: contracts allow for more precision during design, they serve as documentation throughout the lifecycle by more clearly specifying the semantics of interfaces, and they increase chances of detecting failures closer to their source.

3. A USE CASE

To give the intuition behind our proposed method and the tool that implements it (Cdd), this section presents an example of practical use of the tool during the development of an application.

Our approach builds on the work on continuous testing [20], by trying to not affect the developer's workflow, but rather to exploit actions that the developer performs as part of the normal development process. An additional advantage of this approach is that it creates unit test suites for those developers that did not intend to write unit tests (due to time constraints or other reasons). While such test suites might not be complete, they are likely to provide a good foundation for regression testing, since every test case from the suite proved to fail at least once during the development cycle.

Let us consider an application written in Eiffel providing the means to deposit and withdraw money from a bank account. Listing 1 shows the main class of this application and Figure 1 shows a screenshot of the running application. In addition to class BANK_ACCOUNT (shown in Listing 1), which implements the main business concept of the software system, the actual application also contains:

• Class MAIN_WINDOW, which represents a GUI window showing the current account balance and allowing the user to enter an amount that he can then deposit or withdraw.

• Class INTERFACE_NAMES, which contains some global application constants.

• Class APPLICATION, which serves as the entry point of the application. It creates a bank account and a main window, passes the account to the main window, displays the main window, and starts the event loop.

    class BANK_ACCOUNT
    inherit
        ANY
            redefine
                default_create
            end
    feature
        default_create
            do
                balance := 300
            end

        balance: INTEGER

        deposit (an_amount: INTEGER)
            do
            ensure
                balance_increased: balance > old balance
                deposited: balance = old balance + an_amount
            end

        withdraw (an_amount: INTEGER)
            do
                balance := balance - an_amount
            ensure
                balance_decreased: balance < old balance
                withdrawn: balance = old balance + an_amount
            end
        ...
    invariant
        balance_not_negativ: balance >= 0
    end

Listing 1: Class BANK_ACCOUNT

Figure 1: Screenshot of the bank account application

Note that this example represents an application currently under development; it contains both incorrect and unfinished code. However, the developer tests the application by launching it as it is. He starts out with an empty unit test suite. He runs the application from within the debugger of his IDE (with Cdd support installed) and enters through the GUI that he wants to deposit 30 EUR. The GUI invokes the deposit method of class BANK_ACCOUNT, which throws a postcondition error and stops the application. As usual, the debugger indicates the line of the violation, the current stack frame, and the values of the variables in scope. Without the Cdd extraction mechanism, the developer would have to fix the bug immediately or the failure information would be lost.

When the Cdd extension is installed, the test case extractor becomes active when a failure is observed and extracts, saves, and runs a test case in the background automatically. This test case is added to the previously empty test suite, which now looks as shown in Figure 2(a). The extracted test case referenced in this window is depicted in Listing 2. The actual test case does not provide an oracle, since the postcondition takes over this responsibility.

The reason why this failure occurs is that there is no implementation yet for method deposit. This is very much in the spirit of test driven development, where a test case (here only the contract) is written before the implementation.

As this example shows, the developer has to provide the inputs triggering the postcondition violation only once; then Cdd:

• Automatically extracts a test case.

• Minimizes it to that part of the application relevant for the failure.

• Frees it from external state (the GUI) and non-determinism (the user input).

Similarly, had the user circumvented the GUI programmatically and hard-coded somewhere an invocation of deposit, Cdd would have extracted the same test case.

    class TEST_CASE_1
    feature
        test
            local
                ba: BANK_ACCOUNT
            do
                ba := new_object ("BANK_ACCOUNT")
                set_field (ba, "balance", 300)
                check_invariant (ba)
                ba.deposit (30)
            end
    end

Listing 2: Class TEST_CASE_1

    class TEST_CASE_2
    feature
        test
            local
                ba: BANK_ACCOUNT
            do
                ba := new_object ("BANK_ACCOUNT")
                set_field (ba, "balance", 300)
                check_invariant (ba)
                ba.withdraw (20)
            end
    end

Listing 3: Class TEST_CASE_2

Since the failure is now reproducible via a test case, the developer no longer has to fix the fault immediately. He can go on and test another aspect of the application. For example, he can try to withdraw something from the bank account. If he does that, the debugger will once again stop the application and signal a fault in the postcondition. In this case the method was implemented correctly, but the postcondition contains an error. Cdd again automatically extracts a test case (depicted in Listing 3) for this failure. Then it adds this test case to the test suite, and compiles and runs it in the background. The test case status window is hence updated to show two failing test cases (Figure 2(b)).

Suppose the developer now adds a correct implementation for method deposit and fixes the postcondition of method withdraw. Since Cdd employs continuous testing [20], the test cases are flagged with PASS automatically, as shown in the test case status window (Figure 2(c)).

4. MODEL

This section explains how test cases are extracted from failures. The first part of the section introduces the notions of trace, failure, and failure-recipient. The second part explains how test cases are represented in this model, how their oracle works, and how test cases can be executed and extracted from failures. The section concludes with an overview of possible applications to debugging.

Note that throughout the section each notion is presented through the use of mathematical functions that return the needed information, in order to keep notions as simple and language-agnostic as possible.

4.1 Traces

The test case extraction process is based on abstractions of traces of programs. Traces are what the developer produces when running the program in the debugger of the IDE.

The trace abstraction is based on a called_by tree that captures what called what. Every node in the tree is an instruction invocation, i.e. the invocation of an instruction at a given point in time during the execution that produced the trace at which we are looking. An instruction invocation is a pair of an instruction and the context in which it was executed. The called_by-tree only knows the following three kinds of instructions:

• Object creation (including data allocation and constructor execution).

• Method call.

• Delegate call.

The purpose of the tree is to enable test case abstraction and not to model all details of a trace. This is why only the above three kinds of instructions need to be looked at.

An example graph can be seen in Figure 3. This graph is based on a trace of the bank account example introduced in Section 3.

Listing 4 provides some more details of the bank account application introduced previously. We use this example throughout this section. Class APPLICATION contains the main event loop (which we consider to be the application's entry point for our discussion) in method event_loop. This method is responsible for calling the corresponding subscribers for each observed event. The method uses the Eiffel agent mechanism (which is similar in intent to the C# delegate mechanism), where each event keeps a list of its subscribers. A subscriber is just another method (that must have been registered before with the event). The event_loop method consists of two nested loops: the outer loop is executed once for every event, while the inner loop iterates over the subscribers of each event and calls these subscribers. Methods deposit_amount and withdraw_amount respectively subscribe to the events associated to the two buttons of the application (deposit and withdraw, as seen in Figure 1). These methods in turn read the amount entered via a text entry box and then call the corresponding method from class BANK_ACCOUNT.

The context of an instruction invocation contains the program state in which the instruction has to be executed, plus the bindings of the instruction. For example, a method call requires one target object and one object per argument. Let I be the set of instructions and C the set of contexts; then the set of instruction invocations is I × C. An instruction invocation is said to be well formed if its context is such that the instruction can be executed from it without any syntax or typing errors:

\[ \mathit{wellformed} : I \times C \to \mathbb{B} \]

Every instruction invocation executes on a target object. In the case of a method invocation, the target can either be explicit (qualified call) or implicit (unqualified call). In the case of a creation instruction the object being created serves as target object. Let O be the set of objects. Then the signature of target is:
\[ \mathit{target} : C \to O \]

The edges of the called_by-tree indicate which instruction invocation was called by which other instruction invocation. Let <i1, c1> and <i2, c2> be two invocations. Then <i2, c2> is called by <i1, c1> if and only if i1, while executing in context c1, directly (i.e. not indirectly via some other instruction invocation) invokes i2 in c2:

\[ \mathit{called\_by} : I \times C \to I \times C \]

Figure 2: Test cases extracted from failures (screenshot of the test case window): (a) after one failure, (b) after two failures, (c) after fixes applied.

    class APPLICATION
    feature
        ...
        event_loop
            do
                ...
                from
                until
                    should_quit
                loop
                    wait_for_event
                    from
                        ev.subscribers.start
                    until
                        ev.subscribers.after
                    loop
                        ev.subscribers.item.call
                        ev.subscribers.forth
                    end
                end
            end
        ...
    end

    class MAIN_WINDOW
    feature
        ...
        amount: TEXT_FIELD
        account: BANK_ACCOUNT

        deposit_amount
            do
                account.deposit (amount.to_integer)
            end

        withdraw_amount
            do
                account.withdraw (amount.to_integer)
            end
        ...
    end

Listing 4: Partial source for application

Figure 3: called_by-tree. Each node i,c stands for the instruction invocation <i, c>; edges denote called_by. Root: event_loop,c0. Second level: wait_for_event,c1; start,c2; after,c3; call,c4; call,c8; forth,c12. Third level: deposit_amount,c5; withdraw_amount,c9. Fourth level: deposit,c6; to_integer,c7; withdraw,c10; to_integer,c11.

Any node in the tree can potentially trigger a failure, i.e. the execution of the instruction invocation directly triggers a failure. Failures occur due to a contract violation, a method call on a void target, operating system signals, or other kinds of exceptions. Each programming language will have its own set of causes. For Eiffel the list is given in the Eiffel ECMA standard, Section 8.26.1.

In the presence of contracts every failure not only has an origin (the instruction invocation that immediately triggered the failure), but also a recipient. Intuitively the recipient is the method responsible for the failure. In most cases the recipient and origin are the same instruction invocation. For Eiffel the recipient is defined in the ECMA standard, Section 8.26.10. The semantics of recipient is extended to not only mean the receiving method, but also its context. With F denoting the set of failures, the signature of recipient becomes:

\[ \mathit{recipient} : F \to I \times C \]
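The trace model of this section can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not part of Cdd (which targets Eiffel): the names Invocation, callers, Failure, and blamed are hypothetical, contexts are reduced to the labels of Figure 3 rather than real program states, and the parent/child edges assume the depth-first numbering suggested by that figure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Invocation:
    """An instruction invocation <i, c>; the context is only a label here."""
    instruction: str
    context: str
    caller: Optional["Invocation"] = None  # the called_by edge
    callees: List["Invocation"] = field(default_factory=list)

    def call(self, instruction: str, context: str) -> "Invocation":
        """Record that this invocation directly invokes <instruction, context>."""
        child = Invocation(instruction, context, caller=self)
        self.callees.append(child)
        return child


def called_by(inv: Invocation) -> Optional[Invocation]:
    """called_by : I x C -> I x C (None for the root of the tree)."""
    return inv.caller


def callers(inv: Invocation) -> List[Invocation]:
    """Transitive callers of inv, nearest first."""
    chain = []
    current = called_by(inv)
    while current is not None:
        chain.append(current)
        current = called_by(current)
    return chain


@dataclass(eq=False)
class Failure:
    """A failure with its origin and its recipient (hypothetical field names)."""
    origin: Invocation
    blamed: Invocation


def recipient(failure: Failure) -> Invocation:
    """recipient : F -> I x C."""
    return failure.blamed


# Deposit branch of the tree in Figure 3 (edges are an assumption, see above).
event_loop = Invocation("event_loop", "c0")
event_loop.call("wait_for_event", "c1")
agent_call = event_loop.call("call", "c4")
deposit_amount = agent_call.call("deposit_amount", "c5")
deposit = deposit_amount.call("deposit", "c6")
deposit_amount.call("to_integer", "c7")

# The postcondition of deposit fails: origin and recipient coincide here.
failure = Failure(origin=deposit, blamed=deposit)
caller_names = [inv.instruction for inv in callers(recipient(failure))]
print(caller_names)  # ['deposit_amount', 'call', 'event_loop']
```

Storing the caller on each node keeps called_by a constant-time lookup, and the chain returned by callers corresponds to the methods that are still executing when the failure is raised.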
4.2 Test Cases

In the present work, a test case is a particular (and hence deterministic) invocation of an instruction and the corresponding contracts (which serve as oracles).

At first, the notion of an invocation as a test case might seem too restrictive. Tools from the xUnit family (jUnit, nUnit, pyUnit, Gobo Eiffel Test, VSUnit, etc.) share the convention of having test methods contained in test classes. Test cases often consist of many instructions involving control flow, object creation, method invocation, and assert instructions. Traditional test methods must be created argument-less and deterministic. The developer has to provide the corresponding set-up and arguments for the element under test, turning to mock objects when the set-up becomes too complicated. The instructions used here are perfectly capable of representing such test cases: for example, as a method call that first creates the test object, invokes the set-up method, and finally invokes the test method.

Conventionally, the oracle for unit tests is provided by certain library calls or special keywords (e.g. assert). Similar to many recent approaches, the present work relies on the presence of embedded and executable specification as oracle instead. Such a specification subsumes the traditional approach, as the library calls or keywords providing the oracle in traditional unit tests are easily integrated with a contracted oracle [10]. In addition, inspection points can also be in the middle of the test case: the contracts are interleaved into the entire program, and not just present at the level of the test case method.

Let <i, c> be a test case. It is executed in the following way:

1. Recreate context c.

2. Check invariant of c (if violated → invalid test case).

3. Check precondition of i in c (if violated → invalid test case).

4. Run instruction i in c (if normal termination → pass, otherwise → fail).

More formally the oracle of a test case can be defined in the following way:

\[ \mathit{tcvalid} : I \times C \to \mathbb{B} \]
\[ \mathit{tcvalid}(i, c) \triangleq \mathit{wellformed}(i, c) \land \mathit{inv}(i, c) \land \mathit{pre}(i, c) \]

\[ \mathit{tcpassing} : I \times C \to \mathbb{B} \]
\[ \mathit{tcpassing}(i, c) \triangleq \mathit{tcvalid}(i, c) \land \mathit{n\text{-}terminates}(i, c) \land \mathit{post}(i, c) \]

\[ \mathit{tcfailing} : I \times C \to \mathbb{B} \]
\[ \mathit{tcfailing}(i, c) \triangleq \mathit{tcvalid}(i, c) \land \lnot(\mathit{n\text{-}terminates}(i, c) \land \mathit{post}(i, c)) \]

The precondition of an instruction is defined as the result of the evaluation of the precondition of the instruction. For method and delegate calls this is equivalent to checking the precondition of the called method. With object creation it is equivalent to checking the precondition of the invoked constructor.

The postcondition of an instruction is defined as the result of the evaluation of the postcondition of the instruction. This is equivalent, similarly to the precondition, to the evaluation of the postcondition of the called method or constructor. Note that the postcondition is not total in this case, since it will not be evaluated if the method does not terminate.

The predicate n-terminates is true if and only if the method terminates normally. In the case of Eiffel (and Spec#) an abnormal termination (such as a null pointer dereference, division by zero, operating system signal, etc.) does not guarantee any postcondition, which is the case described by the formalism above.

In JML there is a pair of pre- and postconditions for normal termination and separate pairs for different kinds of abnormal termination. The above formalism can be easily adapted to this case. Similarly to the way one big pre- and postcondition pair is formed for theorem proving of JML-annotated programs, a big pre- and postcondition pair can be used for the oracle predicates above.

It might be confusing at first that the invariant above is applied not only to the context, but to the whole instruction invocation. At first sight, one might be tempted to require the invariant to hold for all objects in the scope. However, there is a need to temporarily break the invariant in order to allow for object state to change. The exact way in which this is implemented depends on the contract-enabled language. In Eiffel the following rule fulfills this purpose: the object which is the target of the currently executing method is allowed to have its invariant temporarily violated. A method can trigger the execution of another method. Consequently, more than one object at any given point in time can have a violated invariant.

It is therefore incorrect to check the invariant of all objects in the scope. Runtime assertion monitoring is typically implemented so that the invariant is checked at the beginning and end of each method execution, also for performance reasons. This approach is not applicable for our setting either, since an invariant breach due to a method call in the middle of a method call operates not on the initial heap, but on a potentially modified one. Hence it is not clear whether the invariant violation was caused by the instruction invocation or was part of the original context. This distinction is important. For example, a test case might be extracted at first with a context containing a set of objects that satisfy invariants, but then, as a result of changes in the program, the invariant of a class is strengthened and the extracted context may contain objects that do not satisfy the new invariant.

Neither complete invariant checking nor the checking employed by traditional assertion monitoring is appropriate. Instead, the scope of the invariant check must be broadened to include the information of the executing instructions and their called_by information. A first intuition is to check the invariant of all those objects in the scope, except those which are target to any of the methods currently executing (e.g. the targets of the instructions in the transitive closure of the called_by relation to the current invocation). In order to express this, the notion of the target set of an instruction invocation is handy. It is the set of all objects serving as target to any of the currently executing methods:
$$\mathit{target\_set} : I \times C \times O \to \mathcal{P}(O)$$

$$\mathit{target\_set}(i, c) \triangleq \{\, target(c') \mid \langle i', c' \rangle \ \mathit{called\_by}^{*} \ \langle i, c \rangle \,\}$$

However, requiring all objects not in the target set to have a valid invariant is overly protective for the purpose of test case execution. The context might perfectly well contain objects with broken invariants that are not needed for the execution of an instruction invocation. In such a case (inflicted due to natural program evolution) one should not be required to throw away the test case. We hence use a notion of necessary objects of a program invocation:

$$necessary : I \times C \times O \to \mathbb{B}$$

Based on these notions, the final definition of the invariant check is:

$$inv : I \times C \to \mathbb{B}$$

$$inv(i, c) \triangleq \forall o \in O \mid (necessary(i, c, o) \land \lnot(o \in \mathit{target\_set}(i, c))) \implies inv_{obj}(o)$$

where $inv_{obj}$ is the invariant of an object as defined in the class's contracts.

Given these definitions, extracting a test case from a failure becomes very simple:

$$testcase_{failure} : F \to I \times C$$

$$testcase_{failure}(f) \triangleq recipient(f)$$

4.3 From Test Cases to Debugging

Failure test suites and the fault lifecycle. One of the advantages of automatic test case extraction is that a developer observing a failure has the choice of fixing the fault either immediately or later, since the failure is automatically reproducible. Furthermore, Cdd can also provide a benefit that extends into the process of fixing the fault. Most non-trivial faults need to be fixed in several places, not just in one, and Cdd-style test case extraction can provide guidance for this process. For example, an observed failure stemming from a null pointer dereference in a method m can be fixed either by changing the implementation of m so that the null case is treated via a special path, or by strengthening the precondition of m to exclude the case in which the reference is null to begin with. Note that, if the developer chooses this latter fix, the calling site of m now violates m's precondition. The extracted test case does not reveal this, since it only starts with the execution of m. This is just one of several failure evolution scenarios, all of which can be coped with by extracting not just one test case per failure but a whole test suite, that is, one test case per method on the call stack:

$$testsuite_{failure} : F \to \mathcal{P}(I \times C)$$

$$testsuite_{failure}(f) \triangleq \{\, tc(i', c') \mid recipient(f) \ \mathit{called\_by}^{*} \ \langle i', c' \rangle \,\}$$

Extracting unit test suites from system level tests. Often developers are more inclined to write system level tests (i.e. black box tests that exercise the whole program) rather than unit level tests (i.e. tests that exercise one method at a time). The reason is that a few system level tests can achieve a reasonably high coverage and hence require less effort for creation and maintenance. However, modern development practices rely on instant feedback to the developer.

With the presented test case extraction mechanism it is possible to extract short running unit level tests automatically from existing system level tests in just one step. When the IDE signals that the developer is about to change a certain module (in the sense of class or package), relevant system level tests can be executed automatically and for each method invocation of the targeted module a test case is extracted. This is achieved by generalizing the test suite notion from above from failures to instruction invocations:

$$testsuite_{invoc} : I \times C \to \mathcal{P}(I \times C)$$

$$testsuite_{invoc}(i, c) \triangleq \{\, tc(i', c') \mid \langle i, c \rangle \ \mathit{called\_by}^{*} \ \langle i', c' \rangle \,\}$$

While the developer changes the module, he gets instant feedback about whether he broke anything, without the added overhead of unit test suite maintenance.

Deep invariant checks for traditional debugging. The traditional approach to runtime monitoring of invariants (checking the invariant at method entry and exit) is a compromise between performance and correctness. It captures many invariant violations, but methods accidentally violating the invariant of objects other than the target object can lead to an infected state that is not discovered at the time of the infection. The deep invariant check proposed in Section 4.2 can be used in such cases, to selectively check the invariant in situations where one suspects an infected state.

5. IMPLEMENTATION

As described in Section 2, the main motivation of the model developed in Section 4 is to extract test cases in a completely automated way while developers program applications. The approach fundamentally relies on the presence of contracts. We targeted the Eiffel language with our implementation because contracts are first-class citizens in this language and practitioners using the language are known to provide contracts in real world settings. This provides a good setting in which the tool can be validated.

Our implementation is a modified version of EiffelStudio², the predominant IDE for Eiffel development. The resulting tool, Cdd, supports Contract Driven Development as described in this paper. The prototype³ is available for download in both binary and source form under an open source license. A screenshot of EiffelStudio integrating Cdd can be seen in Figure 4.

² http://www.eiffel.com
³ http://eiffelsoftware.origo.ethz.ch/index.php/CddBranch/

5.1 Using Cdd

Test case extraction. Cdd tightly integrates with the debugger of EiffelStudio in order to extract test cases during the regular development cycle of an application. When the

developer runs an application and an exception is raised, the debugger stops the execution and shows the developer the source code line where the exception was raised, the current call stack, and the content of the variables in scope. In addition to that, the test case extractor of Cdd becomes active and tries to extract a test case that is able to reproduce the current failure. First the test case extractor determines which method to extract a test case for. This is often (but not always) the method that raised the exception. As described in Section 4, Cdd chooses the method receiving the failure as the method under test.

Figure 4: Screenshot of Cdd-integration in the EiffelStudio (IDE)

The extractor proceeds to extract a snapshot of the state that is required to invoke the method under test. The target object and all method arguments and their transitive reference closure are serialized. This is an efficient over-approximation of the set of necessary objects. The result is a test case as can be seen in Listings 2 and 3. The current implementation of Cdd extracts a test case only for the failure-receiving method. To improve flexibility, future releases will extract one test case for each method of the current call stack as described in Section 4.3.

The Cdd implementation poses no runtime overhead during debugging, since the extractor becomes active only at the time of an observed failure. At this point the debugger stops the application anyway, so extraction takes place when the application is not running. The application under test is not instrumented or altered in any way. This has the clear advantage of interfering as little with the developer's working habits as possible.

It is not possible to know a priori that a given call will fail, so Cdd extracts the state after the failure. Instrumenting every method to capture its prestate has prohibitive performance overhead. In most cases, extracting the state after the failure is sufficient in order to obtain a test case that exhibits the same error. In some cases though, it is not sufficient and the only possibility is to replay the program with a pre-state capture. Recent advances in capture and replay [16, 15] promise finer grained control and much better performance. We are currently working on integrating such a selective capture replay mechanism into Cdd.

It should be noted that the extracted test cases are implicitly minimized, due to their nature. Execution consists of setting up the context and executing the method under test. Such test cases are only feasible in the presence of contracts, which serve as oracle. Preconditions also help in identifying when the context goes invalid due to changes in the program.

Test case visualization. The extracted test cases are displayed in a tree where they are grouped by the class and the method that they are testing (see Figure 2). The developer can choose to see all test cases or only failing ones. It is also possible to disable the background extraction and execution of test cases, as well as select an individual test case for debugging (the latter is described in more detail below). Each test case node displays the status of its last execution and the assertion violation raised, if any. The developer can use each test case node to navigate to the source code of the test case or to the receiving method.

Test case execution. Whenever the IDE finds a compilable system, all extracted test cases for that system get compiled and executed. For each test case, Cdd first recreates the context, then checks the invariant, and finally invokes the method under test with the created context. The current version of Cdd only checks the invariant of the object under test (the thorough invariant check described in Section 4.2 is work in progress). If the invariant was found to be violated the test case is flagged as invalid and this test case is not executed. Otherwise the method under test is executed with the recreated target object and arguments.

During execution, assertion monitoring is enabled and violated assertions are reported in the form of exceptions back to the test case executor. If the precondition of the method under test has been violated, Cdd flags the test case again as invalid and does not execute it further. There is an important point to note here: only the violation of the outermost precondition flags an invalid test case; a precondition evaluated as part of a method call triggered directly or indirectly from the method under test flags a failing test case instead. This is also true if the method under test is recursive. All exceptions (e.g. invariants, postconditions, preconditions, check instructions, segmentation faults, division by zero, etc.) other than the outermost precondition and invariant check signal a failing test case.

Background execution of test cases allows the developer to always see the latest state of the test cases. However, in addition to this, the developer may want to more clearly understand why a particular test case fails. For this case Cdd allows him to execute a test case in the regular debugger. When doing this on a failing test case, the debugger will automatically stop at the exception being raised and thus allow the developer to inspect the concrete values. Additionally the developer can set breakpoints and thus step through the test case (including the method under test) line by line and inspect the state at method entry point and how this state evolves.

5.2 Architecture

Cdd is implemented as a modification to the EiffelStudio IDE. The implementation consists of approximately 50 new classes totaling around 6000 lines of code. The addition is relatively small compared to EiffelStudio itself (1400 classes

and 2 million lines of code) and, to keep maintenance efforts reasonable, we kept the number of existing classes that we modified to a minimum. The classes of the extension can be roughly divided into groups achieving the following:

• Model (internal representation of test cases)

• State and code extraction using the interface of the debugger

• Test case serialization

• Compilation and execution of the serialized test cases

• Test harness (simple unit testing framework)

• Visualization and user interface

• Example code

Figure 5: Architecture of Cdd implementation (components shown: System under Test, Debugger, State Extractor, Test Case Serializer, Compiler & Executor, Display)

Figure 5 shows the basic control flow. The debugger is in control over the system under test (the application that the developer is working on). When a failure occurs, the state extractor retrieves information from the debugger about the current state of the application. The test case serializer saves this state into a compilable unit test case. The resulting test case is then compiled and executed and finally the results are displayed.

6. FUTURE WORK

Our current implementation does not cope with concurrent applications. The SCOOP [2] mechanism extends the semantics of contracts to the concurrent case. A contract equipped language supporting SCOOP will allow us to handle concurrent applications with minimal modifications to our implementation.

The recently introduced selective capture and replay [16, 15] mechanism promises much increased performance of both the capture and replay phase. It is based on the idea that a program is divided into two parts, an observed part and an external part. Instead of capturing state or state changes, all incoming and outgoing events are recorded. Incoming and outgoing data only needs to be recorded in a shallow fashion. For replaying a run, the replay-harness is able to replace the external part completely, freeing the application from all dependencies that this part introduces. We are currently working on a selective capture and replay implementation for Eiffel, which will provide significant benefits in the following areas:

Prestate extraction. We currently extract the state at the time of the failure, which can result in the wrong state being captured in some cases. A solution would be a posteriori user-guided extraction, but this would require manual intervention. Given the right border, selective capture and replay makes it possible to capture all executions by default, while inflicting minimal performance overhead. Replays can then be run completely automatically in the background, which would remove the need for the user's intervention.

Non-determinism. While most programming languages do not provide a source for non-determinism directly, programmers can typically use the foreign function interface for external inputs (user input, network, database, etc.), and this can be considered a cause of non-determinism within the program. With selective capture and replay it is possible to put all sources of non-determinism into the external part, which makes replays completely deterministic. Even if a failure-producing trace was dependent on certain GUI inputs, network connections, or database state, the replay is completely freed from these dependencies.

External state. The foreign function interface also introduces references to outside data (e.g. window handles, file handles, pointers to partially untyped C values) and this data cannot be reflected. Such state is never directly manipulated within the program; external code is invoked instead. The implicit state minimization of selective capture and replay removes this dependency on the external state (again given that all external code is part of the unobserved code).

7. CONCLUSIONS

This article explains the fundamentals of the Contract Driven Development approach. A tool autonomously observes the developer while he is working on a program and extracts test cases from failures either provoked by the developer (in the spirit of test driven development) or by mistake (leading to a regression test). The approach is novel in that complete test cases are extracted not only from the information provided by the system under test, but also from non-permanent clues given by the programmer during development.

The approach is introduced by a case study and explained via a language agnostic model, applicable to arbitrary contracted code. The extracted test cases are fast executing, small, and stripped of many dependencies. To aid the debugging process, failure test suites can be extracted and give the developer guidance during multi-step correction cycles. As a corollary to test case extraction based on failures, we show how the approach can be used to automatically extract fast executing unit tests from slow executing system level tests.

The Cdd tool implements the idea of contract driven development. Cdd is integrated into EiffelStudio, a major Eiffel IDE. Developers can use the tool without changing their development process, since the approach is completely non-intrusive. Cdd offers the advantages of automatic test case extraction from actions that developers perform anyway while writing source code, and thus builds up a comprehensive unit test suite and offers support for functional testing, debugging, and regression testing.
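As an illustration of the verdict rules described in Section 5.1 (invalid, failing, passing), the following is a minimal sketch in Python rather than Eiffel. It is not the Cdd implementation: the names `Verdict`, `ContractViolation`, `ExtractedTestCase`, `execute`, and the `Account` example are all hypothetical, and contracts are modelled as plain Python predicates. The sketch assumes, as in the text, that only a broken invariant of the object under test or a violated outermost precondition invalidates a test case, while any exception raised during the call marks it as failing.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Tuple


class Verdict(Enum):
    INVALID = "invalid"   # stored context no longer satisfies the contracts
    FAILING = "failing"   # the method under test raised an exception
    PASSING = "passing"   # the method under test completed normally


class ContractViolation(Exception):
    """Hypothetical stand-in for an assertion violation reported at run time."""


@dataclass
class ExtractedTestCase:
    target: Any                        # recreated target object
    args: Tuple[Any, ...]              # recreated arguments
    method: Callable[..., Any]         # method under test: method(target, *args)
    invariant: Callable[[Any], bool]   # invariant of the object under test
    precondition: Callable[..., bool]  # precondition of the method under test


def execute(tc: ExtractedTestCase) -> Verdict:
    # A broken invariant on the object under test means the context is stale.
    if not tc.invariant(tc.target):
        return Verdict.INVALID
    # Only the outermost precondition can invalidate a test case.
    if not tc.precondition(tc.target, *tc.args):
        return Verdict.INVALID
    try:
        tc.method(tc.target, *tc.args)
    except Exception:
        # Anything raised from here on, including a precondition violated
        # by a nested call, counts as a failure.
        return Verdict.FAILING
    return Verdict.PASSING


class Account:
    def __init__(self, balance: int) -> None:
        self.balance = balance


def withdraw_pre(account: Account, amount: int) -> bool:
    return 0 < amount <= account.balance


def withdraw(account: Account, amount: int) -> None:
    if not withdraw_pre(account, amount):
        raise ContractViolation("precondition of withdraw")
    account.balance -= amount


def drain(account: Account) -> None:
    # Seeded fault: asks for one unit more than is available.
    withdraw(account, account.balance + 1)


def make(balance: int, method, args, pre) -> ExtractedTestCase:
    # Helper building a test case around a fresh Account (illustrative only).
    return ExtractedTestCase(Account(balance), args, method,
                             lambda a: a.balance >= 0, pre)
```

In this sketch, `execute(make(10, withdraw, (30,), withdraw_pre))` is classified as invalid because the outermost precondition does not hold, whereas `execute(make(10, drain, (), lambda a: True))` is classified as failing because the violated precondition belongs to a nested call to `withdraw`.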

8. REFERENCES

[1] Jtest. Parasoft Corporation. http://www.parasoft.com/.

[2] Arslan, V., Eugster, P., Nienaltowski, P., and Vaucouleur, S. SCOOP - concurrency made easy. Dependable Systems: Software, Computing, Networks (2006).

[3] Balcer, M., Hasling, W., and Ostrand, T. Automatic generation of test scripts from formal test specifications. In TAV3: Proceedings of the ACM SIGSOFT '89 third symposium on Software testing, analysis, and verification (New York, NY, USA, 1989), ACM Press, pp. 210-218.

[4] Boshernitsan, M., Doong, R., and Savoia, A. From daikon to agitator: lessons and challenges in building a commercial tool for developer testing. In ISSTA '06: Proceedings of the 2006 international symposium on Software testing and analysis (New York, NY, USA, 2006), ACM Press, pp. 169-180.

[5] Boyapati, C., Khurshid, S., and Marinov, D. Korat: automated testing based on java predicates. In Proceedings of the 2002 ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2002), Rome, Italy (2002).

[6] Csallner, C., and Smaragdakis, Y. Dsd-crasher: A hybrid analysis tool for bug finding. In International Symposium on Software Testing and Analysis (ISSTA) (July 2006), pp. 245-254.

[7] Feldman, Y. A. Extreme design by contract. In Fourth Int'l Conf. Extreme Programming and Agile Processes in Software Engineering (XP 2003) (2003), Springer Verlag, pp. 261-270.

[8] Godefroid, P., Klarlund, N., and Sen, K. Dart: directed automated random testing. In PLDI '05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation (New York, NY, USA, 2005), ACM Press, pp. 213-223.

[9] Hovemeyer, D., and Pugh, W. Finding bugs is easy. SIGPLAN Not. 39, 12 (2004), 92-106.

[10] Leitner, A., Ciupa, I., Meyer, B., and Howard, M. Reconciling manual and automated testing: the autotest experience. In Proceedings of the 40th Hawaii International Conference on System Sciences - 2007, Software Technology (January 3-6 2007).

[11] Marinov, D., and Khurshid, S. TestEra: A novel framework for automated testing of Java programs. In Proc. 16th IEEE International Conference on Automated Software Engineering (ASE) (2001), pp. 22-34.

[12] Meyer, B., Ciupa, I., Leitner, A., and Liu, L. L. Automatic testing of object-oriented software. In Proceedings of SOFSEM 2007 (Current Trends in Theory and Practice of Computer Science) (2007), J. van Leeuwen, Ed., Lecture Notes in Computer Science, Springer-Verlag.

[13] Offutt, A. J., Xiong, Y., and Liu, S. Criteria for generating specification-based tests. In Proceedings of the Fifth IEEE International Conference on Engineering of Complex Computer Systems (ICECCS '99) (October 1999), pp. 119-131.

[14] Oriat, C. Jartege: a tool for random generation of unit tests for java classes. Tech. Rep. RR-1069-I, Centre National de la Recherche Scientifique, Institut National Polytechnique de Grenoble, Université Joseph Fourier Grenoble I, June 2004.

[15] Orso, A., Joshi, S., Burger, M., and Zeller, A. Isolating relevant component interactions with jinsi.

[16] Orso, A., and Kennedy, B. Selective Capture and Replay of Program Executions. In Proceedings of the Third International ICSE Workshop on Dynamic Analysis (WODA 2005) (St. Louis, MO, USA, May 2005), pp. 29-35.

[17] Ostrand, T. J., and Balcer, M. J. The category-partition method for specifying and generating functional tests. Commun. ACM 31, 6 (1988), 676-686.

[18] Ostroff, J. S., Makalsky, D., and Paige, R. F. Agile specification-driven development. In XP (2004), J. Eckstein and H. Baumeister, Eds., vol. 3092 of Lecture Notes in Computer Science, Springer, pp. 104-112.

[19] Pacheco, C., and Ernst, M. D. Eclat: Automatic generation and classification of test inputs. In ECOOP 2005 - Object-Oriented Programming, 19th European Conference (Glasgow, Scotland, July 25-29, 2005).

[20] Saff, D., and Ernst, M. D. An experimental evaluation of continuous testing during development. In ISSTA 2004, Proceedings of the 2004 International Symposium on Software Testing and Analysis (Boston, MA, USA, July 12-14, 2004), pp. 76-85.

[21] Tillmann, N., and Schulte, W. Parameterized unit tests with unit meister, 2005.

[22] Visser, W., Pasareanu, C. S., and Khurshid, S. Test input generation with java pathfinder. In ISSTA '04: Proceedings of the 2004 ACM SIGSOFT international symposium on Software testing and analysis (New York, NY, USA, 2004), ACM Press, pp. 97-107.

[23] Xie, T., Marinov, D., Schulte, W., and Notkin, D. Symstra: A framework for generating object-oriented unit tests using symbolic execution. In Proceedings of the 11th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS 05) (April 2005), pp. 365-381.

8.1 Acknowledgements

We thank Jocelyn Fiat and Manu Stapf for their support and insights on the compiler and IDE. We are especially grateful to Andreas Zeller and Martin Burger with whom we had many discussions that helped shape this work. We thank Bernd Schoeller for many fruitful discussions.
