Integration Testing: Rogerio Paulo, January 12, 2007
Rogerio Paulo
This document was written as a dissertation to complete the module CS339 Advanced Topics in Computer Science:
Testing at the University of Wales Swansea. The reference material is drawn mainly from the book Software Testing:
A Craftsman's Approach by Paul C. Jorgensen, referred to here as [Jor02].
Testing plays a major role in the software development cycle. Software development firms generally
invest a considerable amount of resources in testing to make sure that the software they develop meets the specification and
the standards expected by the user or customer. Problems that arise after release can cost the company
revenue and reputation or, worse, cause environmental damage, injuries and deaths.
In this document we discuss one of the testing phases, integration testing, and its different strategies, namely
decomposition-based integration, call graph-based integration and path-based integration.
We will see that each strategy offers several approaches that share the same main idea but differ in their
requirements and consequences. Each approach is accompanied by a case study in which a more realistic example is
shown and discussed.
The document closes with a conclusion based on my research and personal experience.
Table of Contents
Introduction
1 Decomposition-based Integration
1.1 Big bang Integration
1.2 Top-down Integration
1.3 Bottom-up Integration
1.4 Sandwich Integration
2 Call Graph-based Integration
2.1 Pair-wise Integration
2.2 Neighbourhood Integration
3 Path-based Integration
3.1 MM-Path based Integration
Conclusion
Bibliography
Introduction
In the classic software development model, V-Model (Figure 1), integration testing is a software testing phase located
before system testing and after unit testing. Integration testing has the goal of proving that the features developed work
together well enough for the project to be submitted for system testing.
In this model, unit testing is performed mainly by programmers and system testing mainly by testers.
Integration testing gives the programmers (who know the code well) the opportunity to work with
the testers (who have technical expertise in testing), bridging unit testing and system testing and so making the
transition smoother. Generally, the larger and more complex the project, the more important integration
testing becomes.
As integration testing is performed after unit testing, we will assume that all units have been tested separately. How
the modules of the system are grouped is determined by the strategy and the approach. It is also important to
note that these strategies and approaches are targeted at traditional procedural programming languages.
Different strategies are needed because projects differ in their requirements and available resources. For instance, in
very small systems it may be acceptable to group all the modules and test them in a single phase; for larger
systems this is impractical. The reasons for these problems, and their solutions, are discussed in later chapters.
We will first introduce the example used throughout this document, the NextDate program. Then we will
discuss the different approaches to integration testing and how each is applied to our example:
Chapter 1 covers decomposition-based integration. We will see the four main approaches in this
strategy (big bang, top-down, bottom-up and sandwich), their advantages and disadvantages, and examples
of their application.
Chapter 2 examines call graph-based integration and the improvements it makes over decomposition-based
integration, through the approaches known as pair-wise and neighbourhood. Again, their strong and weak
points are discussed with concrete examples.
Chapter 3 discusses path-based integration, mainly an approach known as
MM-Path based integration, with its advantages and disadvantages. A concrete example is used to explain how
it is applied.
The NextDate Program
This example is taken directly from [Jor02]; the presentation of the pseudo-code has been modified, but the
functional and structural integrity of the program is kept. The program uses three variables: month, day and year.
Given a date as input, it returns the date that follows it. It has the following characteristics:
Checks for valid input:
    Values must be positive integers
    A month must satisfy: 1 ≤ month ≤ 12
    A day must satisfy: 1 ≤ day ≤ 31
    A year must satisfy: 1812 ≤ year ≤ 2012
Takes leap years into consideration
input : A date
output : Next date of the day inputted
Type Date: Integer month,Integer day,Integer year
Date : today,tomorrow,aDate
1 Main integrationNextDate
2 getDate(today); /* message 1 */
3 printDate(today); /* message 2 */
4 tomorrow = incrementDate(today); /* message 3 */
5 printDate(tomorrow); /* message 4 */
6 End Main
Function(Main) integrationNextDate(date)
We want to keep the main part simple, so it only calls functions defined elsewhere. We get the date from the
user with getDate, print it with printDate, calculate the next date with incrementDate and finally print
the result with printDate again.
Note also that Date is a new datatype made up of three integer variables: month, day and year. We
declare today, tomorrow and aDate as variables of type Date.
input : A year
output: True if the year is a leap year, otherwise False
7 Function Boolean isLeap(year)
8     if year is divisible by 4 then
9         if year is NOT divisible by 100 then isLeap = True
10        else if year is divisible by 400 then isLeap = True
11        else isLeap = False
12    else isLeap = False
13 End (Function isLeap)
Function isLeap(year)
The function isLeap checks whether the input year is a leap year. We first check whether the year is divisible
by 4. If it is not, it cannot be a leap year. If it is, we
check whether it is a century year (divisible by 100). If it is not, the year is a leap year. Otherwise we check the last
condition, whether the year is a multiple of 400: if it is, it is a leap year, otherwise it is not.
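The nested checks described above can be sketched in Python. This is only an illustrative translation of the pseudo-code; the name `is_leap` is an assumption and not part of [Jor02].

```python
def is_leap(year: int) -> bool:
    """Leap-year check mirroring the nested conditions of isLeap."""
    if year % 4 != 0:
        return False            # not divisible by 4: never a leap year
    if year % 100 != 0:
        return True             # divisible by 4 and not a century year
    return year % 400 == 0      # century years must be multiples of 400


for y in (1900, 1996, 1999, 2000):
    print(y, is_leap(y))
```

The century years 1900 and 2000 exercise the last two branches, which is why they reappear later as test cases.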
In lastDayOfMonth we calculate the upper bound on the day number for a given month. This is needed
because some months have 30 days and others 31, with February as a special case that has 28 days in a standard year
and 29 days in a leap year.
This function is handled mainly by switch-case statements. The first case states that the months
January, March, May, July, August, October and December have 31 days. The second case states that April,
June, September and November have 30 days. Finally, February has 29 days in a leap
year and 28 days otherwise.
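A minimal Python sketch of the three cases just described is given below. It is written from the prose description rather than from the original listing; the names `last_day_of_month` and `is_leap`, and the `ValueError` for an out-of-range month, are assumptions made for the illustration.

```python
def is_leap(year: int) -> bool:
    # Compact form of the leap-year rule used by the NextDate example.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def last_day_of_month(month: int, year: int) -> int:
    """Return the number of days in the given month of the given year."""
    if month in (1, 3, 5, 7, 8, 10, 12):    # first case: 31-day months
        return 31
    if month in (4, 6, 9, 11):              # second case: 30-day months
        return 30
    if month == 2:                          # February depends on the year
        return 29 if is_leap(year) else 28
    # Assumption: reject months outside 1..12 instead of returning a value.
    raise ValueError(f"month out of range: {month}")
```

Only the February case sends a message to `is_leap`, which is the single call edge between the two modules.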
input : A date
output: True if the date is valid, otherwise False
24 Function Boolean validDate(aDate)
25     if (aDate.Month > 0) AND (aDate.Month <= 12) then monthOK = True
26     else monthOK = False
27     if monthOK then
28         if (aDate.Day > 0) AND (aDate.Day <= lastDayOfMonth(aDate.Month,aDate.Year)) then dayOK = True; /* message 6 */
29         else dayOK = False
30     endif
31     if (aDate.Year > 1811) AND (aDate.Year <= 2012) then yearOK = True
32     else yearOK = False
33     if monthOK AND dayOK AND yearOK then validDate = True
34     else validDate = False
35 End (Function validDate)
Function validDate(aDate)
Certain conditions must be met for a date to be valid, so our program needs a part that
makes sure these conditions are met, especially when the date comes from user input.
In validDate we check these conditions separately:
Month - We return true if the value is between 1 and 12 (inclusive); otherwise we return false.
Day - We return true if the value is between 1 and the last day of the month (inclusive); the last day of
the month is obtained from lastDayOfMonth. If this condition is not met, we return false.
Year - Here we place an artificial boundary on our program. We only allow years between
1812 and 2012 (inclusive), even though other values would also be valid. Again, we return true if
this condition is met and false otherwise.
After checking all three variables (month, day and year), we consider the date valid only if all three
conditions are met (all of them returned true); otherwise we reject the date.
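The three checks can be sketched in Python as follows. The `Date` tuple and the function names are assumptions chosen to mirror the pseudo-code. Unlike the listing, this sketch treats the day as invalid whenever the month is invalid, so that no flag is left unset.

```python
from typing import NamedTuple


class Date(NamedTuple):
    month: int
    day: int
    year: int


def last_day_of_month(month: int, year: int) -> int:
    # Helper assumed to behave like lastDayOfMonth for months 1..12.
    if month == 2:
        leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
        return 29 if leap else 28
    return 30 if month in (4, 6, 9, 11) else 31


def valid_date(d: Date) -> bool:
    month_ok = 1 <= d.month <= 12
    # The day can only be checked against a valid month.
    day_ok = month_ok and 1 <= d.day <= last_day_of_month(d.month, d.year)
    year_ok = 1812 <= d.year <= 2012     # artificial boundary of the example
    return month_ok and day_ok and year_ok
```

For example, `valid_date(Date(2, 29, 1999))` is rejected by the day check, while `valid_date(Date(1, 1, 1811))` is rejected by the year check.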
With getDate we retrieve and store the date entered by the user. We request each input by
printing the name of the desired variable to the screen using Output. After each request, we read the value entered by the user
using Input and store it in the corresponding variable. These
statements are inside a repeat block, so if the user enters an invalid date the program keeps requesting a new
date until a valid one is entered.
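The repeat-until-valid behaviour can be sketched in Python as below. This is a hedged illustration, not the original listing: the prompt texts, the `read` parameter (which lets a test supply scripted answers instead of keyboard input) and the compact `valid_date` helper are all assumptions. Non-numeric input is not handled in this sketch.

```python
from typing import Callable, NamedTuple


class Date(NamedTuple):
    month: int
    day: int
    year: int


def valid_date(d: Date) -> bool:
    # Compact stand-in for validDate (months 1..12, years 1812..2012).
    if not (1 <= d.month <= 12 and 1812 <= d.year <= 2012):
        return False
    leap = d.year % 4 == 0 and (d.year % 100 != 0 or d.year % 400 == 0)
    if d.month == 2:
        last = 29 if leap else 28
    else:
        last = 30 if d.month in (4, 6, 9, 11) else 31
    return 1 <= d.day <= last


def get_date(read: Callable[[str], str] = input) -> Date:
    """Keep asking for month, day and year until the date is valid."""
    while True:
        month = int(read("Enter a month: "))
        day = int(read("Enter a day: "))
        year = int(read("Enter a year: "))
        candidate = Date(month, day, year)
        if valid_date(candidate):
            return candidate


# Scripted session: 2/30/2000 is rejected, then 2/29/2000 is accepted.
answers = iter(["2", "30", "2000", "2", "29", "2000"])
print(get_date(lambda prompt: next(answers)))
```

Passing the reader as a parameter is a design choice for testability; calling `get_date()` with no argument falls back to the built-in `input`.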
input : A date
output: The date incremented by one day
49 Function Date incrementDate(aDate)
50     if aDate.Day < lastDayOfMonth(aDate.Month,aDate.Year) then aDate.Day = aDate.Day + 1; /* message 8 */
51     else
52         aDate.Day = 1
53         if aDate.Month = 12 then
54             aDate.Month = 1
55             aDate.Year = aDate.Year + 1
56         else aDate.Month = aDate.Month + 1
57     endif
58 End (Function incrementDate)
Function incrementDate(aDate)
With incrementDate we implement the main feature of this program: given a date as input, we output the date that
follows it. To achieve that, we first use lastDayOfMonth to check whether the day of the input date is the
last day of its month. If it is not, we can simply increase the day by one. If it is, we
need to change more variables.
First, we set the day to 1, because the successor of the last day of a month is the first day of the next month.
Then we check whether the month is the last month of the year, 12 (December). If it is, we set the
month to 1 (January) and increment the year, which gives the first day of a new year. Otherwise we simply
increment the month by 1.
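The three outcomes (next day, next month, next year) can be sketched in Python as follows. The names are assumptions mirroring the pseudo-code, and the sketch returns a new `Date` rather than modifying its argument.

```python
from typing import NamedTuple


class Date(NamedTuple):
    month: int
    day: int
    year: int


def last_day_of_month(month: int, year: int) -> int:
    # Helper assumed to behave like lastDayOfMonth for months 1..12.
    if month == 2:
        leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
        return 29 if leap else 28
    return 30 if month in (4, 6, 9, 11) else 31


def increment_date(d: Date) -> Date:
    if d.day < last_day_of_month(d.month, d.year):
        return d._replace(day=d.day + 1)     # ordinary day: just add one
    if d.month == 12:
        return Date(1, 1, d.year + 1)        # 31 December: roll over the year
    return Date(d.month + 1, 1, d.year)      # end of month: roll over the month


print(increment_date(Date(12, 31, 1999)))
print(increment_date(Date(2, 28, 2000)))
print(increment_date(Date(2, 28, 1999)))
```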
input : A date
output: The date as a string
59 Procedure String printDate(aDate)
60     Output("Day is ", aDate.Month, "/", aDate.Day, "/", aDate.Year)
61 End (Procedure printDate)
Procedure printDate(aDate)
With printDate we simply convert a value of the datatype Date to a string, so that it can be
printed on the screen and combined with additional strings or information.
Chapter 1
Decomposition-based Integration
In this strategy, we decompose the system according to its functional characteristics. A functional characteristic
is defined by what the module does, that is, the actions or activities performed by the module. Our
main goal in this strategy is to test the interfaces among separately tested units.
There are four approaches for this strategy:
Big bang integration
Top-down integration
Bottom-up integration
Sandwich integration
Briefly, big bang groups the whole system and tests it in a single test phase. Top-down starts at the root of the tree
and works gradually towards the lower levels. Bottom-up mirrors top-down: it starts at the lowest-level modules of
the system and works towards the main program. Sandwich is an approach that combines top-down and bottom-up.
Note that many sources present decomposition-based integration as the main, if not the only, strategy
for integration testing; this is not true. In later chapters we will see strategies that are not decomposition-based
but still allow us to test the integration phase systematically.
Figure 1.1: Big bang integration, coverage of a test session.
[Diagram nodes: Main; isLeap, lastDayOfMonth, validDate, getDate, incrementDate, printDate]
1.1.2 Summary
In summary, big bang integration has the following characteristics:
Considers the whole system as a subsystem
Tests all the modules in a single test session
Only one integration testing session
Advantages:
    Low resource requirements
    Does not require extra coding
Disadvantages:
    Not systematic
    Hard to locate problems
    Hard to create test cases
1.2 Top-down Integration
In top-down integration, we start with the target node at the root of the functional decomposition tree and work towards the
leaves. Figure 1.3 shows different sessions of integration testing. Stubs (throw-away pieces of code that emulate a called unit) are used to replace the child nodes attached
to the target node. A test phase consists of replacing one of the stub modules with the real code and testing the resulting
subsystem. If no problem is encountered, we move on to the next test phase. Once all the children have been replaced by real code
at least once and meet the requirements, we move down to the next level, where the modules already
tested at the higher level keep their real code and the integration testing continues.
This testing approach finishes when the whole system has been covered in this way.
Top-down integration gives us a prototype of the system under test (SUT) at an early stage. This is a significant advantage in
customer-driven software development. Companies often incur large development costs because of
a bad design that is only identified late in the development cycle, forcing the team to restart
at the design phase. With an early SUT prototype, many design issues can be identified and rectified early.
Consequently, top-down integration lets us interleave design and implementation.
Top-down integration has the drawback of requiring stubs. While stubs are simpler than the real code, they are not
straightforward to write: the emulation must be complete and realistic, that is, test cases run against the
stub should give the same results as on the real code. As throw-away code, a stub neither reaches the final product nor
adds functionality to the software, so it is extra programming effort without a direct reward.
Another issue with starting at the higher level is that we are forced to abstract both the application layer and the hardware
layer. For systems that require strict hardware compatibility, such as embedded systems, this can create
serious issues at later stages.
Finally, top-down integration starts at the higher level of the software, where the expected outputs are more complex,
which also makes test cases harder to create.
For top-down integration the number of integration testing sessions is $n_{\text{nodes}} - n_{\text{leaves}} + n_{\text{edges}}$.
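As a small worked illustration of the session count (nodes minus leaves plus edges), the sketch below evaluates it for a functional decomposition tree stored as a dictionary from each node to its children. The tree used here is hypothetical and is not the nextDate tree; the function name is an assumption.

```python
from typing import Dict, List


def top_down_sessions(tree: Dict[str, List[str]]) -> int:
    """Count sessions as nodes - leaves + edges for a decomposition tree."""
    nodes = len(tree)
    leaves = sum(1 for children in tree.values() if not children)
    edges = sum(len(children) for children in tree.values())
    return nodes - leaves + edges


# Hypothetical tree: A has children B and C; B has children D and E.
example_tree = {"A": ["B", "C"], "B": ["D", "E"], "C": [], "D": [], "E": []}
print(top_down_sessions(example_tree))   # 5 nodes - 3 leaves + 4 edges
```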
1.2.1 Case Study
With our example program nextDate, this integration approach is not very useful, firstly because it is a very
simple system (so the functional decomposition tree does not have many levels) and secondly because the main program
does not have many features or a complex user interface.
In top-down integration, we would start with Main as the target node, with its child nodes replaced by stubs, and
substitute the real code for one stub in each test session. Each stub must be built so that it returns correct values to the real
module and is compatible with the test cases. A possible stub for incrementDate could be:
input : A date
output: The day following the input date
Date : d12311999, d02282000, d02281999, next
Function Date incrementDate(aDate)
    d12311999 = Date(12,31,1999)
    d02282000 = Date(02,28,2000)
    d02281999 = Date(02,28,1999)
    if aDate == d12311999 then next = Date(01,01,2000)
    else if aDate == d02282000 then next = Date(02,29,2000)
    else if aDate == d02281999 then next = Date(03,01,1999)
    incrementDate = next
End (Function incrementDate)
Function incrementDate(aDate) (stub)
In a way, the test cases are limited by how and what we code in the stub. This also means that we limit
the area we must look at when problems arise. For instance, if both incrementDate and getDate are replaced by stubs
but the output does not match the expected results, then the problem is most likely in printDate.
Another thing to note about this approach in this example is that there are empty test sessions. The functions
isLeap, lastDayOfMonth and validDate are never called by Main directly, so a top-down approach cannot isolate them with
stubs and no test cases are created in these sessions (Figure 1.4).
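To make the role of a stub concrete, here is a hedged Python sketch of a top-down session in which Main runs with a stubbed incrementDate. The names, the `main` signature (the date is passed in rather than read from the user) and the canned table are assumptions for illustration only.

```python
from typing import Callable, Dict, List, NamedTuple


class Date(NamedTuple):
    month: int
    day: int
    year: int


# Canned answers: the stub only knows the dates planned as test cases.
_CANNED: Dict[Date, Date] = {
    Date(12, 31, 1999): Date(1, 1, 2000),
    Date(2, 28, 2000): Date(2, 29, 2000),
    Date(2, 28, 1999): Date(3, 1, 1999),
}


def increment_date_stub(d: Date) -> Date:
    """Stub for incrementDate: looks the answer up instead of computing it."""
    return _CANNED[d]    # raises KeyError for dates outside the test plan


def format_date(d: Date) -> str:
    return f"{d.month}/{d.day}/{d.year}"


def main(today: Date, increment: Callable[[Date], Date]) -> List[str]:
    """Main under test; the incrementDate implementation is injected."""
    return [format_date(today), format_date(increment(today))]


print(main(Date(2, 28, 2000), increment_date_stub))
```

Because the stub answers only for the three canned dates, any test case outside that table fails inside the stub, which is the limitation described above.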
[Diagram legend: node with real code: Main; nodes that will be replaced by stubs: isLeap, lastDayOfMonth, validDate, getDate, incrementDate, printDate]
1.2.2 Summary
In summary, top-down integration has the following characteristics:
Integration starts at the main program
Moves from the higher level modules to the lower level modules
Has $n_{\text{nodes}} - n_{\text{leaves}} + n_{\text{edges}}$ integration testing sessions
Advantages:
    Early SUT prototype
    Interleaves design and implementation
Disadvantages:
    Throw-away code programming
    Late interaction tests between the main program, the application layer and the hardware
    Difficult to create test cases
input : None (hard-coded test cases)
output: True if the result matches the expected one, otherwise False
Function Integer lastDayOfMonth_driver()
    Output("Pass 1900: ", (isLeap(1900) == False))
    Output("Pass 1999: ", (isLeap(1999) == False))
    Output("Pass 2000: ", (isLeap(2000) == True))
End (Function lastDayOfMonth_driver)
Function lastDayOfMonth_driver
If some results do not match the expected ones, we know that isLeap is most likely
the module causing the problem. For example, if during the test session the values returned for the years 1900
and 2000 do not correspond to the expected results, we would check the code that handles century years in the
module isLeap.
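The same driver idea can be sketched in Python: a throw-away caller that takes the place of lastDayOfMonth and feeds hard-coded years to the real isLeap. The function names and the dictionary of expected values are assumptions for this illustration.

```python
from typing import Dict


def is_leap(year: int) -> bool:
    # Real unit under test.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def is_leap_driver() -> Dict[int, bool]:
    """Driver standing in for lastDayOfMonth: calls isLeap with fixed years."""
    expected = {1900: False, 1999: False, 2000: True}
    return {year: is_leap(year) == want for year, want in expected.items()}


for year, passed in is_leap_driver().items():
    print(f"Pass {year}: {passed}")
```

A failing entry for 1900 or 2000 would point at the century-year branch, as described above.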
[Diagram nodes: isLeap, lastDayOfMonth, validDate, getDate, incrementDate, printDate]
1.3.2 Summary
In summary, bottom-up integration has the following characteristics:
Advantages: Disadvantages:
Sandwich integration uses a mixed approach in which we use stubs at the higher levels of the tree and drivers at
the lower levels (Figure 1.7). Testing starts from both ends of the tree and converges towards the centre, hence the
term sandwich. This allows us to test the top and bottom layers in parallel and decreases the number of stubs
and drivers required in integration testing.
While this approach decreases the amount of throw-away code required, it does not eliminate it. Because it
combines both approaches and starts at both ends of the tree, we are in fact doing a smaller version of big bang
inside our functional decomposition tree. Consequently, it is also more difficult to isolate problems.
For sandwich integration, the number of integration testing sessions varies, but the maximum number of sessions
is the number of subtrees in the functional decomposition tree.
[Diagram nodes: isLeap, lastDayOfMonth, validDate, getDate, incrementDate, printDate]
1.4.2 Summary
In summary, sandwich integration has the following characteristics:
Combines top-down approach and bottom-up approach
Generally, higher level modules use a top-down approach (stub)
Normally, lower level modules use a bottom-up approach (driver)
Testing converges to the middle
Number of integration sessions can vary
The maximum number of sessions is the number of subtrees of the system
Advantages:
    Top and bottom layers can be done in parallel
    Fewer stubs and drivers needed
    Easy to construct test cases
    Better coverage control
Disadvantages:
    Still requires throw-away code programming
    Partial big bang integration
    Hard to isolate problems
Chapter 2
Call Graph-based Integration
Call graph-based integration is an improvement over the approaches based on functional decomposition. It
moves in the direction of structural testing: instead of a functional decomposition
tree, the system is represented as a directed graph in which the nodes are the modules and
the edges represent function invocations.
There are two main approaches in call graph-based integration:
Pair-wise integration
Neighbourhood integration
In pair-wise integration, we restrict a test session to a single pair of modules, whereas in neighbourhood integration
we group the modules around the target node into a subsystem to be tested.
Figure 2.1: Pair-wise integration, some pairs that are used for different testing sessions.
2.1.1 Case Study
By moving to a call graph-based strategy, we eliminate the problem of empty test sessions,
because the edges represent the actual calls and functional dependencies between the modules.
To use pair-wise integration in this example, we pair up the modules according to the edges of the call graph
(Figure 2.2), for instance lastDayOfMonth with isLeap and Main with printDate. Each of these pairs then
constitutes a subsystem for a test session. With this call graph, there are seven test sessions because there are seven
edges.
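The pairing can be sketched in Python by listing the edges of the call graph. The adjacency list below is an assumption read off the calls in the pseudo-code and the pairs named in the text; it yields the seven edges mentioned above.

```python
from typing import Dict, List, Tuple

# Call graph of nextDate: caller -> modules it calls (assumed from the listings).
CALL_GRAPH: Dict[str, List[str]] = {
    "Main": ["getDate", "printDate", "incrementDate"],
    "getDate": ["validDate"],
    "validDate": ["lastDayOfMonth"],
    "incrementDate": ["lastDayOfMonth"],
    "lastDayOfMonth": ["isLeap"],
    "isLeap": [],
    "printDate": [],
}


def pairwise_sessions(graph: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """One test session per edge (caller, callee) of the call graph."""
    return [(caller, callee)
            for caller, callees in graph.items()
            for callee in callees]


for caller, callee in pairwise_sessions(CALL_GRAPH):
    print(f"session: {caller} + {callee}")
```

Note that Main calls printDate twice but this counts as a single edge, so it contributes one session.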
Figure 2.2: Two pairs used in pairwise integration of the nextDate program.
2.1.2 Summary
In summary, pair-wise integration has the following characteristics:
Each test session is restricted to only a pair of modules
Module pairing is based on the edges of the call graph
The number of integration testing sessions is the number of edges
Advantages:
    Stub and driver need is eliminated
    Use of real code
Disadvantages:
    Many test sessions
2.2 Neighbourhood Integration
While pair-wise integration eliminates the need for stubs and drivers, it still requires many test sessions.
Neighbourhood integration attempts to improve on pair-wise integration by requiring fewer of them.
In neighbourhood integration, we create a subsystem for a test session by taking a target node and grouping all the
nodes near it (Figure 2.3). A node is near the target node if it is linked to it as an immediate predecessor
or successor.
By doing this we can considerably reduce the number of test sessions required. Notice that the predecessors
and successors correspond to the modules that are replaced by stubs and drivers in the functional decomposition
tree. This means that we are doing something similar to sandwich integration: a smaller version of big bang integration
within the tree. Consequently, it becomes difficult to isolate faults.
For neighbourhood integration the number of integration testing sessions is $n_{\text{nodes}} - n_{\text{sink nodes}}$.
As we can see, we group the modules so that all the immediate successors and predecessors of the
target node are within the same subsystem. Each of these subsystems is used for a separate test session. Figure 2.4
shows the neighbourhood of the module lastDayOfMonth. While this approach decreases the number of test sessions
needed, it also makes it harder to isolate faults.
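A short Python sketch of the grouping follows. It builds one neighbourhood per node that is not a sink, so the number of sessions equals nodes minus sink nodes. The call graph is the same assumed adjacency list as for pair-wise integration, and the function name is hypothetical.

```python
from typing import Dict, List, Set

# Assumed call graph of nextDate: caller -> modules it calls.
CALL_GRAPH: Dict[str, List[str]] = {
    "Main": ["getDate", "printDate", "incrementDate"],
    "getDate": ["validDate"],
    "validDate": ["lastDayOfMonth"],
    "incrementDate": ["lastDayOfMonth"],
    "lastDayOfMonth": ["isLeap"],
    "isLeap": [],
    "printDate": [],
}


def neighbourhoods(graph: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each non-sink node to itself plus its immediate neighbours."""
    result: Dict[str, Set[str]] = {}
    for node, successors in graph.items():
        if not successors:
            continue                      # sink nodes get no session of their own
        predecessors = {p for p, callees in graph.items() if node in callees}
        result[node] = {node} | set(successors) | predecessors
    return result


for target, members in neighbourhoods(CALL_GRAPH).items():
    print(target, sorted(members))
```

With this graph there are seven nodes and two sink nodes (isLeap and printDate), giving five neighbourhoods instead of seven pairs.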
2.2.2 Summary
In summary, neighbourhood integration has the following characteristics:
Modules are grouped as neighbourhoods
A neighbour module is a module that is the immediate successor or predecessor of another unit
The number of integration testing sessions is $n_{\text{nodes}} - n_{\text{sink nodes}}$
Advantages:
    Stub and driver need is eliminated
    Use of real code
    Reduction of test sessions
Disadvantages:
    Hard to isolate faults
Chapter 3
Path-based Integration
By moving to path-based integration we approach integration testing from a new direction. In decomposition-based
testing we use a functional approach and in call graph-based testing we use a structural approach. In path-based
integration we try to combine the structural and functional approaches. Instead of testing the interfaces
(which are structural), we test the interactions (which are behavioural).
When a unit is executed, a certain path of source statements is traversed. When this unit calls source statements
from another unit, control is passed from the calling unit to the called unit. For integration testing we treat these
unit calls as an exit followed by an entry.
The main path-based integration approach that we will discuss is:
MM-Path¹ based integration
In MM-Path based integration we track all the module execution paths² and messages³ used in the system. The
modules traversed by such a path are then used as a subsystem to be tested.
Note that path-based integration introduces two new types of nodes:
Source node⁴
Sink node⁵
and two kinds of quiescence:
Message quiescence
Data quiescence
¹ Method/Message-Path: an interleaved sequence of module execution paths and messages.
² Module Execution Path: a sequence of statements that begins with a source node and ends with a sink node.
³ Message: a programming language mechanism by which one unit transfers control to another unit.
⁴ Source Node: a statement fragment at which program execution begins or resumes.
⁵ Sink Node: a statement fragment at which program execution terminates.
Message quiescence occurs when we arrive at a unit that sends no messages. Data quiescence is reached
when a sequence of executions terminates with the creation of data that is stored but not used immediately.
With these definitions, we can perform integration testing that combines the structural and functional
approaches, allowing a closer coupling with the actual system behaviour. This avoids the drawbacks of a purely structural
approach and gives a smoother transition from integration testing to system testing (where the use
of behavioural threads is desired).
However, MM-Path based integration requires extra effort to identify the MM-Paths, which may be
compensated for by the elimination of stubs and drivers.
For MM-Path based integration the number of integration testing sessions depends on the system in question.
Main(1,2)
message1
getDate(36,37,38,39,40,41,42,43,44,45,46,47)
message7
validDate(24,25,26,27,28)
message6
lastDayOfMonth(14,15,16,23); /* Point of quiescence */
validDate(28,30,31,33,35)
getDate(48)
Main(3)
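The MM-Path listed above can be held as plain data, from which the subsystem to be tested (the modules the path traverses) follows directly. The sketch below is only an illustration; the representation as (module, statement numbers) pairs and the function name are assumptions.

```python
from typing import List, Tuple

MMPath = List[Tuple[str, List[int]]]

# The MM-Path above: each entry is a module execution path segment.
MM_PATH: MMPath = [
    ("Main", [1, 2]),
    ("getDate", list(range(36, 48))),        # statements 36..47
    ("validDate", [24, 25, 26, 27, 28]),
    ("lastDayOfMonth", [14, 15, 16, 23]),    # point of quiescence
    ("validDate", [28, 30, 31, 33, 35]),
    ("getDate", [48]),
    ("Main", [3]),
]


def modules_traversed(path: MMPath) -> List[str]:
    """Modules visited by the path, in order of first appearance."""
    seen: List[str] = []
    for module, _statements in path:
        if module not in seen:
            seen.append(module)
    return seen


print(modules_traversed(MM_PATH))
```

For this path the subsystem consists of Main, getDate, validDate and lastDayOfMonth; isLeap, incrementDate and printDate are not reached.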
A main problem with this approach is knowing how many MM-Paths are required to complete the integration
test. The set of MM-Paths should traverse all source-to-sink paths.
A large (or even infinite) number of paths caused by loops can be reduced by using condensation graphs⁶,
which are directed acyclic graphs⁷.
⁶ Condensation Graph: a graph CG = (CV, CE) based on the graph G = (V, E) where each vertex in CV corresponds to a strongly connected
component in G and edge (u, v) is in CE if and only if there exists an edge in E connecting any of the vertices in the component of u to any of the
vertices in the component of v.
⁷ Directed Acyclic Graph: a directed graph with no directed cycles.
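The definition of a condensation graph can be illustrated with a short Python sketch. It finds strongly connected components by mutual reachability, which is simple but not the most efficient method, and it assumes every vertex appears as a key of the adjacency dictionary. Edges inside a component are dropped so that the result is acyclic. The example graph is hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Graph = Dict[str, List[str]]
Component = FrozenSet[str]


def reachable(graph: Graph, start: str) -> Set[str]:
    """All vertices reachable from start (including start itself)."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def condensation(graph: Graph) -> Tuple[Set[Component], Set[Tuple[Component, Component]]]:
    """Return (CV, CE): components and the edges between distinct components."""
    reach = {v: reachable(graph, v) for v in graph}
    # Two vertices share a component when each can reach the other.
    comp = {v: frozenset(u for u in graph if u in reach[v] and v in reach[u])
            for v in graph}
    cv = set(comp.values())
    ce = {(comp[u], comp[v]) for u in graph for v in graph[u] if comp[u] != comp[v]}
    return cv, ce


# Hypothetical graph with a loop between b and c.
example = {"a": ["b"], "b": ["c"], "c": ["b", "d"], "d": []}
components, edges = condensation(example)
print(sorted(sorted(c) for c in components))
```

The loop between b and c collapses into one vertex, so the condensed graph has a single path from the component of a to the component of d.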
3.1.2 Summary
In summary, MM-Path integration has the following characteristics:
Messages sent between modules are tracked
The set of MM-Paths should cover all source-to-sink paths
Points of quiescence are natural endpoints for an MM-Path
The number of integration testing sessions depends on the system in question
Advantages:
    Hybrid of functional and structural testing
    Closely coupled with actual system behaviour
    Does not require stub or driver
Disadvantages:
    Extra effort required to identify the MM-Paths
Conclusion
It is important to note again that all the strategies and approaches mentioned are targeted at traditional procedural
programming languages. Still, the foundation and many basic concepts can be applied to object-oriented programming
languages with some changes.
For instance, in many situations we have assumed that the units behave correctly independently of the
other units of the same system. This assumption holds less often in object-oriented applications than in
procedural ones because the units are tightly coupled, which makes the use of stubs and drivers more frequent.
It is also important to remember that, at first sight, integration testing and system testing both seem to test the system
as a whole (all units are present and complete), but their goals are different. In integration testing we are more concerned
with finding faults that appear when the units are integrated, while in system testing we are more concerned with demonstrating the
performance of the system.
With all these testing strategies and approaches we are able to test different types of systems more effectively and
more systematically. For each system there may be an optimal testing strategy and approach.
However, due to the demands of the software development industry and limiting factors such as lack of time, capital and
technical expertise, such choices are not always possible or as complete as desired.
Yet I believe that we should not treat testing as something to be avoided or done merely for the completeness of the software
development cycle. In fact, it is one of the most important stages and more effort should be put into it. Generally,
the loss caused by a deployed faulty system is greater than the investment needed for more complete
testing to avoid such problems. This is especially true in critical systems where the damage is irreversible, such as loss of
human lives and environmental damage.
Bibliography
[Jor02] Paul C. Jorgensen. Software Testing: A Craftsman's Approach. CRC Press, second edition, 2002.
[KFN99] Cem Kaner, Jack Falk, and Hung Quoc Nguyen. Testing Computer Software. John Wiley & Sons Inc,
second edition, 1999.
[Pez05] Mauro Pezze. Quality control of softwares. Chapter 18: Integration and Component-based Software Testing,
2005.
[Sch06] Holger Schlingloff, 2006. Handouts for Advanced Topics in Computer Science: Testing.