Software Reliability Measurement
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 8, NO. 2, FEBRUARY 1990
Abstract—There has been a good deal of interest recently in the emerging discipline of software reliability engineering. This interest can be attributed to the growing amount that software reliability contributes to the overall reliability of products being built. This paper describes the measurement and analysis aspects of software reliability and is intended to give software engineers and managers a feeling for where and how software reliability measurement can be applied to their projects. It first provides some background for understanding software reliability measurement, and then discusses activities associated with measuring and analyzing software reliability in the context of the software product life cycle.

INTRODUCTION AND BACKGROUND
"What is software reliability?" In the past, reliability has been equated to hardware reliability. However, the amount that software failures contribute to overall product reliability has been increasing. For example, a survey of customers of one AT&T operations support system product indicated that as many as 4 of 5 failures at customer sites were software related. In addition, the ratio of software to hardware failures is growing to the point where software failures are beginning to predominate. One study summarizing digital switch reported problems between 1984 and 1988 indicated the number of software and hardware problems being nearly equal. Yet a more recent update of such reported problems, covering the period March 1988 to March 1989, shows the proportion of software reported problems increasing to two and one-half times the number of hardware reported problems.¹

The intent of this paper is to describe the measurement and analysis aspects of software reliability engineering. It is intended for those software engineers and managers who may be interested in such questions as: "What are the key notions of software reliability? How can the measurement and analysis of software reliability fit into my product development? What can I expect from such measurement and analysis?" The focus of this paper is on developing an appreciation of where and how measurement and analysis of software reliability fit into product development. The specifics of measurement and analysis techniques, which would require much more space and time to cover, are left to other references such as [4] and [5, ch. 5].

The remainder of this paper is divided into two sections. The first section provides some background on software reliability measurement. First, the notions of failures and faults are contrasted. Then some basic software reliability concepts are described. A brief description of […] measurement and analysis is to specify what the customer needs in terms of reliability before a software product is built, to validate that these needs are met before delivery of the product to the customer, and to track and monitor that the customer's needs continue to be met after delivery.
Manuscript received May 22, 1989; revised September 22, 1989.
The author is with AT&T Bell Laboratories, Middletown, NJ 07748.
IEEE Log Number 8932099.
¹Some license is taken here in using reported problems as a measure of the number of failures (not all failures result in reported problems). However, the intent is to illustrate the increasing importance of software reliability.

Failure versus Fault

Keeping the distinction between failures and faults is important in applying software reliability. Since software reliability is integrally associated with software failures, any discussion of software reliability must start with a definition of software failure. A formal definition of a software failure is a departure of the operation of a software system from its requirements. Generally, the requirements are assumed to be explicitly stated in a Requirements Specification. However, in applying software reliability, the notion of requirements may be extended to cover anything relating to the customer's satisfaction with the product. An example of a software failure might be simply the absence of a field in a generated output report. A more serious software failure might result in the system being completely inoperable but recoverable by reinitializing the system software.

On the other hand, a fault is an underlying defect in the software.
Examples of a fault might simply be an uninitialized variable or an incorrectly coded program statement. Other examples of faults would be an incorrect implementation of a design or requirement specification.

Software failures are an external manifestation of the presence of a fault. However, there is not necessarily a one-to-one correspondence between fault and failure. A fault may result in many failures (if the fault is not removed after the first failure is encountered). Also, different faults may cause failures to occur at different rates. On the other hand, a fault may cause no failure at all. This would be the situation if the customer uses the software product in a way that the fault is never encountered under the right conditions to cause a failure.

Customers are not concerned per se with how many faults there are in a software product. They are concerned with how often the software will fail for their intended use and how costly each failure will be to them.

Measures related to faults and fault density (i.e., the number of faults per thousand lines) have been and will continue to be an important in-process metric for software development. In fact, striving to reduce the number of faults introduced in software during a particular stage of development and seeking ways of isolating and removing faults as software moves from development stage to stage can go a long way in improving its reliability. However, to measure reliability of the software from the customer's perspective requires measures related to failures and the rates at which failures occur.

Some Basic Concepts

There are two important concepts in measuring software reliability. First, the rate at which software fails is a function of execution time. Execution time is a measure of software processing time or CPU time and is generally expressed in terms of CPU-seconds or CPU-hours. To understand this relationship between failures and execution time, consider an extreme situation where a software product is loaded on a computer system but is not processing. No matter how much elapsed time (what we refer to as calendar time) transpires, no failures will occur because the software is doing no processing (and hence the execution time is zero). As the software processes more and more (and hence the execution time increases), the probability increases of, first, traversing a path in the code that contains a fault and, second, just the right conditions prevailing for the fault to cause a failure. There are times when CPU time (and hence execution time) cannot be measured directly. However, under appropriate conditions, execution time can be approximated in other ways. For example, if the utilization of the CPU is relatively constant, then elapsed (or calendar) time multiplied by average utilization is a good approximation to execution time. In another instance, if the mix of types of items (e.g., transactions, jobs, commands) being processed is relatively constant, then execution time is relatively proportional to the number of items processed. The important point to make here is that the modeling and analysis of software reliability is done in the execution time domain.
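The two approximations just described can be sketched in a few lines. The following is a hypothetical illustration; the function names and all figures are invented for this sketch and do not come from the paper:

```python
def exec_time_from_utilization(elapsed_hours, avg_utilization):
    """Approximate execution time as elapsed (calendar) time times
    average CPU utilization, assuming utilization is roughly constant."""
    return elapsed_hours * avg_utilization

def exec_time_from_workload(items_processed, cpu_hours_per_item):
    """Approximate execution time as proportional to the number of items
    (transactions, jobs, commands) processed, assuming the mix of item
    types is roughly constant."""
    return items_processed * cpu_hours_per_item

# A system observed for 100 calendar hours at about 30% average CPU
# utilization has accumulated roughly 30 CPU-hours of execution time.
print(exec_time_from_utilization(100.0, 0.30))
```

Either approximation yields execution time in the same units (CPU-hours), so the resulting figures can be fed directly into the execution-time models discussed below.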
Second, the operational profile plays a large role in how frequently the software fails. An operational profile is the set of all the input states to the software together with the frequency with which they will occur in normal operation by the customer. In essence, the operational profile specifies how a customer will use the software product in his or her operating environment ([3] describes an operational profile for one software product).
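As a concrete, hypothetical illustration, an operational profile can be represented as a table of input states and their relative frequencies of occurrence; drawing test inputs from it then exercises the software the way the customer would. The command names and frequencies below are invented for illustration:

```python
import random

# Illustrative operational profile: input states (here, command types)
# paired with their relative frequency in normal customer operation.
operational_profile = {
    "query_record":    0.60,
    "update_record":   0.25,
    "generate_report": 0.10,
    "bulk_load":       0.05,
}

def sample_input_states(profile, n, seed=0):
    """Draw n input states with frequencies matching the profile,
    e.g., to drive reliability testing."""
    rng = random.Random(seed)
    states = list(profile)
    weights = list(profile.values())
    return rng.choices(states, weights=weights, k=n)

print(sample_input_states(operational_profile, 5))
```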
Software Reliability Modeling

Models play an important role in estimating and analyzing software reliability. Software reliability models generally have the following two components.

The execution time component, which relates the failure of software to execution time, properties of the software being developed, properties of the development environment, and properties of the operating environment. With regard to the execution time component, there are two general classes of models:
- Reliability growth models (so named because reliability improves with execution time). These models are particularly applicable during system test because the removal of faults reduces the rate at which failures occur.
- Constant reliability models (so named because reliability does not change with execution time). These models are suitable after the product is introduced in the field, when no fault removal is generally occurring.

The calendar time component, which relates the passage of calendar time to execution time and the number of failures that have occurred. The passage of calendar time depends on the consumption of resources, such as processing cycles on a computer system, as a function of execution time and number of failures. During system test, other resources, such as testers' time in detecting failures and developers' time in locating and fixing the underlying faults, come into play in relating calendar time to execution time and number of failures. Reference [4] provides a review of existing models and illustrates the use of one set of models in analyzing the reliability of software in a number of situations.
Software versus Hardware Reliability

There are some fundamental differences between hardware and software failures that impact the analysis of hardware versus software reliability. The primary difference is in the underlying mechanisms causing failures. With hardware, failures are generally caused by physical processes related to stresses imposed by the operating environment. Specifically, failures are due to components degrading, deteriorating, or being subjected to environmental shocks.

With software, there is nothing to "wear out." The primary mechanism for failures is the latent faults within the software. These faults may be the result of errors in coding or implementation of design or requirements specifications. However, just the presence of a fault is not enough to cause a failure. First, the software must be executing, and second, the input being processed during execution must be such that the fault will be encountered under just the right set of conditions that result in a failure.²

Just as there are differences in the mechanisms for failures, there are some major aspects in which the analysis of software reliability differs from the analysis of hardware reliability. An example would be the notion that adding redundancy of components improves reliability. Whereas duplicating hardware components will generally improve reliability, duplicating software components will not generally improve reliability. This phenomenon in software is due to the fact that encountering a fault under a given set of conditions that results in a failure of one of the software components will result in the same failure in the duplicate component. The only time this may not occur is when the software components are "independently" developed and thus are not exactly duplicated.

On the other hand, there are a number of similarities in the analysis of hardware and software reliability. These similarities allow the two reliabilities to be combined using reliability block diagram analysis to compute an overall system reliability for a product. More complicated models may be needed if there is significant hardware-software dependence. Chapter 4 of [4] illustrates how hardware and software reliability can be combined to determine system reliability.
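For the simplest case, independent hardware and software components in series (where a failure of any component fails the product), the block-diagram combination reduces to multiplying component reliabilities. A minimal sketch, with illustrative numbers:

```python
def series_reliability(component_reliabilities):
    """System reliability for independent components in series:
    the product of the component reliabilities."""
    r = 1.0
    for ri in component_reliabilities:
        r *= ri
    return r

# Hardware with reliability 0.999 and software with reliability 0.995
# over the same mission time give a system reliability of about 0.994.
print(series_reliability([0.999, 0.995]))
```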
SOFTWARE PRODUCT LIFE CYCLE ACTIVITIES

The remainder of this paper will discuss software reliability measurement activities that should be carried out during product development. These activities will be discussed in the context of the various phases of the product life cycle. This paper divides the product life cycle into four phases: product definition, product design and implementation, product validation, and product operation and maintenance. Following is a brief description of the primary purpose of each of the life cycle phases. The product definition phase focuses on defining what the customer needs and developing a requirements specification for the product that addresses those needs. In the design and implementation phases, designs are rendered using the product requirements, and these in turn are implemented into the software product. In the validation phase, the software product is certified to meet the requirements through system testing and field trials. Finally, during the operation and maintenance phase, the product is delivered to and used by the customer. Problems reported by the customer may lead to "software fixes" being developed by the maintenance staff and delivered back to the customer.
Activities During the Definition Phase

Proper product definition is essential for having a successful product on the market. The primary output of the definition phase is a requirements specification for the product.

Reliability objectives should be included as an explicit part of the requirements specification. The first step in setting reliability objectives is to define what a failure is from the customer's perspective. Next, failures should be categorized by the impact they have on customers. A key step is understanding customers' tolerance to failures of different categories and customers' willingness to pay for reduced failure rates in each failure category. Looking at customers' experiences with past and existing products will help in determining these tolerance levels. Another step is assessing the reliability capabilities of products developed by competitors. The information developed in each of these steps can then be used to develop reliability objectives for the product. After such objectives are established for the product as a whole, they must be allocated among the hardware and software components within the product. In effect, a reliability budget is being established for each of the components within the product.

Once reliability objectives are established for software components, the following two important items are needed to proceed.

An Operational Profile that Reflects How the Customer Will Use the Product:³ As indicated earlier, the reliability of the product will in general depend on the operational profile, which therefore must be included as part of the reliability specification. The operational profile in effect defines a "field of use"⁴ for the product being developed. This involves a tradeoff. Establishing a wide "field of use" for the product may require multiple operational profiles to fully cover the field of use of the product. This in turn will increase the cost of developing the product. For example, the costs for testing the product will increase, as reliability objectives must now be certified under more than one set of operating conditions.

Estimates Relating Calendar Time to Execution Time: Because the rate at which software failures occur is a function of execution time, such estimates are needed to translate calendar time to execution time. Specifically, these estimates will allow reliability objectives expressed in terms of calendar time (in a form customers can relate to) to be translated into reliability objectives expressed in terms of execution time (in a form that can be measured and analyzed).
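A hypothetical sketch of such a translation, assuming roughly constant CPU utilization (all figures are invented for illustration):

```python
HOURS_PER_MONTH = 720.0  # 30-day month, for illustration

def execution_time_objective(failures_per_month, avg_utilization):
    """Translate 'at most N failures per calendar month' into a
    failure-intensity objective in failures per CPU-hour."""
    cpu_hours_per_month = HOURS_PER_MONTH * avg_utilization
    return failures_per_month / cpu_hours_per_month

# 'At most 1 failure per month' at 25% average utilization corresponds
# to about 0.0056 failures per CPU-hour (1 failure per 180 CPU-hours).
print(execution_time_objective(1.0, 0.25))
```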
Another activity in setting reliability objectives is trading off product reliability, product cost, and product delivery time frame (see [4, ch. 7] for examples).

²Some hardware failures may also be caused by latent design faults, which would be analogous to the underlying mechanism causing software failures.
³One approach in characterizing the operational profile is to analyze the frequency with which desired features will be used by the customer. In employing the techniques of Quality Function Deployment, one should get customers to quantify how often they anticipate a desired feature will be used in their environment. Getting customers to estimate the "frequency of use" of features is not only useful in characterizing the operational profile, but it is also useful in helping customers to rank the features they desire.
⁴The term "field of use" as used here is borrowed from hardware products. It effectively indicates the operating environments in which the product was intended to be used.
Increased reliability generally means increased development costs and time. It also means increased costs and time for testing the product. An interesting note here is that the cost and time for testing products is becoming a major, if not dominant, component in the overall development cost and time. Some estimates of the development costs devoted to testing and field trials run as high as 40%. Additionally, the testability of the reliability requirements should be reviewed during product definition, as this may have a large impact on the cost of testing the product.

Increases in development costs and times equate to similar increases in the price of the product to the customer and to delays in when the customer will receive the product. The point here is that customers will be making tradeoffs among reliability, price, and when the product will be available in selecting the products they will buy. Models can help in making tradeoffs with regard to cost, reliability, and delivery time frame.

Models applied at this stage of development should have strong "descriptive validity,"⁵ as the parameters of the model must be estimated from the characteristics of the development and system test environment and the software being developed. Chapters 5 and 7 in [4] describe how certain models can be calibrated and applied prior to system test.

Two questions often asked about applying models during this phase are:
- How accurate are reliability predictions made at this point of product development?
- Is the effort to model reliability at this point worth it?

With regard to the first question, there is not enough information to give a definitive answer. However, the accuracy of predictions is most likely what can be expected for predictions of hardware reliability models and software development effort models (for example, the COCOMO model discussed in [1]) applied during this phase. For these latter models, errors on the order of 100% have been experienced. In answer to the second question, if the modeling were carried no further than just "predicting software reliability," then the effort would probably not be worth it. However, the modeling should continue into the design, implementation, validation, operations, and maintenance phases of the software product life cycle. Especially during the validation and operations phases, modeling will help in structuring a plan for collecting reliability measurement data. The collected data can be used not only to track that reliability objectives are being met, but also to refine the models for subsequent releases of software products. The real utility of modeling is to get into this cycle of model, measure, and refine. Moreover, applying models at this point of the life cycle forces projects to focus up front on issues related to reliability and to baseline assumptions about the reliability of the product before development begins.

⁵The "descriptive validity" of a model is its ability to be "explained" by the underlying process being modeled.

Activities During the Design and Implementation Phases

The primary purpose of the design and implementation phases is to turn a requirements specification for a product into a design specification, and then to implement the design into the product. One activity during the design phase is allocating and budgeting software reliability objectives among software components. An analysis should be conducted to ascertain whether the reliability budget can be attained within the proposed design.
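One simple, hypothetical way to budget: when component failures are independent and any one of them fails the product, component failure intensities add, so a product failure-intensity objective can be split among components, for example in proportion to weights reflecting expected usage or complexity. The component names, weights, and objective below are invented for illustration:

```python
def allocate_budget(product_objective, weights):
    """Split a product failure-intensity objective (failures per
    CPU-hour) among components in proportion to the given weights."""
    total = sum(weights.values())
    return {name: product_objective * w / total
            for name, w in weights.items()}

budget = allocate_budget(0.01, {"ui": 1.0, "database": 2.0, "comms": 2.0})
print(budget)  # component budgets summing to the product objective
```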
Another important activity should be to certify the reliability of "reused" software (not only application software but also system software such as operating system software and communication interface software). There is a strong push to reuse software developed for one product within another product. However, the reliability of the reused software may not be sufficient to satisfy the needs of the new product. Before reusing software in a new product, its reliability should be established through reliability testing under the operating conditions intended for the new product. It is important to use the operational profile for the new product in reliability testing. The reliability of the software using the operational profile for the original product may be significantly different from the reliability using the operational profile for the new product. Once the reliability of the reused software components is established under operating conditions intended for the new product, then software reliability objectives can be allocated among the new software components in a way that overall reliability objectives are met.

The operational profile itself provides an indirect benefit during the design phase. Developers can use information on the frequency of use of different features to weigh alternatives. In addition, because the operational profile limits the field of use for the product, developers can produce simpler designs to satisfy reliability requirements. For example, a designer might opt for a simpler manual recovery design for an operating condition that occurs very infrequently rather than a more complex automated recovery design. In general, simpler designs lead in turn to a more reliable product. In addition, the operational profile indicates "high usage" features that designers can focus their energies on to maximize customer satisfaction.

There are many verification activities that should be going on during the implementation phase. Verification activities such as inspections, unit testing, and integration testing are intended to "verify" that what was actually produced within a development stage is the same as what was intended to be produced. Verification activities, such as inspections, can be applied to such development stages as specifying requirements, design, and coding (unit implementation). The intention of verification activities is to minimize the propagation of faults introduced in one development stage to another stage. At this time, there is little that can be done to estimate software reliability using measures from such product verification activities as unit test, integration test, or inspections.
This is due in large part to the processing frequency of input states during unit and integration testing, or the pseudo-processing frequency of input states during inspections. In both cases, the processing frequencies do not match those in the operational profile.

Activities During the Validation Phase

The primary thrust of validation activities is to certify that the product is suitable for customer use. Product validation is associated with evaluating a software product at the end of development to ensure it complies with the initial requirements for the product. Product validation activities for software products generally include system test and field trial activities.

Software reliability measurements are particularly useful in this phase in conjunction with reliability testing to monitor the progress of testing and to help in making product release decisions. Reliability testing (also referred to as longevity or stability testing) is conducted to ensure that reliability objectives are met. During reliability testing, input states are generally executed with a frequency that matches what is specified in the operational profile. This type of testing provides a nice alternative way of doing regression testing, particularly when the number of regression test cases becomes large. In a way, this amounts to using statistical sampling to select regression test cases based on the frequency with which they would occur in field use.

The sequence of activities during testing typically proceeds as follows.
- Failures and the corresponding execution time (from the start of testing) are recorded during testing.
- Statistical techniques are used to estimate the parameters of software reliability execution time model components based on the recorded failure data.⁶
- Using the execution time component with the estimated parameters, such useful quantities as the present failure intensity and the remaining execution time can be estimated.
- Finally, the calendar time component of the models can then be applied to estimate the remaining calendar time needed for testing to achieve a desired failure intensity objective.
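A sketch of the last two steps under the basic execution time model of [4]. In practice, `lam0` and `nu0` would be estimated statistically (for example, by maximum likelihood) from the recorded failure data; here they are simply assumed:

```python
import math

def present_failure_intensity(lam0, nu0, failures_so_far):
    """Failure intensity after a given number of failures have been
    experienced: lambda = lam0 * (1 - mu / nu0)."""
    return lam0 * (1.0 - failures_so_far / nu0)

def remaining_execution_time(lam0, nu0, lam_present, lam_objective):
    """Additional CPU-hours of testing needed to drive the failure
    intensity from lam_present down to lam_objective."""
    return (nu0 / lam0) * math.log(lam_present / lam_objective)

lam_p = present_failure_intensity(lam0=10.0, nu0=100.0, failures_so_far=80)
print(lam_p)                                              # about 2.0 failures/CPU-hour
print(remaining_execution_time(10.0, 100.0, lam_p, 0.5))  # about 13.9 CPU-hours
```

The remaining execution time can then be fed into the calendar time component, together with resource consumption estimates, to project a release date.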
Activities During the Operations and Maintenance Phases

The primary thrust of the operations and maintenance phases is to move the product into the customers' day-to-day operations, to support the customer in the use of the product, and to repair or fix faults within the software that are impacting the customers' use of the product.

For those organizations that have operations responsibility for software products, software reliability measurements can be useful to monitor the reliability of the operating software. The results can be used to determine if there is any degradation in reliability occurring over time (for example, degradation caused by the introduction of software fixes into the operating environment). Software reliability measurements can also be used in deciding when to incorporate new software releases into the operating environment. The reliability of the currently operating software should be great enough so that introducing the new software release will not reduce the reliability below a "tolerable" level for the end user of the software product.

Software reliability measures are useful to those who have field support responsibilities for the released product. Failure data collected in the customer's operating environment can be used to compare the customer's perceived level of reliability to the measured reliability of the product. A number of items could be contributing factors to differences between customers' perceived level of reliability and the measured reliability of the product.
- The definition of what the customer perceives as failures is different from the definition used in testing the product.
- The operational profile for the customer environment is significantly different from the operational profile used in testing the product.
- The test environment does not reflect the customer environment in one or more key areas.⁷
- The original reliability objectives set for the product do not truly reflect what the customer originally desired, or the customer's desires have changed.

Determining what factors contribute to the differences and feeding that information back to the appropriate development phase is an important task.

Software reliability measurement and modeling are also useful for maintenance activities. One prime example is using the frequency and cost of failures to rank the order for repairing underlying faults. Software reliability models can also be used to size the maintenance staff for "repairing" faults reported from field sites (see [2]). However, the most important activity that can be conducted at this time is to conduct a "Root Cause Analysis" of field-reported faults.⁸ The intent of such an analysis is to determine in what stages of the development process such faults are introduced and what changes should be made to the development process to reduce the probability of similar faults being introduced in that stage or propagating to later stages.
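A hypothetical sketch of the first idea: rank faults for repair by the expected cost of the failures each one causes (observed failure rate times cost per failure). The fault records below are invented for illustration:

```python
faults = [
    {"id": "F-101", "failures_per_month": 0.2, "cost_per_failure": 5000.0},
    {"id": "F-102", "failures_per_month": 3.0, "cost_per_failure": 50.0},
    {"id": "F-103", "failures_per_month": 0.5, "cost_per_failure": 2400.0},
]

def repair_order(fault_records):
    """Order faults by expected monthly failure cost, highest first."""
    return sorted(fault_records,
                  key=lambda f: f["failures_per_month"] * f["cost_per_failure"],
                  reverse=True)

for f in repair_order(faults):
    print(f["id"], f["failures_per_month"] * f["cost_per_failure"])
```

Note that a frequent but cheap failure (F-102) can rank below a rare but expensive one (F-103), which is the point of weighting frequency by cost.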
⁶These are discussed in [4, ch. 12].
⁷For example, simulators may have been used in a test environment to replace actual hardware components that a product is to interface with. This may have been done because of the high cost of including the actual hardware components in the test environment. However, the simulators may not faithfully reflect the operation of the component that is being replaced.
⁸Strictly speaking, faults uncovered during product validation activities such as system test and field trials should also be included.
REFERENCES

[1] B. W. Boehm, Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, 1981.
[2] D. A. Christenson, "Using software reliability models to predict field failure rates in electronic switching systems," in Proc. Annu. Nat. Joint Conf. Software Quality and Productivity, Washington, DC, Mar. 1-3, 1988.
[3] W. W. Everett, "Experiences in applying software reliability to network operations systems," in Proc. 4th Annu. Nat. Joint Conf. Software Quality and Productivity, Washington, DC, Mar. 1-3, 1988.
[4] J. D. Musa, A. Iannino, and K. Okumoto, Software Reliability: Measurement, Prediction, Application. New York: McGraw-Hill, 1987.
[5] M. L. Shooman, Software Engineering: Design, Reliability, and Management. New York: McGraw-Hill, 1983.

William W. Everett (M'88) received the Engineer's degree from the Colorado School of Mines in 1965 and the Ph.D. degree in applied mathematics from the California Institute of Technology in 1970.
He is a Distinguished Member of Technical Staff at AT&T Bell Laboratories. His areas of interest include software engineering, software reliability, and computer systems performance analysis.
Dr. Everett is a member of the Society for Industrial and Applied Mathematics (SIAM).