Kim2009
Abstract: Reliable forecasting is instrumental in successful project management. In order to ensure the successful completion of a
project, the project manager constantly monitors actual performance and updates the current predictions of project duration and cost at
completion. This study introduces a new probabilistic forecasting method for schedule performance control and risk management of
on-going projects. The Bayesian betaS-curve method (BBM) is based on Bayesian inference and the beta distribution. The BBM provides
confidence bounds on predictions, which can be used to determine the range of potential outcomes and the probability of success.
Furthermore, it can be applied from the outset of a project by integrating prior performance information (i.e., the original estimate of
project duration) with observations of new actual performance. A comparative study reveals that the BBM provides, early in the project,
much more accurate forecasts than the earned value method or the earned schedule method and as accurate forecasts as the critical path
method without analyzing activity-level technical data.
DOI: 10.1061/(ASCE)0733-9364(2009)135:3(178)
CE Database subject headings: Forecasting; Scheduling; Bayesian analysis; Construction management.
regression model that fits an S-curve function to cumulative progress curves of a project and updates the parameter estimates of the S-curve using a Bayesian inference approach. Recently, two new models based on this concept have been reported (Gardoni et al. 2007; Kim and Reinschmidt 2007). In those models, actual performance records are fitted by a single or a group of rather simple S-curve functions with two parameters. As a result, their efficiency depends on the existence of S-curve functions that fit a specific progress pattern with an acceptable accuracy. The BBM in this paper exploits the potential of the combined use of Bayesian inference and S-curve functions in project performance forecasting, with emphasis on ease of implementation and robustness in quantifying various prior performance information from various sources, such as project plans, historical data, and subjective judgments.

This paper is organized as follows. The next section reviews conventional project performance forecasting methods. In the subsequent section, two component methodologies, S-curve models and Bayesian inference, are reviewed. The BBM is then formulated based on the general framework of the Bayesian adaptive forecasting method and a curve fitting technique using the beta distribution. Numerical examples using real project data are presented in order to demonstrate the predictive properties of the BBM. In addition, the life-cycle forecasting accuracy of the BBM is evaluated and compared against a state-of-the-art EVM schedule forecasting method and the CPM.

Review of Project Performance Forecasting Methods

Typical forecasting approaches to update the original estimates with actual performance data from an ongoing project can be grouped into three categories, depending on the decision maker's perception of the relationship between past and future performance (PMI 2004). Table 1 summarizes basic properties and examples of the three categories. Approaches in Categories I and II are valid only when actual performance data observed from a project are considered irrelevant to the future performance of remaining jobs. The Category I approach can be applied when the original estimates are still believed reliable. For example, the CPM updates the project duration at completion given delays in some critical path activities in the past, but typically assumes that the causes that affected past activities will not affect future ones. When the remaining jobs are considered as a new project, the Category II approach can be applied.

Category III methods address the situations in which project duration and cost at completion are updated using both the original estimate and actual performance data up to the time of forecasting. A typical case of the Category III forecasting is the earned value method. In EVM, the schedule and cost performance of a project are analyzed in terms of four performance indicators: (1) cost variance (CV) = EV − AC; (2) cost performance index (CPI) = EV/AC; (3) schedule variance (SV) = EV − PV; and (4) schedule performance index (SPI) = EV/PV. The standard EVM prediction for the project duration and cost at completion rests on the assumption that the cumulative performance indices [SPI(t) = EV(t)/PV(t) and CPI(t) = EV(t)/AC(t)] will represent the performance efficiency of the jobs in the future. The estimate at completion (EAC) at time t is then equal to the cost already spent (AC) plus the adjusted cost for the remaining work

$$\mathrm{EAC}(t) = \mathrm{AC}(t) + \frac{\mathrm{BAC} - \mathrm{EV}(t)}{\mathrm{CPI}(t)} = \frac{\mathrm{BAC}}{\mathrm{CPI}(t)} \qquad (1)$$

where BAC = budget at completion. Although the use of EVM forecasting formulas for cost performance has been supported widely, many modified versions of the standard formula have been suggested (Anbari 2003; Christensen 1993). However, those methods are typically linear extrapolations, assuming, for example, that the computed CPI, which has changed in the past (or it would always equal 1.00), will not change in the future.

On the other hand, forecasting project duration using the cumulative planned value (PV) and earned value (EV) has been criticized for systematic distortion in results (Lipke 2003; Vandevoorde and Vanhoucke 2006). To improve schedule forecasting with EVM, several modified forecasting formulas have been suggested (Anbari 2003; Lipke 2003). Recently, Vanhoucke and Vandevoorde (2007) conducted a comprehensive comparative study against three EVM-based schedule forecasting methods in the literature and reported that the earned schedule method (Lipke 2003) outperforms, on the average, other methods. In the earned schedule method (ESM), the earned schedule at time t, ES(t), is defined as the planned time to achieve the current earned value EV(t). The estimated duration at completion at time t, EDAC(t), is then calculated as

$$\mathrm{EDAC}(t) = t + \frac{\mathrm{PD} - \mathrm{ES}(t)}{\mathrm{ES}(t)/t} \qquad (2)$$

where PD = planned project duration.

However, application of these present methods to schedule forecasting of ongoing projects has some limitations. First, all these methods are deterministic and fail to provide confidence bounds on predictions. The level of uncertainty in forecasts may influence decisions about planning and controlling projects. In addition, most methods forecast the final outcome of a project based on the assumption that the current status measurements are accurate and without any errors, which is unrealistic because there are inherent errors in both measuring time to report and measuring performance at the reporting times. Furthermore, EVM forecasting formulas are not recommended early in a project because of large prediction errors due to few reporting intervals to
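
The deterministic forecasts in Eqs. (1) and (2) can be illustrated with a minimal Python sketch. The function names (`eac`, `earned_schedule`, `edac`), the per-period list representation of the cumulative PV curve, and the linear interpolation used to locate ES(t) within a reporting period are assumptions made here for illustration; they are not prescribed by the paper.

```python
from bisect import bisect_right


def eac(ac, ev, bac):
    """Estimate at completion, Eq. (1): AC + (BAC - EV) / CPI, with CPI = EV / AC."""
    cpi = ev / ac
    return ac + (bac - ev) / cpi


def earned_schedule(ev, pv_curve):
    """ES(t): planned time at which the cumulative PV equals the current EV.

    pv_curve[k] is the cumulative planned value at the end of period k
    (pv_curve[0] == 0, nondecreasing). Linear interpolation inside a
    period is an assumption of this sketch.
    """
    k = bisect_right(pv_curve, ev) - 1  # last period whose PV does not exceed EV
    if k >= len(pv_curve) - 1:
        return float(len(pv_curve) - 1)  # all planned value has been earned
    return k + (ev - pv_curve[k]) / (pv_curve[k + 1] - pv_curve[k])


def edac(t, ev, pv_curve):
    """Estimated duration at completion, Eq. (2): t + (PD - ES) / (ES / t).

    Requires ev > 0 so that ES(t) > 0.
    """
    pd = len(pv_curve) - 1  # planned duration, in reporting periods
    es = earned_schedule(ev, pv_curve)
    return t + (pd - es) / (es / t)


# Hypothetical data: a four-period plan, reviewed at the end of period 2.
pv = [0.0, 10.0, 30.0, 60.0, 100.0]
print(eac(ac=120.0, ev=100.0, bac=1000.0))
print(earned_schedule(20.0, pv))
print(edac(2, 20.0, pv))
```

Both formulas return a single number with no measure of uncertainty, which is the first limitation noted above.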
$$f(x;\alpha,\beta) = \frac{(x-A)^{\alpha-1}(B-x)^{\beta-1}}{B(\alpha,\beta)\,(B-A)^{\alpha+\beta-1}} \qquad (4)$$

where B(α, β) = beta function

$$B(\alpha,\beta) = \int_0^1 t^{\alpha-1}(1-t)^{\beta-1}\,dt \qquad (5)$$

The cumulative distribution function of the beta distribution is

$$F(x;\alpha,\beta) = \frac{B\big((x-A)/(B-A);\alpha,\beta\big)}{B(\alpha,\beta)} \qquad (6)$$

where B((x − A)/(B − A); α, β) = incomplete beta function, which is defined as

$$B(s;\alpha,\beta) = \int_0^s t^{\alpha-1}(1-t)^{\beta-1}\,dt \qquad (7)$$

The betaS-curve function is defined over a range [0, T], where T represents the project duration. The ranges of the two shape parameters are restricted to α ≥ 1 and β ≥ 1 in order to confine the plausible solution space to S-curves with unimodal PDF. The unimodal shape resembles the typical resource level distribution of projects during the execution period. It should be noted that a uniform PDF based on α = 1 and β = 1 is also included in the solution space. In addition, a new parameter m is introduced to represent the location of the mode. The betaS-curve function is then defined as

$$\text{BetaS-curve}(x;\alpha,m,T) = \frac{B(x/T;\alpha,\beta)}{B(\alpha,\beta)} \qquad (8)$$

where α ≥ 1, 0 < m < 1, T > 0, and β = (α − 1)/m − (α − 2).

Input Elements

The primary information that should be relied on for project performance forecasting is the past performance data observed in the project itself. Early in a project, however, project managers may suffer from a lack of sufficient actual performance data to make reliable forecasts, resulting in deferring any judgment about performance control at the risk of missing the opportune time to take appropriate corrective actions. Therefore, for a method based on actual performance data to be also applicable during the early phase of a project, the method should have an adaptive nature that updates an original estimate developed during the planning phase in the light of new performance data reported periodically during the execution phase. The Bayesian betaS-curve method makes use of all relevant performance information available from standard project management practices and theories. Information used in the method can be grouped into two categories: the prior performance information and the actual performance data. The prior performance information is defined as all relevant performance information other than actual performance data, which is available even before the inception of a project.

In the BBM, the prior performance information consists of two elements: the prior probability distribution of project duration and the progress curve template. Fig. 1 shows these elements in a graphical way. First, the prior distribution of project duration represents the best probabilistic estimate of the project duration, which is made before observing actual performance data. Common probabilistic schedule planning methods such as PERT (Malcolm et al. 1959) and network-based simulation (Lee 2005) can be used to generate the distribution of project duration. In the absence of detailed project plans, a subjective probability estimate can be made using subjective estimating methods such as three-point estimates and range estimating techniques. The second element of prior performance information, the progress curve template, represents the prior knowledge of the project manager and project engineers about the plausible progress pattern of the actual performance. The progress curve template of a project represents the characteristics of the project in terms of the cumulative progress over time.

Fig. 1. Two elements of the prior performance information and the actual performance data

Step 1: Generation of Prior Distributions of BetaS-Curve Parameters

In the Bayesian approach, the three parameters (α, m, and T) of the betaS-curve function are not single valued and deterministic, but rather are random variables themselves, with their own probability distributions. Depending on the types of prior information available and the level of confidence a decision maker puts on that information, different types of priors can be used for the betaS-curve parameters. Table 2 summarizes the types of prior distributions recommended in the Bayesian betaS-curve method.

Because the duration parameter (T) explicitly represents the project duration, probabilistic, not single-point, priors should be used in order to get probabilistic predictions of the completion date. When reliable probabilistic estimates of project duration are available prior to the start of a project, an informative prior distribution of T can be used. Otherwise, a noninformative prior should be used for parameter T. In this case, predictions are made based only on actual performance reports from the project.

For the shape parameters, α and m, both single-point estimates and probabilistic estimates can be used, depending on the types of information used in forecasting. For example, when a single
planned progress curve is used, single-point estimates of α and m can be obtained by a common curve fitting technique. This approach is fairly simple and applicable to most projects that are planned and monitored with cumulative progress curves. When there is a network schedule and the project manager can extract probabilistic estimates of activity durations, a more refined approach can be used to build probability distributions of α and m (Kim 2007). The method proceeds through three steps: (1) given a network schedule and probabilistic estimates of activity durations, generate a large sample of potential progress curves, which are collectively called stochastic S-curves of the project (Barraza et al. 2000, 2004), using a simulation approach; (2) for each of the stochastic S-curves, calculate the best-fit parameters (α, m, T) using a betaS-curve function; (3) repeat this fitting process for all the random S-curves generated and obtain a set of marginal probability distributions of the shape parameters, p(α), p(m), and p(T), along with the correlation coefficients between them. With this method, the stochastic nature of the progress curves of a project can be quantified in a systematic way and represented as a set of prior probability distributions of the betaS-curve parameters. It should be noted that information used for prior distribution generation is not limited to documented plans for a specific project. Historical data from similar projects contain valuable information about possible outcomes of a future project and, adjusted with appropriate professional judgments, a project manager may be able to develop useful prior distributions.

Step 2: Parameter Updating with Bayesian Inference

Because the shape parameters are random variables themselves, estimates of their numerical values can and should be revised whenever any new information becomes available. With the betaS-curve function, the Bayesian updating process in Eq. (3) is formulated as follows. Let Θ denote the set of parameters {α, m, T}. The parameters are chosen independently so that the prior probability distribution of the parameter set is represented as

$$p(\Theta) = p(\alpha)\,p(m)\,p(T) \qquad (9)$$

Once a project gets started, actual progress is reported periodically and the data can be represented as a series of discrete values D

$$D: (w_i, t_i), \quad i = 1, \ldots, N \qquad (10)$$

where w_i represents the cumulative progress reported at time t_i; and N = number of records up to the time of forecasting.

The likelihood of the data conditional on the parameters chosen is measured based on the deviations between the actual times of performance reporting and the planned times determined by the betaS-curve parameters T(w_i | Θ). The goal is to seek a set of parameters that make the deviations normally distributed with zero mean and standard deviation σ. It is assumed that the random errors corresponding to different observations are uncorrelated. The likelihood of the data conditional on the parameters can then be calculated as the product of the likelihood of each observation

$$p(D\mid\Theta) = \prod_{i=1}^{N} p(t_i, w_i\mid\Theta) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{1}{2}\left(\frac{t_i - T(w_i\mid\Theta)}{\sigma}\right)^{2}\right] \qquad (11)$$

The value of σ is determined by the forecaster to adjust the sensitivity of predictions to the actual data reported.

The marginal distribution of the observed actual progress D is determined from

$$p(D) = \int p(D,\Theta)\,d\Theta \qquad (12)$$

where the joint probability distribution of data and parameters is constructed from Eqs. (9) and (11) as p(D, Θ) = p(D | Θ)p(Θ).

The goal of the Bayesian updating is to obtain a revised or posterior marginal distribution of each model parameter conditional on the observed data. Using fundamental properties of conditional distributions, the posterior marginal distribution of a parameter, for example α, can be derived by integrating the joint parameter distribution conditional on the observed data, which is determined from Eqs. (9)-(12), with respect to the remaining parameters Θ_{−α} = {m, T}

$$p(\alpha\mid D) = \int p(\Theta\mid D)\,d\Theta_{-\alpha} \qquad (13)$$

It should be noted that computing the posterior distributions derived above requires multifold integration over the parameters used in the analysis. In this work, a Monte Carlo integration technique has been successfully applied without resorting to more sophisticated methods such as importance sampling (Gardoni et al. 2007) and Markov chain Monte Carlo method (Ghosh et al. 2007) that consume large amounts of computer time.

Numerical Examples

The Bayesian betaS-curve method has been implemented for the convenience of the spreadsheet user as an add-in for Microsoft Excel and applied here to two examples of ongoing projects being monitored with monthly progress reports. Fig. 2 shows the
planned progress (planned value or budgeted cost of work scheduled) and actual progress (earned value or budgeted cost of work performed) curves from the monthly progress reports of the example projects. Project A is an engineering project with a budget over 25 million US dollars. The planned duration of Project A is 24 months and it is slightly ahead of schedule with 58% completion as of the eleventh month. Project A represents a case of on-schedule projects. In contrast, Project B represents a typical case in which a project suffers perpetual schedule delay. Project B is a plant project with a budget over 11 million US dollars and its network schedule consists of over 1,000 activities. Originally scheduled to finish within 25 months, Project B is 68% complete as of the eighteenth month.

Fig. 2. Planned and actual progress curves of Project A (planned duration 24 months) and Project B (planned duration 25 months); axes: month versus % complete; series: planned, actual, and planned fit

The primary objective of these examples is to demonstrate advanced properties of BBM, such as the range of possible completion dates and potential advantages from using appropriate prior performance information in combination with actual performance data. In order to do that, two different prior distributions are used for the project duration parameter T and the estimated duration at completion (EDAC) is updated every month with a new monthly progress report. The prior distributions used for the betaS-curve parameters are summarized in Table 3. The informative prior for project duration is determined from subjective judgment of possible outcomes of project duration using the PERT range estimates. In this paper, planned project duration is used as the most likely (ML) estimate of the project duration, while the optimistic estimate (O) and pessimistic estimate (P) of the project duration are assumed as 80% and 140%, respectively, of the planned duration. For example, the PERT estimates of Project B are O = 20 months, ML = 25 months, and P = 35 months. On the other hand, a uniform distribution spanning from zero to three times the planned project duration is used to represent the noninformative prior distribution for project duration. For the shape parameters, α and m, least-square estimates of the best-fit betaS-curve function to the planned progress curves are used.

The BBM is repeatedly applied to the two projects using the two prior cases and the time histories of the EDAC are shown in Figs. 3 and 4. In the graphs, the thick solid line represents the mean of the posterior distribution of the EDAC over the forecasting time. The upper and lower bounds are determined at the 10% confidence level on each side. Therefore, the confidence bounds have 80% probability of including the actual project duration. That is, the lower bound indicates a completion date that the project will exceed with 90% probability.

Forecasts made for Project A are shown in Fig. 3. The results reveal that the use of an informative prior for the duration parameter [Fig. 3(b)] provides a more stable EDAC profile than the noninformative prior. On the other hand, when the noninformative prior is used [Fig. 3(a)], the mean of the EDAC responds quickly to early project reports. The results indicate that the same actual data have a stronger influence on the revision of EDAC when a noninformative prior is used. However, influence from prior information diminishes as more data accrue from the project. The diminishing impact of prior information is also found in the profiles of the prediction intervals. Results in Fig. 3 show that the width of the prediction interval narrows, as it should, as more data are observed. However, the rate of narrowing is greater with the noninformative prior than with the PERT estimates of T.

Fig. 3. EDAC forecasts for Project A (planned duration 24 months); axes: month versus EDAC; panels: (a) noninformative prior; (b) informative prior

The results for Project B (Fig. 4) show patterns similar to those discussed for Project A and, more importantly, demonstrate the benefits of using BBM when a project is behind schedule. Obviously, this is the case when project managers start to keep a close eye on their projects. In such cases, the prediction bounds on the EDAC provide an objective, quantitative indicator of the significance of the gap between the planned progress and the actual. For example, the lower bounds in Figs. 4(a and b) indicate that, 10 months after the project start, the probability of completing the project within its original duration (25 months) is very low (below 10%). Of course, one can adjust the confidence level according to his or her accepted level of risk, for example, to 95% or to 99%. The point here is that the BBM provides project managers with additional information about schedule risk of their projects, which deterministic methods cannot convey.

It should be noted that a forecast can only be as accurate as the information used. Whether one should use an informative prior or
a noninformative prior depends on whether prior knowledge and beliefs available are considered important and useful in predicting the performance of the current project. The greater the emphasis placed on prior knowledge, the less the emphasis on current data when making project predictions. In practice, informative priors for the three parameters of the betaS-curve model can be generated with a variety of information sources such as similar projects in the past, in-house project database, specific project plans, or subjective judgment based on personal experience. It is a project manager's call to select the most appropriate information source or to combine all information available using his or her own judgment. Although the writers relied on rough PERT estimates for the informative prior distribution of the duration parameter, project managers in real projects may be able to extract valuable information about the plausible distribution of all the three betaS-curve parameters using other approaches discussed in previous sections. When reliable information is not available, a noninformative prior should be used so that predictions are based only on actual performance data from the current project.

Fig. 4. EDAC forecasts for Project B (planned duration 25 months); axes: month versus EDAC(t); panels (a) and (b), each marking the overrun warning point

Life-Cycle Accuracy Profile

In this study, the BBM is compared with the earned schedule method [Eq. (2)] and CPM in terms of forecasting accuracy at different stages of a project. Such a life-cycle accuracy profile provides the project manager with useful information about how reliable a prediction is at a specific completion point. In order to get statistically meaningful results, each forecasting method should be independently applied to a large set of sample projects that represent diverse progress patterns in real projects (Kim 2007; Vanhoucke and Vandevoorde 2007). Comparing different forecasting methods based on a limited number of projects does not prove much because with a small sample size it would almost always be possible to find some other contradicting cases. This would be like trying to determine the probability distribution of a general class of coins by flipping only a couple of coins.

In this study, a large set of artificial projects is used to overcome limited and often incomplete real project data. More specifically, 30 projects with 200 activities are generated with a random network generation technique combined with a network-based schedule simulation (Barraza et al. 2000; Kim 2007). Individual projects have different network structures and different levels of schedule complexity. For each project, 100 "actual" progress curves are simulated. Being randomly generated, these 3,000 sets of actual project data are unbiased and independent of each other, representing the range of project outcomes that might be realized in networks of 200 activities. In this example, a network-based simulation technique is used to generate informative prior distributions for the duration parameter T.

In this paper, the life-cycle accuracy profiles of BBM, ESM, and CPM are measured with the mean absolute percentage error (MAPE), which is defined as a function of time t
$$\mathrm{MAPE}(t) = \frac{100}{N}\sum_{i=1}^{N}\frac{\left|\mathrm{APDU}_i - \mathrm{EDAC}_i(t)\right|}{\mathrm{APDU}_i} \qquad (14)$$

where APDU_i = actual project duration of random execution i, which is generated by a network-based schedule simulation approach.

Fig. 5 shows the MAPE obtained from the 3,000 project observations at six evaluation points. Among the six evaluation points, the first five (5%PD, 10%PD, 20%PD, 30%PD, and 40%PD) correspond to the time points at 5, 10, 20, 30, and 40% of the planned project duration. However, the last evaluation point (90%C) is determined at the time point when actual project progress reaches the 90% complete point of the project.

The results in Fig. 5 can be summarized as follows:
• Early in the project, the ESM has significantly larger errors than the CPM and the BBM. The forecasting accuracy of the ESM improves over time. However, it still takes about 30% of the planned project duration to get comparable MAPE to the other methods.
• Although the ESM gives predictions of project duration that are not accurate early in the project, both the CPM and the BBM have MAPE less than 5% even very early in the project (even at 5% of the planned duration, for example), when reliable forecasts are most valuable. Therefore, both these methods should give, on the average, forecasts of project duration that are accurate enough for use by project managers throughout the project.
• The BBM outperforms, on the average, the CPM at the first two evaluation points. However, the MAPE of the BBM converges to that of the ESM, resulting in slightly larger mean errors than the CPM during the following evaluation points. Statistical tests revealed that all the MAPEs in Fig. 5 were statistically different from each other at every time period at the α = 0.05 level, except for the BBM and the ESM at 30%PD and 40%PD, which were not significantly different.

The example here does not demonstrate that the proposed method produces more accurate forecasts than the CPM network method. It does demonstrate that the proposed method generates forecasts that are much more accurate than the ESM early in the project and just as accurate as CPM forecasting, while being much easier to use than CPM. It should be noted that the BBM does not require detailed information about the status of individual activities as the CPM does or revised durations of future activities. Therefore, it can be concluded that, for the given project, both the CPM and the BBM provide forecasts sufficiently accurate to be usable, and the BBM provides quick, and still,

these actions, the project gets back on its track at 18 months. Does this prove that the forecast at month 10 is manifestly wrong? Or was it? It was this forecast that inspired the project manager to take action, and without this action the project would have been late. The only way in which the accuracy of predictive methods can be obtained would be if project managers took their hands off the wheel and put the project on autopilot, and this is not going to happen. The only claim for the predictive method proposed in this paper is that it squeezes more information out of a set of project data than other methods can, and the value of this is to provide better analyses to the project manager with which to make, hopefully, better decisions.

Conclusions

A new probabilistic schedule forecasting method has been developed. The Bayesian betaS-curve method is a probabilistic method that provides confidence bounds on predictions. It is also an adaptive method that starts with the original estimate of project duration and adjusts the influence of prior performance information on prediction as actual performance data accrue. Furthermore, the BBM relies on project-level performance data, which allows the method to be seamlessly integrated with common project performance metrics, such as the earned value and percent complete in various units.

Combined with a curve fitting technique, the BBM is extremely robust in quantifying all relevant performance information from various sources. The flexibility of the betaS-curve function enables the BBM to possess a greater capability of representing extreme cases and variability in project progress patterns compared to common two-parameter S-curve functions in the literature (Kim and Reinschmidt 2007). Using two real projects, it has been demonstrated that the BBM extracts more information from common monthly progress reports and provides more informative predictions than the conventional methods, which can be used as an objective, quantitative indicator of schedule risk of ongoing projects.

This paper also provides a comparative study about the life-cycle accuracy of three forecasting methods: the BBM, the ESM, and the CPM. In order to draw statistically meaningful conclusions, a large test set of artificial project data is generated independently from these methods and used to obtain the mean absolute percentage error (MAPE) at different stages of a project life cycle. The results reveal that, for a project group scheduled with 200 activities, the ESM, even though it has been asserted to be the best EVM schedule forecasting method (Vanhoucke and