The design of efficient storage hierarchies generally involves the repeated running of "typical" program address traces through a simulated storage system while various hierarchy design parameters are adjusted.

This paper describes a new and efficient method of determining, in one pass of an address trace, performance measures for a large class of demand-paged, multilevel storage systems utilizing a variety of mapping schemes and replacement algorithms.

The technique depends on an algorithm classification, called "stack algorithms," examples of which are "least frequently used," "least recently used," "optimal," and "random replacement" algorithms. The techniques yield the exact access frequency to each storage device, which can be used to estimate the overall performance of actual storage hierarchies.
Evaluation techniques for storage hierarchies
R. L. Mattson, J. Gecsei, D. R. Slutz, and I. L. Traiger
Increasing speed and size demands on computer systems have resulted in corresponding demands on storage systems. Since it
has been generally recognized that the speed and capacity require-
ments of storage systems cannot be fulfilled at an acceptable cost-
performance level within any single technology, storage hierarchies
that use a variety of technologies have been investigated.
Several previous papers describe the general concepts of hierarchy design and evaluation, whereas others deal with specific hierarchy systems, such as the core-drum combination on the ICT Atlas computer and the cache-core combination on the IBM System/360 Model 85 (References 10 and 11).
This paper introduces an efficient technique called "stack processing" that can be used in the cost-performance evaluation of a large class of storage hierarchies. The technique depends on a classification of page replacement algorithms as "stack algorithms" for which various properties are derived. These properties may be of use in the general areas of program modeling and system analysis, as well as in the evaluation of storage hierarchies. For a better understanding of storage hierarchies, we briefly review some basic concepts of their design.
The purpose of a storage system is to hold information and to associate the information with a logical address space known to the remainder of the computer system. For example, the Central
Processing Unit (CPU) may present a logical address to the storage
system with instructions to either retrieve or modify the informa-
tion associated with that address. If the storage system consists of
a single device, then the logical address space corresponds directly
to the physical address space of the device. Alternatively, a storage
system with the same address space can be realized by a hierarchy
of storage devices ranging from fast but expensive to slower but
relatively inexpensive devices. In such storage hierarchies, the
logical address space is often partitioned into equal-size pages
(or unequal-size segments) that represent the blocks of information
being moved between devices in the hierarchy.
A hierarchy management facility is included to control the move-
ment of pages and to effect the (generally dynamic) association
between the logical address space and the physical address space of the hierarchy. When the CPU references a logical address, the hierarchy management facility first determines the physical location of the corresponding logical page and may then move the
page to a fast storage device where the reference is effected. Since
these actions are “transparent” to the remainder of the computer
system (except for timing), the logical operation of the hierarchy
is indistinguishable from that of a single-device system.
The goal of the hierarchy management facility is to maximize the
number of times logical information is in the faster devices when
being referenced. As this goal is approached, most references are
directed to the fast, small stores whereas most of the logical address
space is distributed over the slower, large stores. The storage system then acquires the approximate speed of the fast stores while maintaining the approximate cost-per-bit of the slower and less expensive stores. This increase in cost-performance is the
primary justification for storage hierarchies.
Clearly, many factors can affect the cost-performance of a storage
hierarchy. On the performance side, one must consider the capacity and characteristics of each storage device, the physical structure
of the hierarchy, the way in which information is moved by the
hierarchy management facility, and the expected pattern of storage
references. On the cost side, the hardware and/or software required
to find and move logical information must be considered, as well
as the cost-per-bit and capacity of each device. Because of these
factors, it is quite difficult to design an “optimal” hierarchy.
The typical approach to hierarchy evaluation employed by computer designers has been to simulate as many hierarchy systems as possible, at various levels of detail. During the first stages of design, a large number of relatively simple simulations may be run with fixed, standard address traces. These traces are assumed to be "typical" sequences of storage references obtained from existing computer systems, and they are used to approximate the reference behavior of future systems. The purpose of these simulations is to measure such statistics as data flow and frequency of access to each device in order to estimate the overall performance of an actual system. The resulting performance estimates can then be used to narrow the field of possible designs, which then receive more detailed examination.

Figure 1 Linear storage hierarchy
Alternatively, one may try to develop analytical techniques that avoid point-by-point simulation but still yield accurate statistics for data flow and access frequencies. Several papers deal with such techniques for hierarchy evaluation. In general, the approach here is to run a relatively small number of simulations and extrapolate the measured statistics to a larger class of hierarchies. The difficulty with this approach is the need for various assumptions about the statistical properties of address traces and data flows required to formulate the analytical equations. Moreover, it is difficult to include a quantitative dependence on such factors as data path structure, page replacement algorithm, and address mapping scheme, so that many simulations may still be necessary.
objectives of the paper

This paper presents a technique that can be used to circumvent much of the simulation effort required in hierarchy evaluation. Specifically, we present an efficient procedure that determines, for a given address trace, the exact frequency of access to each level of a hierarchy as a function of page size, replacement algorithm, number of levels, and capacity at each level. In the following, we consider a class of multilevel, demand-paging hierarchies (Reference 14) with the same replacement algorithm at every level. The procedures developed here are applicable to a large class of well-known replacement algorithms having certain inclusion properties defined later. These algorithms, which we call stack algorithms, include "least frequently used," "least recently used," "optimal," and a "random" replacement algorithm.
The system model
basic model concepts

An H-level paged storage hierarchy consists of a collection of storage devices M_1, M_2, ..., M_H, a network of data paths connecting the devices, and a hierarchy management facility. Each device is partitioned into physical blocks called page frames. For convenience, the highest-level store M_1 is called the local store and the lowest-level store M_H is the backing store, as shown in Figure 1. The hierarchy management facility controls page movement between the devices and associates each logical page with a physical page frame. Special storage and processing hardware may be required, but they are not included in our model.
References to the storage hierarchy are presented by a single device called the generator, and they are serviced sequentially in the order in which they are presented. References from the generator may represent the requests of several devices, such as the CPU and the channel, in an actual system. The time sequence of logical-address references X = x_1, x_2, ..., x_L is called an address trace, where each address consists of n bits as shown in Figure 2. The set of 2^n possible addresses is partitioned into 2^k pages of 2^(n-k) logical addresses each. The high-order k bits of each address represent the number of the page containing the address, and the low-order n - k bits represent the location or displacement of the address within the page. Since information movement in the hierarchy is accomplished by transferring pages between levels, we can analyze space allocation and data movement for a trace X by considering a corresponding page trace X^k = x_1^k, x_2^k, ..., x_L^k, where each x_t^k is the number of the page containing address x_t. When we consider a given fixed page size, we omit the superscript k and denote pages by x_t.
A reference from the generator can be serviced only from the local store M_1. Thus if the desired page resides in a lower-level device M_i, i.e., where i > 1, the hierarchy management facility must bring that page up to M_1 for servicing. The hierarchy provides a path for bringing pages up to M_1, which may or may not require staging through intermediate levels. Any temporary storage required for bringing a page up to M_1 is included in the hierarchy management hardware, and is therefore not represented in our model.

In this paper we restrict our attention to linear storage hierarchies in which the only paths for moving pages down the hierarchy are direct ones from each level M_i to level M_{i+1}, where i = 1, 2, ..., H - 1. The reasons for this restriction are discussed later in this paper. Note that the four-level hierarchy in Figure 1 is a linear hierarchy.
The capacity of the backing store is assumed to be at least 2^k page frames, and all logical pages initially reside in the backing store. At any time, each logical page resides in exactly one page frame of the hierarchy. A mapping function is associated with each hierarchical level, and specifies for each logical page the page frames it may occupy in that level. The mapping function is further defined as:

Unconstrained if any page can occupy any page frame of the storage device.

Fully constrained if each page can occupy only a single page frame.

Partially constrained in all other cases.
In a later section, we define a technique called "congruence mapping" that generates a whole spectrum of mapping functions.
Figure 3 Two-level hierarchy (generator, buffer store, backing store)

For simplicity in developing techniques for analyzing storage hierarchies, we first consider a two-level, demand-paged hierarchy with unconstrained mapping. Later, our results are extended to certain classes of multilevel linear hierarchies employing the three types of mapping functions. The local store or buffer has a capacity of C pages, and is directly connected to the backing store as shown in Figure 3. At time t, the generator presents a request for page x_t to the hierarchy. Under demand paging, if x_t is in the buffer, the reference proceeds and no page movement occurs. Otherwise, x_t is brought to the buffer from the backing store. If the buffer is already full, x_t replaces some page y_t in the buffer. The selection of the particular page y_t is performed by the buffer replacement algorithm. This operation is a key element of storage management.
In the two-level hierarchy shown in Figure 3, a reference to a page residing either at level M_1 or at M_2 is called an access to that level. For a given hierarchy and page trace, we define the access frequencies F_1 and F_2, where F_i is the relative number of accesses to level M_i during the processing of the trace. Thus, if N_1 accesses are made to level M_1, and N_2 = L - N_1 accesses are made to level M_2, we obtain F_1 = N_1/L and F_2 = N_2/L.
Some important measures of storage hierarchy performance can be obtained from these access frequencies. For example, one can combine access frequencies with a set of effective access times {T_i} to obtain an effective (or average) hierarchy access time

T = F_1 T_1 + F_2 T_2

In general, access times depend on the access paths, device access times, and characteristics of the hierarchy management facility. The access frequencies depend only on the page trace, capacity of the buffer, and replacement algorithm.
For a two-level hierarchy, accesses to the buffer are called successes; the relative frequency of successes as a function of capacity is given by the success function F(C). For a given capacity C, page trace X = x_1, x_2, ..., x_L, replacement algorithm, and arbitrary time t (where 1 ≤ t ≤ L), the set of pages in the buffer just after the completed reference to x_t is denoted by B_t(C). The initial buffer contents is represented by B_0(C). By convention,

B_0(C) = φ

for all C, where φ is the empty set. The set of distinct pages referenced in x_1, x_2, ..., x_t is denoted by Γ_t, and the number of pages in Γ_t is denoted by

γ_t = |Γ_t|
Figure 4 Determining success function by buffer simulation (page trace a b b c b a d c a a; simulations for C = 2 and C = 3 give F(2) = 0.30 and F(3) = 0.50)
Thus at time t, the buffer still contains the C most recently referenced pages. It is easy to see that under LRU the buffer contains the C most recently referenced pages for all subsequent times, and that this property holds for all page traces and buffer capacities. One can generate the buffer contents B_t(C) for any time t on a trace and any capacity by scanning backward from point t and collecting the first C distinct pages encountered.
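The backward-scan characterization can be stated compactly in code. The following is a minimal Python sketch, not part of the original paper; the function name and the use of a character string as a page trace are illustrative assumptions.

```python
def lru_buffer_contents(trace, t, C):
    """B_t(C) under LRU: scan backward from time t (1-based) and collect
    the first C distinct pages encountered."""
    pages = []
    for x in reversed(trace[:t]):
        if x not in pages:
            pages.append(x)
        if len(pages) == C:
            break
    return set(pages)

# With the page trace of Figure 4, B_5(2) = {b, c}:
print(lru_buffer_contents("abbcbadcaa", 5, 2))
```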
Since the set of C most recently referenced pages is always contained in the set of C + 1 most recently referenced pages, the buffer contents B_t(C) at any time must be a subset of B_t(C + 1). In fact, B_t(C) is a proper subset of B_t(C + 1) if at least C + 1 distinct pages have been referenced through time t. More formally, under LRU replacement, the buffer contents for any page trace X = x_1, x_2, ..., x_L and any time t (where 1 ≤ t ≤ L) satisfy the following inclusion property:

B_t(C) ⊆ B_t(C + 1)    for C = 1, 2, 3, ...    (1)

where

|B_t(C)| = C    for C < γ_t

and

|B_t(C)| = γ_t    for C ≥ γ_t
The inclusion property can be observed in Figure 4 where, at time t = 5, for example,

B_5(1) = {b}
B_5(2) = {b, c}
B_5(3) = {a, b, c}

and

B_5(4) = {a, b, c}
Because of the inclusion property, the buffer contents at any time and for all capacities can be represented in the following compact and useful way. We order the set of pages Γ_t into a list S_t = s_t(1), s_t(2), ..., s_t(γ_t), where

s_t(i) = B_t(i) − B_t(i − 1)    for i = 1, 2, ..., γ_t    (2)

Hence

B_t(C) = {s_t(1), s_t(2), ..., s_t(C)}      for C ≤ γ_t
B_t(C) = {s_t(1), s_t(2), ..., s_t(γ_t)}    for C ≥ γ_t    (3)

The list S_t is referred to as the LRU stack, with s_t(1) as the top entry and s_t(γ_t) as the bottom entry. As an example, the LRU stack for t = 5 in Figure 4 is

S_5 = [b, c, a]
The stack S_0 at time t = 0 has no entries and is therefore called a null stack, that is, one with no entries. The entire sequence of LRU stacks corresponding to Figure 4 is included in Figure 5.

Besides representing the buffer contents for all capacities, the LRU stack can be used to efficiently determine the success function F(C). Let us suppose that at time t, page x_t has been previously referenced and thus is a member of at least one set B_{t−1}(C) for some C ≥ 1. Let C_t denote the least buffer capacity such that

x_t ∈ B_{t−1}(C_t)

We call C_t the critical capacity since, from the inclusion property given in Equation 1, x_t ∈ B_{t−1}(C) if and only if C ≥ C_t. If x_t has not been previously referenced, we set C_t = ∞ because x_t is not contained in a buffer of any finite capacity.
From the definition of LRU stacks in Equation 2, it may be seen that C_t is simply the position of page x_t in the stack S_{t−1}, so that

x_t = s_{t−1}(C_t)
Figure 5 Sequence of LRU stacks (page trace a b b c b a d c a a; stack distances ∞ ∞ 1 ∞ 2 3 ∞ 4 3 1; distance counters n(Δ))
We call this page position the stack distance Δ_t, since Δ_t is essentially the "distance" from the top of the stack to

x_t = s_{t−1}(Δ_t)

(Note that here Δ_t = C_t. When constrained mapping functions are considered, the stack distance may not always equal the critical capacity.) If x_t has not been previously referenced, then Δ_t is set to infinity. The sequence of stack distances for our example is included in Figure 5.

Figure 6 Obtaining success function from distance frequencies (A: distance frequencies; B: success function, with F_1 = F(3) = 0.50 marked)

The significance of stack distances is that they lead directly to the success function. To see this, let n(Δ) be the number of times the stack distance Δ is observed in processing a trace. Since the stack distance equals the critical capacity, the number of times that the referenced page is found in a buffer of capacity C is

N(C) = Σ_{Δ=1}^{C} n(Δ)    (4)

and the success function is given by the expression

F(C) = N(C)/L    (5)
In practice, the set {n(Δ)} can be determined from a set of distance counters, as shown in Figure 5. All counters are set initially to zero, and the counter for each distance Δ is incremented whenever that distance occurs. For k-bit page numbers, we need at most 2^k + 1 counters, corresponding to 1 ≤ Δ ≤ 2^k and Δ = ∞. At the conclusion of a page trace, the final values of the distance counters are the values {n(Δ)}, and F(C) is obtained from Equations 4 and 5.
We now calculate the value of the success function in a numerical example. For Δ's of 1, 2, 3, 4, and ∞, the corresponding final counter values in Figure 5 are 2, 1, 2, 1, and 4. This distribution is shown in Figure 6A. Dividing by L = 10 and summing cumulatively, we obtain the success function shown in Figure 6B. One can verify that the F(C) values for the curve in Figure 6B agree with those obtained in the simulations of Figure 4.
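As an aside, the whole one-pass procedure for LRU is short enough to state as code. The following Python sketch is ours rather than the paper's; the function name and the use of a string as a page trace are illustrative assumptions.

```python
from collections import defaultdict

def lru_success_function(trace):
    """One pass over the trace: maintain the LRU stack, count stack distances
    n(Δ), and form F(C) by cumulative summation (Equations 4 and 5)."""
    stack, counts = [], defaultdict(int)       # stack[0] is the top entry s_t(1)
    for page in trace:
        if page in stack:
            dist = stack.index(page) + 1       # stack distance = 1-based position
            stack.remove(page)
        else:
            dist = float('inf')                # first reference: infinite distance
        counts[dist] += 1
        stack.insert(0, page)                  # referenced page moves to the top
    L, running, F = len(trace), 0, {}
    for C in range(1, len(stack) + 1):         # len(stack) = number of distinct pages
        running += counts.get(C, 0)
        F[C] = running / L
    return counts, F

counts, F = lru_success_function("abbcbadcaa")
print(F[2], F[3])   # 0.3 and 0.5, matching Figures 4 and 6
```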
To find the access frequencies F_1 and F_2 for a given buffer capacity C, we take F_1 = F(C) and F_2 = 1 − F_1. As an example, for C = 3 pages, F_1 = F(3) = 0.50 as indicated in Figure 6B, F_2 = 1 − 0.50 = 0.50, and the average access time T of the hierarchy is 0.50 T_1 + 0.50 T_2.

Note that F(C) is always a monotonic, non-decreasing function of C for LRU replacement, since F(C) is obtained by cumulative summation as shown in Equation 4. Also, F(C) never exceeds (L − γ_L)/L for any capacity, because all pages initially reside in the backing store.
To avoid constructing each LRU stack separately, we now give an iterative construction of S_t from S_{t−1} and x_t. Observe that at every time t, the stack S_t is simply the list of pages in Γ_t, ordered according to their most recent reference. The most recently referenced page is s_t(1), since s_t(1) = x_t. The second most recently referenced page is s_t(2), and s_t(γ_t) is the least recently referenced page in Γ_t.

Let us suppose that page x_t has been previously referenced and appears at position Δ on stack S_{t−1}. For time t, we know that x_t must be the top entry in S_t, because it is the most recently referenced page. Consider now a page b at some position j on S_{t−1}, where 1 ≤ j < Δ. At time t − 1, page b is the jth most recently referenced page, and the intervening pages do not include x_t. At time t, page x_t is added to this set, so that page b must now be at position j + 1 on stack S_t. If j is greater than Δ, page b must remain at position j at time t, since the set of more recently referenced pages is unchanged from time t − 1.

The net effect of this page motion is shown in Figure 7A. Page x_t is moved to the top of the stack, pages previously above x_t are down-shifted one position, and all other pages retain the same position. If x_t were not previously referenced, x_t would be placed on the top and all other pages would be down-shifted one position, as shown in Figure 7B.
B_{t−1}(C) ⊂ B_{t−1}(C + 1)
|B_{t−1}(C)| = C
|B_{t−1}(C + 1)| = C + 1

and

x_t ∉ B_{t−1}(C + 1)

Note that from Equation 2, page s_{t−1}(C + 1) is contained in B_{t−1}(C + 1) but not in B_{t−1}(C). If page y_t(C + 1) is neither s_{t−1}(C + 1) nor y_t(C), then y_t(C + 1) is some other page z ∈ B_{t−1}(C). However, page z is then included in B_t(C) but not in B_t(C + 1), which would violate the inclusion property.
We have given a necessary condition for stack algorithms. The same condition is also sufficient, because if y_t(C + 1) is either y_t(C) or s_{t−1}(C + 1), then B_t(C) is a subset of B_t(C + 1). Therefore, we conclude that a replacement algorithm is a stack algorithm if and only if, for every time t,

y_t(C + 1) = s_{t−1}(C + 1)   or   y_t(C + 1) = y_t(C)    (6)

for

1 ≤ C < γ_{t−1}   and   C + 1 < Δ_t
stack algorithm identification

Important replacement algorithms that satisfy Equation 6 are those that induce a total ordering on all previously referenced pages and use this ordering to make replacement decisions. The ordering can be represented in the form of a priority list

P_t = p_t(1), p_t(2), ..., p_t(γ_{t−1})

where p_t(i) has a higher priority than p_t(i + 1) for 1 ≤ i < γ_{t−1}. The algorithm then selects for replacement the page in B_{t−1}(C) that has the lowest priority.
A convenient notation for working with priorities is min(A), where A is an arbitrary set of pages in Γ_{t−1}, and min(A) is the unique page in A having lowest priority on the list P_t. If B_{t−1}(C) ⊂ B_{t−1}(C + 1) and x_t ∉ B_{t−1}(C + 1), we can express the replaced pages y_t(C) and y_t(C + 1) as follows:

y_t(C) = min [B_{t−1}(C)]    (7)

and

y_t(C + 1) = min [B_{t−1}(C + 1)]    (8)
           = min [B_{t−1}(C), s_{t−1}(C + 1)]    (9)
           = min {min [B_{t−1}(C)], s_{t−1}(C + 1)}    (10)
           = min [y_t(C), s_{t−1}(C + 1)]    (11)
Equations 7 through 9 are based on the definition of the replacement algorithm, whereas Equation 10 is based on the properties of minimization.

We conclude from Equation 11 that any replacement algorithm that induces a priority list P_t for every time t satisfies Equation 6 and is therefore a stack algorithm. For example, the priority list for LRU is just the ordering of pages in Γ_{t−1} by most recent reference. The priority list for "least frequently used" (LFU) replacement is the ordering of referenced pages by most frequent reference, together with a scheme to break ties.
stack updating

Before describing other examples of stack algorithms, let us develop a stack updating procedure for algorithms inducing a priority list. For any page trace X = x_1, x_2, ..., x_L and any time t, where 1 ≤ t ≤ L, suppose that stack S_{t−1} is available. Also, for any two pages a, b ∈ Γ_{t−1}, let max(a, b) denote the page having higher priority. If x_t has been previously referenced and appears at position Δ_t on stack S_{t−1}, the stack at time t is given by

s_t(1) = x_t    (12)
s_t(i) = max [y_t(i − 1), s_{t−1}(i)]    for 2 ≤ i < Δ_t    (13)
s_t(Δ_t) = y_t(Δ_t − 1)    (14)
s_t(i) = s_{t−1}(i)    for Δ_t < i ≤ γ_{t−1}    (15)

Equations 12, 14, and 15 are based on the constraints of demand paging, whereas Equation 13 is derived from Equation 11.

If x_t has not been previously referenced, the defining equations for stack S_t are the following:

s_t(1) = x_t    (16)
s_t(i) = max [y_t(i − 1), s_{t−1}(i)]    for 2 ≤ i ≤ γ_{t−1}    (17)
s_t(γ_t) = y_t(γ_{t−1})    (18)

In this case, Equations 16 and 17 express the fact that replacements are required for all buffer capacities in the range 1 ≤ C ≤ γ_{t−1}. Equation 18 corresponds to the new page x_t being added to the stack, with the result that a buffer of capacity

γ_t = γ_{t−1} + 1

is now full.
Figure 8 illustrates the stack updating procedure as given in Equations 12 through 18. The top entry s_t(1) is always x_t, and the first page replaced is

y_t(1) = s_{t−1}(1)    for Δ_t > 1
Figure 8 Stack updating (A: page x_t in stack S_{t−1}; B: page x_t not in stack S_{t−1})

Each subsequent entry s_t(i) is then determined iteratively from s_{t−1}(i) and y_t(i − 1), according to Equation 13 or 17. If x_t is found on stack S_{t−1}, as shown in Figure 8A, we use Equation 14 to determine s_t(Δ_t). All lower entries are unchanged from time t − 1. If x_t is not found on stack S_{t−1}, as shown in Figure 8B, then Δ_t = ∞, and we use Equation 18. In either case, the replacement algorithm does not have to be applied to all the pages for stack updating. Only a sequence of pairwise decisions between pages s_{t−1}(i) and y_t(i − 1) is required.
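Read as code, Equations 12 through 18 become the short routine below. This is a minimal Python sketch rather than the authors' implementation; pages are assumed to be hashable values, and higher_priority(a, b) is an assumed callback that returns whichever of a and b stands higher on the priority list P_t (it plays the role of max[·,·] in Equations 13 and 17).

```python
def update_stack(stack, page, higher_priority):
    """Update S_{t-1} (a Python list, index 0 = top entry) to S_t for the
    reference to `page`, and return (S_t, stack distance or None if infinite)."""
    try:
        delta = stack.index(page) + 1          # Δ_t: 1-based position of x_t on S_{t-1}
    except ValueError:
        delta = None                           # x_t not previously referenced
    if delta == 1:
        return list(stack), delta              # x_t already on top: stack unchanged
    new_stack = [page]                         # Eq. 12 / 16: s_t(1) = x_t
    y = stack[0] if stack else None            # y_t(1) = s_{t-1}(1)
    limit = (delta - 1) if delta else len(stack)
    for i in range(1, limit):                  # Eq. 13 / 17: pairwise decisions only
        s_prev = stack[i]                      # s_{t-1}(i+1)
        winner = higher_priority(y, s_prev)
        new_stack.append(winner)               # s_t(i+1) = max[y_t(i), s_{t-1}(i+1)]
        y = s_prev if winner == y else y       # the loser becomes y_t(i+1)
    if stack:
        new_stack.append(y)                    # Eq. 14 / 18: last replaced page fills the slot
    if delta:
        new_stack.extend(stack[delta:])        # Eq. 15: entries below Δ_t unchanged
    return new_stack, delta
```

For LRU, higher_priority simply favors the page nearer the top of S_{t−1}, and the routine reduces to the move-to-front update of Figure 7.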
Comparing our stack updating procedure with the one for LRU shown in Figure 7, we see that page y_t(C) under LRU is always s_{t−1}(C). In fact, the priority list P_t is exactly equal to stack S_{t−1}, since both lists give the order of pages in Γ_{t−1} by most recent reference. Thus

y_t(C) = s_{t−1}(C)

and Equations 13 and 17 then reduce to

s_t(i) = max [s_{t−1}(i − 1), s_{t−1}(i)] = s_{t−1}(i − 1)

For an arbitrary stack algorithm, the stack updating is more complex than for LRU, and the order of stack elements at time t − 1 may be very different from that at time t.
Let us now examine several examples of stack algorithms. In general, any replacement algorithm that bases its decisions on some page usage quantity, whether measured or predicted, naturally induces a priority list and is, therefore, a stack algorithm. One example, of course, is LRU, and another example previously mentioned is least frequently used (LFU) replacement. Under LFU, the page selected for replacement is the one that has been referenced the fewest number of times over the interval 1 ≤ τ ≤ t, or perhaps over some "backward window" interval t − h ≤ τ ≤ t, where 0 < h ≤ t. If two or more pages are tied for least frequency of use, then some arbitrary rule is used to break the tie. As long as the rule is consistent for all pages and all capacities (e.g., if the tied pages are numerically ordered), a priority list P_t is induced, and LFU is a stack algorithm.
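As an illustration (ours, not the paper's), LFU fits directly into the update_stack sketch above: reference counts plus a fixed tie-breaking rule supply the comparator that stands in for the priority list P_t.

```python
from collections import Counter

def lfu_stack_distances(trace):
    """Stack distances under LFU, using reference counts as priorities and
    the page name as the consistent tie-breaking rule (assumes update_stack above).
    None stands for an infinite distance."""
    counts, stack, dists = Counter(), [], []
    for page in trace:
        counts[page] += 1                       # usage statistics through time t
        def higher_priority(a, b):              # more frequently used page wins;
            return a if (counts[a], a) > (counts[b], b) else b   # ties broken by name
        stack, delta = update_stack(stack, page, higher_priority)
        dists.append(delta)
    return dists
```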
Other examples of stack algorithms may arise in analytical studies of program behavior. If an address trace is generated from some random process, it may be desirable to study the behavior of replacement algorithms that base their decisions on the parameters of the random process. One such process is a time-invariant, first-order Markov chain (References 15 and 16), where any page c is referenced immediately after page b with a fixed transition probability Π_bc (where b and c range over all referenced pages). Such a process is characterized by the matrix Π of transition probabilities and by the page referenced at time t = 1.

One replacement algorithm for this process might be called "least transition probability" (LTP) since, when page b is referenced at time t, the page chosen for removal is the page c that minimizes Π_bc over those pages in the buffer. Supplying an appropriate rule for breaking ties, we see that LTP induces a priority list and is a stack algorithm.
Another replacement algorithm is to remove the page with the largest expected time until next reference. We call this strategy LNR for "longest next reference." The expected times until next reference can be obtained from the Π-matrix by standard techniques. As with LTP, LNR induces a priority list if we supply an appropriate tie-breaking rule.
In an actual system (or when testing a Markov model of the program), page reference statistics may be used to estimate the matrix Π. For example, the observed transition frequencies over some interval t − h to t can be used to estimate the transition probabilities. A priority list can then be constructed for each time t, according to the probabilities, and the resulting algorithm remains a stack algorithm.
Other stack algorithms may base their decisions on information from the programmer or compiler, or on properties of the computer system. For example, the programmer or compiler may supply to the hierarchy management facility an indication of which pages should be given high priorities in the immediate future. Another case is where the operating system assigns priorities to program pages in a multiprogrammed system, based perhaps on the position of the program in a task queue. If all the pages in the address space can be ordered in a priority list P_t for each time t, the resulting replacement algorithm is a stack algorithm.
first-in/first-out

In the examples given, we see that priority lists can arise in a variety of ways. We now consider a replacement algorithm called "first-in/first-out" (FIFO) that is not a stack algorithm. Under FIFO, the page that has remained in the buffer for the longest (continuous) time up to time t is removed.
A peculiarity of FIFO is illustrated by the following page trace

X = a b c d a b e a b c d e

As shown in Reference 18, the success function for this trace is not monotonic, and takes the form shown in Figure 9. Since stack algorithms have monotonic success functions, we conclude that FIFO is not a stack algorithm and does not induce a priority list P_t at every time t. In amplifying this conclusion, we note that the relative priorities between pages in Γ_{t−1} may depend on the buffer capacity C. Thus in the example, one can verify that page d has lowest priority of all pages in B_6(3) in the sense that d has been in the buffer longest. However, page d has highest priority in B_6(4), since it was brought into the buffer latest.

Figure 9 Success function for FIFO replacement
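The non-monotonic behavior is easy to reproduce. The following Python sketch (ours, not from Reference 18) simulates FIFO directly for each capacity on the trace above.

```python
from collections import deque

def fifo_success(trace, C):
    """Fraction of references found in a FIFO-managed buffer of capacity C."""
    buffer, queue, hits = set(), deque(), 0
    for page in trace:
        if page in buffer:
            hits += 1                        # success: no page movement under FIFO
            continue
        if len(buffer) == C:
            buffer.discard(queue.popleft())  # evict the page resident longest
        buffer.add(page)
        queue.append(page)
    return hits / len(trace)

trace = "abcdabeabcde"
print(fifo_success(trace, 3), fifo_success(trace, 4))   # 0.25 versus about 0.167
```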
Whenever the priorities among pages depend on the capacity of the buffer, we cannot define a single priority list that applies to every capacity. One instance of this is when priorities depend on the frequency of reference to pages after their entering the buffer. Another case is when priorities depend on the total time spent in the buffer.

As long as priorities are independent of capacity, and as long as one can order the referenced pages to reflect these priorities, then stack-processing techniques can be used to find the success function.
An optimum replacement algorithm
We now discuss a replacement algorithm that yields the maximum value for the success frequency, over the space of all replacement algorithms, for every page trace and every buffer capacity. Such an algorithm is said to be an optimum replacement algorithm. Belady (Reference 13) describes an optimum replacement algorithm called MIN, and shows how to evaluate the success frequency for a given page trace and a given buffer capacity. In the following discussion, we describe a stack algorithm called OPT and prove that it is also an optimum replacement algorithm. Using certain properties of LRU and OPT, the entire success function for OPT can be determined in two passes of a page trace.
OPT

The replacement algorithm OPT has the following characteristics. Whenever a page must be pushed from the buffer, the chosen page is the one whose next reference is farthest in the future. If a tie results because two or more buffer pages are never referenced again, the tie is broken by an arbitrary rule Ω that pushes the page with the latest alphabetical or numerical order. An example of OPT replacement is shown in Figure 10, for the buffer capacity C = 3. As an illustration, notice that at time t = 5 page c is pushed from the buffer, since the other buffer pages a and b are referenced sooner. At time t = 9, page b is pushed from the buffer, because page d is referenced again (at time t = 10), and page a has priority over page b by our rule Ω.

Figure 10 Example of OPT replacement (page trace a b c a d b a d c d; buffer contents for C = 3)
A formal proof that OPT is an optimal replacement algorithm is given in the Appendix. We note here that OPT is not realizable in an actual computer system because it requires knowledge of future page references. However, OPT does serve as a useful benchmark for any replacement algorithm, including stack-type algorithms. To show that OPT is a stack algorithm, observe that a priority list P_t can be constructed for OPT at each time t. Specifically, P_t is the list of the pages referenced again, ordered by their time of next reference, followed by the list of the pages not referenced again, as ordered by the tie-breaking rule Ω.
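Because OPT's priority list is just the next-reference order, it also fits the update_stack sketch given earlier. The Python sketch below is ours, not the paper's two-pass procedure: it builds each priority list by scanning forward from the current reference, which is exactly the costly forward scanning discussed below, and it breaks ties among pages never referenced again by page name, standing in for the rule Ω.

```python
def opt_stack_distances(trace):
    """OPT stack distances for a trace, using the time of next reference as the
    priority (assumes the update_stack routine defined above)."""
    L, stack, dists = len(trace), [], []
    for t, page in enumerate(trace):
        nxt = {}                                   # time of next reference after t
        for p in set(stack):
            later = [u for u in range(t + 1, L) if trace[u] == p]
            nxt[p] = later[0] if later else float('inf')
        def higher_priority(a, b):                 # sooner next reference wins;
            return a if (nxt[a], a) < (nxt[b], b) else b   # never-again ties by name
        stack, delta = update_stack(stack, page, higher_priority)
        dists.append(delta if delta is not None else float('inf'))
    return dists

print(opt_stack_distances("abcadbadcd"))
# for the trace of Figure 10; compare the distances shown in Figure 11
```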
stack processing example

Figure 11 Stack processing and success function for OPT replacement (page trace a b c a d b a d c d; OPT stack distances ∞ ∞ ∞ 2 ∞ 3 2 3 4 2)

The stack processing technique for OPT is illustrated in Figure 11. Priority lists are ordered as described above, and curly brackets denote the pages ordered under the rule Ω. For example, at time t = 8 the priority list is P_8 = c, d, a, b, because c is the next page
referenced (at t = 9) and d is the second page referenced (at t = 10). Pages a and b are not referenced again, and thus are ordered according to rule Ω. The sequence of OPT stacks is constructed using the priority lists, and the success function is obtained from the stack distance frequencies. A major difficulty with the technique is the amount of forward scanning required to construct the priority lists.

Fortunately, a more efficient procedure exists for obtaining the priority lists. For a given page trace X, we define the forward distance w_t(a) to a page a at time t as the number of distinct pages referenced in x_{t+1}, ..., x_{t'} (where x_{t'} is the first reference to page a after time t). If page a is not referenced again, the forward distance is defined as infinity. Note that the priority list under OPT is a listing of the pages in Γ_{t−1} according to their increasing forward distances. An illustrative example of forward distance determination is given in Figure 12.
If the forward distances to all pages in Γ_{t−1} are known at time t − 1, the new forward distances at time t can be determined iteratively from the single forward distance w_t ≡ w_t(x_t). Specifically, for any page a ≠ x_t, we have

w_t(a) = w_{t−1}(a) − 1    for w_{t−1}(a) ≤ w_t and w_{t−1}(a) ≠ ∞
w_t(a) = w_{t−1}(a)        for w_{t−1}(a) > w_t or w_{t−1}(a) = ∞    (19)

To determine the sequence of forward distances {w_t} for a page trace X, consider the reverse trace X^R = x_L, x_{L−1}, ..., x_2, x_1. Suppose that X^R is analyzed according to LRU replacement, and that x_j and x_i denote two successive references to page a in the reverse trace. Thus X^R = x_L, ..., x_j = a, ..., x_i = a, ..., x_1. When x_i is processed, the stack distance Δ_i is the number of distinct pages referenced in x_j, ..., x_{i+1}. (Note that x_{i+1} precedes x_i in X^R.) However, this number of distinct pages is precisely the forward distance w_i for page trace X. Thus the sequence of LRU stack distances for trace X^R, namely, Δ_L, Δ_{L−1}, ..., Δ_2, Δ_1, is the reverse of the sequence of forward distances w_1, w_2, ..., w_{L−1}, w_L for trace X.
These results form the basis of a two-pass stack processing technique for determining the success function for OPT replacement. The technique is illustrated by Figure 13. The first pass is a backward scan of the page trace X using LRU replacement, denoted by the left-pointing arrow. The LRU stack distances are stored, in reverse order, on a "distance tape." The second pass is a forward scan using OPT replacement, as shown by the right-pointing arrow. Forward distances read from the distance tape are used to maintain the OPT priority lists according to Equation 19.

The LRU stack distances gathered from the reverse page trace yield important information about the forward page trace. Specifically, because the set of distinct pages between successive references to a page is the same whether the trace is scanned forward or backward, and because an infinite distance occurs exactly once for each distinct page in the trace, the distance frequencies for X and X^R are identical, so that the success functions F_LRU(C, X^R) and F_LRU(C, X) are equal.
Another result, which is proved in the Appendix, is that F_OPT(C, X) is equal to F_OPT(C, X^R), where F_OPT(C, X) is the OPT success function for trace X. Thus, our two-pass technique can be implemented with forward-backward scans as well as with backward-forward scans. During the first scan, the success function for LRU is obtained and the distance tape generated. During the second scan, the success function for OPT is obtained.
Random replacement
In the stack algorithms considered thus far, a unique success function is associated with each trace. We now extend stack-processing techniques to cover a "random replacement" algorithm (RAND) that does not always yield a unique success function. With RAND, if the buffer has a capacity of C, any given page is chosen for replacement with a probability of 1/C. In analyzing RAND, one might perform a Monte Carlo simulation for each buffer capacity to obtain a RAND success function. Repeating these simulations would yield a set of sample success functions to characterize RAND. The sample success functions could then be used to estimate an "average" success function.

A question that arises is whether stack processing can be used to generate a sample success function for RAND or any other algorithm that bases a replacement choice on the value of some random variable. We observe that RAND is not a stack algorithm, because there certainly exists a trace and a time t for which the inclusion property fails to hold with a nonzero probability.
Our approach is to define a replacement algorithm RR, which is a stack algorithm having the same statistical properties as RAND for each capacity C. The algorithm RR is defined as follows: at each time t, the priority list P_t is obtained by randomly ordering the set of pages in Γ_{t−1} (each of the γ_{t−1}! possible orderings is equally likely to be chosen). Observe that RR is a stack algorithm, since it induces a priority list.

To establish that RR is statistically equivalent to RAND, assume that a replacement is necessary in a buffer of capacity C at time t. Since y_t(C) = min [B_{t−1}(C)], and P_t is randomly chosen, the probability that any given page is y_t(C) is 1/C, the same as for RAND.

One difficulty in implementing RR is the generation of the random priority list P_t. Fortunately, it is possible to update the stack without actually constructing the entire priority list. Assuming that Δ_t > j,
let q_j(t) denote the probability that page s_{t−1}(j) has priority over page y_t(j − 1) at time t. If s_{t−1}(j) does not have priority over y_t(j − 1), we know that s_{t−1}(j) = min [B_{t−1}(j)]. Since this occurs with probability 1/j, we obtain

1 − q_j(t) = 1/j

or

q_j(t) = (j − 1)/j    (20)

Using Equation 20, the stack can be updated at time t for RR replacement by choosing page s_t(j) = s_{t−1}(j) with probability (j − 1)/j, for 2 ≤ j < Δ_t and j ≤ γ_{t−1}. As a check, let us compute the probability Q that an arbitrary page b is pushed from a buffer of capacity C at time t. Assuming that page b occurs at some position k on stack S_{t−1}, where 1 ≤ k ≤ C, then Q is given by the following expression:

Q = Pr {y_t(C) = b}
  = Pr {s_t(k) = y_t(k − 1), s_t(k + 1) = s_{t−1}(k + 1), s_t(k + 2) = s_{t−1}(k + 2), ..., s_t(C) = s_{t−1}(C)}    (21)

The events in the joint probability in Equation 21 are independent, so that we obtain

Q = Pr {s_t(k) = y_t(k − 1)} · Pr {s_t(k + 1) = s_{t−1}(k + 1)} · Pr {s_t(k + 2) = s_{t−1}(k + 2)} · ... · Pr {s_t(C) = s_{t−1}(C)}
  = (1/k) (k/(k + 1)) ((k + 1)/(k + 2)) ... ((C − 1)/C)
  = 1/C
Since Q = 1/C holds for any page b and capacity C, we have verified that the stack updating for RR can be accomplished using Equation 20, and that RR has the same statistical properties as RAND for each buffer capacity. Note that although a particular value of a point on the success function, for example F(4) = 0.3, is equally likely to occur under both RAND and RR, the occurrence of a particular success function is not equally likely.
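For concreteness, here is a minimal Python sketch (ours) of the RR stack update based on Equation 20: entry s_{t−1}(j) keeps its position with probability (j − 1)/j, and otherwise the page being pushed down takes its place.

```python
import random

def rr_update_stack(stack, page):
    """RR stack update (Eqs. 12-18 with the pairwise choice of Eq. 20).
    Returns the new stack and the stack distance (None if infinite)."""
    try:
        delta = stack.index(page) + 1
    except ValueError:
        delta = None
    if delta == 1:
        return list(stack), delta
    new_stack = [page]                          # s_t(1) = x_t
    y = stack[0] if stack else None             # y_t(1) = s_{t-1}(1)
    limit = (delta - 1) if delta else len(stack)
    for j in range(2, limit + 1):               # 1-based positions 2 .. Δ_t-1 (or γ_{t-1})
        s_prev = stack[j - 1]                   # s_{t-1}(j)
        if random.random() < (j - 1) / j:       # s_{t-1}(j) has priority: it stays put
            new_stack.append(s_prev)
        else:                                   # otherwise y_t(j-1) moves in,
            new_stack.append(y)                 # and s_{t-1}(j) becomes y_t(j)
            y = s_prev
    if stack:
        new_stack.append(y)                     # last replaced page fills position Δ_t (or γ_t)
    if delta:
        new_stack.extend(stack[delta:])         # entries below Δ_t are unchanged
    return new_stack, delta
```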
As the example with RR illustrates, stack processing techniques can be extended to cover probabilistic replacement algorithms. In fact, a replacement algorithm can have a mixture of probabilistic and nonprobabilistic aspects. For instance, the arbitrary rule used to break ties in LFU and other algorithms may choose a page at random. Another possibility is for a replacement algorithm to favor some pages probabilistically in the construction of the priority list, thereby realizing a so-called "biased replacement" algorithm (Reference 12). In any case, the only requirement is that the priority list be constructed to reflect the probabilistic properties of the desired replacement algorithm for every capacity C.
Congruence mapping
Up to now, we have restricted our attention to two-level storage hierarchies with unconstrained mapping at the first level. Under this type of mapping, any page in the buffer may be replaced by the referenced page. The advantages of unconstrained mapping are that all available page frames in the buffer can be used, and also that seldom-used pages cannot become "locked" into the buffer by mapping constraints. A disadvantage with unconstrained mapping is that extensive associative searches may be necessary to locate pages in the buffer. Moreover, the implementation overhead of the replacement algorithm may be excessive, since relative priority information must be maintained for all pages in the buffer. To offset these disadvantages, a constrained mapping scheme can be employed whereby each page is restricted to occupy a member of only a subset of the buffer page frames.
One such mapping technique is called congruence mapping, by which the 2^k distinct pages in the address space are partitioned into 2^α disjoint congruence classes, where 0 ≤ α ≤ k, and each class contains 2^(k−α) pages. The classes are numbered consecutively from 0 to 2^α − 1, and class membership is determined from the α low-order bits of the page number. In this case, the α low-order bits constitute the class number [x] of a page, and the remaining k − α bits are called the page prefix, as shown in Figure 15. The quantity α is called the class length. For a class length equal to zero, we set [x] = 0 for all pages.
In a two-level hierarchy with congruence mapping, every congruence class is assigned an equal number of page frames in the buffer, to be used exclusively by members of that class. This number is called the class capacity and is denoted by D. (The total capacity of the buffer in pages is thus C = 2^α · D.) When a page x is referenced, it may appear in any of the D page frames reserved for class [x]. If the referenced page is not in the buffer, and if the D page frames are all occupied by other members of class [x], a replacement algorithm selects one of these pages for removal. We assume that the same replacement algorithm is used separately for each of the classes.

Note that when the class length α is zero, all pages are in the same class, and the mapping is unconstrained. When the buffer capacity C is a power of 2, and when C = 2^α, only one page frame is allocated to each class, and the mapping function is fully constrained. Thus for a fixed buffer capacity C = 2^h, where 0 ≤ h ≤ k, we can vary the mapping function from unconstrained to partially and fully constrained simply by varying the value of α from 0 to h.
Figure 16 Two-level hierarchy with congruence mapping (2^α classes; 2^(k−α) pages per class in the backing store)
Since the congruence classes are disjoint, and since the same number of buffer page frames are allocated to each class, it is possible to treat the buffer as a collection of 2^α distinct buffers, one for each class [x]. If we also view the backing store as 2^α individual backing stores, as shown in Figure 16, the two-level hierarchy partitions into a collection of 2^α distinct subhierarchies, each with a buffer capacity of D page frames. When the replacement algorithm is a stack algorithm, these subhierarchies can be evaluated separately using stack processing techniques. In practice, 2^α stacks (one for each subhierarchy) can be maintained as the trace is processed. Each page reference x causes only the stack for class [x] to be updated, and a stack distance Δ to be determined from that stack.
In congruence mapping, to calculate the success function for a given trace and given class length α, the stack distances must be carefully interpreted. Whenever a stack distance Δ is measured, the corresponding critical capacity of the entire buffer is 2^α · Δ, since this is the minimum buffer capacity necessary to contain the referenced page. Therefore, the success function F^α(C) for the set of capacities C = 2^α · D, where D = 1, 2, ..., is given by

F^α(C) = F^α(2^α · D) = Σ_{Δ=1}^{D} n(Δ)/L    (22)

where n(Δ) is the total number of times the distance Δ occurs for any of the stacks.
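A per-class evaluation is a small variation on the earlier LRU sketch. The following Python sketch (ours) keeps one LRU stack per congruence class and evaluates Equation 22; page numbers are assumed to be integers of at most k bits.

```python
from collections import defaultdict

def congruence_lru_success(trace, alpha):
    """Success function under LRU with congruence mapping of class length alpha.
    Returns F(D) for buffer capacities C = 2**alpha * D (Equation 22)."""
    stacks = defaultdict(list)                 # one LRU stack per class [x]
    counts, L = defaultdict(int), len(trace)
    for page in trace:
        cls = page & ((1 << alpha) - 1)        # class number = alpha low-order bits
        stack = stacks[cls]
        if page in stack:
            d = stack.index(page) + 1
            stack.remove(page)
        else:
            d = float('inf')
        counts[d] += 1
        stack.insert(0, page)
    def F(D):
        return sum(n for dist, n in counts.items()
                   if dist != float('inf') and dist <= D) / L
    return F
```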
Generally, stack processing techniques must be used separately for each value of the class length α. However, for LRU replacement, only a single stack need be maintained in order to determine the success functions for all values of α in the interval 0 ≤ α ≤ k. Recall that under LRU, the stack is the list of all the pages in Γ_{t−1} ordered according to most recent reference. To form the stack S_{t−1}(i, α) corresponding to congruence class i and class length α, one would list the pages in class i according to their most recent reference. However, this ordering is preserved in the stack S_{t−1} for any i and any α. Therefore, S_{t−1}(i, α) can be determined by listing in order all the stack entries of S_{t−1} belonging to class i. In practice, it is not necessary to actually construct each stack S_{t−1}([x_t], α) in order to find the distance Δ_t^α. One can determine all the stack distances {Δ_t^α} in one scan of the LRU stack S_{t−1}. To do this, we first define the right match function RM(x, y) for two page numbers x and y as the number of consecutive low-order bits that match. For example, RM(x, y) = 3 when x and y agree in their three low-order bits but differ in the fourth, and RM(0000, 0001) = 0. Note that the class numbers of two pages are equal ([x] = [y]) if and only if the class length satisfies the inequality α ≤ RM(x, y). Now suppose that the current reference is to page x, and consider the jth entry on stack S_{t−1}, which is y = s_{t−1}(j). The occurrence of page y on the stack will contribute to the distance Δ_t^α if and only if RM(x, y) ≥ α. Therefore, Δ_t^α can be determined by counting the number of stack entries y above (and including) page x that satisfy RM(x, y) ≥ α.

A simple procedure for determining Δ_t^α for all α is to scan down the stack and maintain a set of right match frequency counters {p(r)} for 0 ≤ r ≤ k. Counter p(r) is incremented whenever RM(x, y) is equal to r. If page x has been previously referenced, we eventually find RM(x, y) = k (corresponding to x = y), and each distance Δ_t^α is given by

Δ_t^α = Σ_{r=α}^{k} p(r)    where 0 ≤ α ≤ k    (23)

However, if page x has not been previously referenced, the bottom of stack S_{t−1} is reached and Δ_t^α is set equal to infinity for all class lengths α. In either case, each distance Δ_t^α is used to increment the appropriate distance counter for class length α.
An example of this procedure is indicated in Figure 17. In Figure 17A, the right match functions are found by scanning down the stack. In Figure 17B, the right match frequencies {p(r)} are plotted in reverse order as a function of r. Cumulative summation, according to Equation 23, then yields the desired LRU stack distances {Δ_t^α}. Note that the stack distance for class length zero is the same stack distance Δ as obtained for LRU replacement with unconstrained mapping.
Figure 17 Right match function for LRU replacement (A: determination of right match functions; B: determination of stack distances)
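The single-scan procedure can be sketched in a few lines of Python (ours, not the paper's implementation). Page numbers are assumed to be k-bit integers, and the function returns the distances Δ_t^α for α = 0, ..., k via the cumulative summation of Equation 23.

```python
def congruence_stack_distances(lru_stack, page, k):
    """Distances Δ_t^α for all class lengths 0 <= α <= k, from one scan of the
    unconstrained LRU stack S_{t-1} (top entry first)."""
    def right_match(x, y):                 # number of matching consecutive low-order bits
        r = 0
        while r < k and ((x ^ y) >> r) & 1 == 0:
            r += 1
        return r
    p = [0] * (k + 1)                      # right match frequency counters p(r)
    found = False
    for y in lru_stack:                    # scan down from the top of the stack
        r = k if y == page else right_match(page, y)
        p[r] += 1
        if y == page:                      # RM = k corresponds to x = y: stop the scan
            found = True
            break
    if not found:
        return [float('inf')] * (k + 1)    # page not previously referenced
    # Δ_t^α = Σ_{r=α}^{k} p(r)  (Equation 23)
    return [sum(p[r] for r in range(alpha, k + 1)) for alpha in range(k + 1)]
```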
Multilevel hierarchies
In previous sections of this paper, stack processing techniques are
developed to obtain the success function for a two-level hierarchy.
For each buffer capacity, this success function represents the relative
number of accesses to the buffer for a given page trace.
We now show that the same success function can be used to find the access frequencies for all levels of a multilevel, linear hierarchy, for any number of levels and any capacity at each level. Recall that in a linear hierarchy, the only downward data path from each level M_i is to the next level M_{i+1}, for 1 ≤ i < H. Also, a path or sequence of paths is available from each level M_i, for 1 < i ≤ H, to the local store. Furthermore, no replacement decisions are required when a page moves upward through intermediate levels. We now assume that the same replacement algorithm is used at all levels, and that the mapping function is unconstrained at every level. (Hierarchies with constrained mapping functions are considered later in this paper.) At time t = 0, the backing store contains all pages, and these pages are moved to the local store M_1 on demand. When M_1 is full, pages replaced in M_1 are pushed down to the next lower level in the hierarchy, M_2. As each successively lower level M_i fills, the pages replaced in M_i are pushed to the next level M_{i+1}. At level M_1, the replacement algorithm is applied to the
set of pages already present, thereby making room for the currently referenced page x_t. At the intermediate levels M_i, for 2 ≤ i < H, the replacement algorithm is applied to the set of pages in M_i and to the page pushed from level M_{i−1}.

When page x_t is accessed from some level M_i (for 2 ≤ i ≤ H − 1), a page is replaced from each of the levels M_1, M_2, ..., M_{i−1}. The page replaced from level M_{i−1} is guaranteed to find space at level M_i, since a page frame was vacated by x_t. When page x_t is accessed from the backing store M_H, a page is displaced from each of the levels M_1, M_2, ..., until a vacant page frame is found. Note that positions of pages in the hierarchy, and therefore the access frequencies, do not depend on the structure of upward data paths to the local store, but depend only on the replacement algorithm and the capacity at each level.
We have shown that when a stack replacement algorithm is used for a two-level hierarchy, the top C_1 pages of the stack are the contents of a buffer of capacity C_1, as shown in Figure 18A. Let us now assume that the replacement algorithm for a multilevel hierarchy induces a priority list at every time and that this list determines the replacement decisions at every level of the hierarchy. If this is true, then for any number of levels and any set of capacities C_1, C_2, ..., C_H, the contents of each level at any time can be determined from the stack for this replacement algorithm. More precisely, let B_t^i(C_i) denote the contents of level M_i at time t, and let σ_i denote the sum C_1 + C_2 + ... + C_i (with σ_0 = 0). We then claim that

B_t^i(C_i) = B_t(σ_i) − B_t(σ_{i−1})    for i = 1, 2, ..., H − 1    (24)

or, equivalently, that B_t^1(C_1) can be identified as the first C_1 entries of stack S_t, B_t^2(C_2) as the next C_2 entries, and so on. This result is illustrated for a four-level hierarchy in Figure 18B.

Figure 18 Relationship between stack and hierarchy levels (A: two-level hierarchy; B: multilevel hierarchy)
The main elements of the proof of this result are as follows. Assume that Equation 24 is satisfied at time t − 1, and that page x_t = s_{t−1}(Δ_t) is an element of B_{t−1}^g(C_g) (i.e., level M_g is accessed). As stack S_{t−1} is updated to stack S_t, page y_t(C_1) is removed from the top C_1 elements of S_{t−1}, with the result that pages s_t(1), ..., s_t(C_1) represent B_t^1(C_1). Now observe that page y_t(C_1 + C_2) is removed from the top C_1 + C_2 elements of S_{t−1}. In terms of the hierarchy, we know that y_t(C_1) is pushed to the next lower level M_2, since the hierarchy is a linear one. The replacement algorithm then selects a page from y_t(C_1) + B_{t−1}^2(C_2) for removal from M_2. Since page y_t(C_1) has lowest priority in B_{t−1}^1(C_1), the page selected for removal has lowest priority in B_{t−1}^1(C_1) + B_{t−1}^2(C_2). But this page is y_t(C_1 + C_2), so that s_t(1), ..., s_t(C_1 + C_2) represent B_t^1(C_1) + B_t^2(C_2), and thus s_t(C_1 + 1), ..., s_t(C_1 + C_2) represent B_t^2(C_2).

A similar argument applies to subsequent levels M_i, where 2 < i ≤
g − 1. Page y_t(σ_{i−1}) is pushed from level M_{i−1} of the hierarchy, and competes with the pages in B_{t−1}^i(C_i). The replacement algorithm selects for replacement the page

min [y_t(σ_{i−1}), B_{t−1}^i(C_i)] = min [B_{t−1}(σ_i)] = y_t(σ_i)

with the result that

B_t(σ_i) = B_t^1(C_1) + B_t^2(C_2) + ... + B_t^i(C_i)

and

B_t^i(C_i) = B_t(σ_i) − B_t(σ_{i−1})

At level M_g, the page y_t(σ_{g−1}) that has been pushed from M_{g−1} finds a vacant page frame, and all lower levels remain unchanged. Then

B_t^g(C_g) = B_{t−1}^g(C_g) + y_t(σ_{g−1}) − x_t = B_t(σ_g) − B_t(σ_{g−1})

and

B_t^j(C_j) = B_{t−1}^j(C_j) = B_t(σ_j) − B_t(σ_{j−1})    for j > g

Thus we have shown that Equation 24 is satisfied at time t.
The significance of this result is that a stack distance Δ, where C_1 + ... + C_{g−1} < Δ ≤ C_1 + ... + C_g, corresponds to an access to hierarchy level M_g, and the relative number of such Δ's is simply the access frequency F_g to that level. Thus

F_g = Σ_{Δ=σ_{g−1}+1}^{σ_g} n(Δ)/L = F(σ_g) − F(σ_{g−1})    for 1 ≤ g ≤ H − 1    (25)

As with two-level hierarchies, all other accesses are directed to the backing store, so that

F_H = 1 − Σ_{g=1}^{H−1} F_g
Figure 19 Obtaining access frequencies from the success function

The determination of access frequencies is illustrated graphically in Figure 19 for a four-level hierarchy. Note that the technique illustrated in the figure cannot be used for an arbitrary hierarchy or success function. However, the technique can be used for any linear hierarchy as long as the replacement algorithm always induces a single priority list for all hierarchy levels.
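Equation 25 translates directly into code. The Python sketch below (ours) takes the two-level success function as a callable F(C), with F(0) understood to be zero, together with the capacities of the levels above the backing store.

```python
def level_access_frequencies(F, capacities):
    """Access frequency to each level of a linear hierarchy (Equation 25).
    F maps a capacity to the two-level success function value, with F(0) = 0;
    capacities = [C_1, ..., C_{H-1}] for the levels above the backing store."""
    freqs, sigma_prev = [], 0
    for C in capacities:
        sigma = sigma_prev + C                  # σ_g = C_1 + ... + C_g
        freqs.append(F(sigma) - F(sigma_prev))  # F_g = F(σ_g) − F(σ_{g-1})
        sigma_prev = sigma
    freqs.append(1 - sum(freqs))                # F_H: remaining accesses go to the backing store
    return freqs
```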
Our treatment of multilevel linear hierarchies can be extended to include hierarchies with congruence mapping functions. We assume that the same class length α is used for every level and that D_i page frames are allocated to each congruence class at level M_i. The total capacity of level M_i is then

C_i = 2^α · D_i    where 1 ≤ i ≤ H    (26)
Using the success function F^α(C) and Equations 25 and 26, we obtain the access frequency F_g^α for each level as

F_g^α = F^α(σ_g) − F^α(σ_{g−1})    where σ_g = 2^α (D_1 + ... + D_g)
updated, but a store distance is recorded. The two distance distributions, for fetches and for stores, can then be used to determine the fetch and store access frequencies to each level of the hierarchy. It should be clear that this technique also works if congruence mapping is included.
We can also consider a modified fetch-store design where the page
usage statistics are updated for a store operation even though no
page motion occurs. This change is incorporated by updating the
priority list for both fetches and stores. Thus, for modified fetch-
stores, the net change in our model is that the stack is not updated
for store operations.
Besides distinguishing fetches from stores, a computer system may
also distinguish the various sources of store requests. For example,
a "call-back" feature can be used by which a page in the buffer
is moved to the backing store if the page is stored into by an I/O
device. The motivation here is to free the buffer of pages not needed
by the CPU, and to service all I/O stores from the backing store.
For a call-back hierarchy, the generator must specify at least two
kinds of references: CPU references, and stores from the I/O channel.
Stack processing techniques can then be modified as follows. When a CPU store or fetch occurs, the stack is updated in the normal way (except for special entries to be described later), and the corresponding CPU distance counter is incremented. When an I/O store occurs, say at time t, a separate I/O-store distance counter is incremented. If page x_t does not occur on stack S_{t−1}, then S_t is equal to S_{t−1}. If page x_t does occur on stack S_{t−1}, then S_t = S_{t−1} except that x_t is replaced by the special entry "#". This entry, counted for all stack distance measurements, represents the empty page frame caused by page x_t returning to the backing store. To ensure that empty page frames are filled as soon as possible, all #-entries are assigned the lowest priority in replacement decisions.

The call-back feature can be used in conjunction with the fetch-store or modified fetch-store schemes. In all cases, the correctness of the modified stack processing techniques can be established.
Since stack processing allows a large sample of "typical" address tapes to be analyzed, for many hierarchy models, the efficiency gained at the early stages of hierarchy design may be great enough to impact the whole design process. More of these traces can be processed in a given time, and more hierarchy designs can be evaluated for a given number of traces. The availability of this data may help justify the "typical"-trace approach to design, or may help in the development of other models for system requirements. As an example, program models can be more deeply investigated by evaluating both a program and its model under a very large number of address traces. Improvement in program modeling, in turn, may enhance the success of analytical disciplines that use these models, such as storage interference studies for multiprogrammed systems.
paging) another replacement algorithm exists that uses demand paging and causes the same or a fewer total number of pages to be loaded into the buffer. This result is used to show that OPT is an optimal replacement algorithm and, in fact, that OPT causes the minimum total number of pages to be loaded into the buffer. Finally, it is shown that the success function under OPT for any trace is identical to the success function under OPT for the reverse of the trace.
Definition
|S| denotes the number of elements in a set S.
|a|_X denotes the number of occurrences of a symbol a in a
sequence X.
A = {a, b, . . .} is a finite set of N page addresses or pages.
X = x_1, x_2, . . . , x_L is a finite sequence of L elements from A,
and is called a trace.
B_t(C) ⊆ A denotes the contents of a buffer of capacity C at time
t, and is called a state.
Throughout this appendix, we consider a two-level storage hierarchy
with fixed buffer capacity C. Consequently, we use B_t instead of
B_t(C). The term B_t denotes the contents of the buffer immediately
after reference x_t is made; B_0 is called the initial buffer state; and φ,
the empty set, denotes an empty buffer state.
Definition
P = p_1, p_2, . . . , p_L is a finite sequence of L sets, p_t ⊆ A, called
an O-policy.
Q = q_1, q_2, . . . , q_L is a finite sequence of L sets, q_t ⊆ A, called
an I-policy.
A policy is a particular realization of a replacement algorithm for
a given trace. For such a trace and initial buffer state B_0, an I-policy
and an O-policy together determine the sequence of buffer states
that will occur during the trace. An I-policy gives the set of pages
loaded into the buffer, and an O-policy gives the set removed. If
p_t = φ, no page is removed, and if q_t = φ, no page is loaded in.
Note that only certain pairs of O- and I-policies are meaningful.
For example, a page cannot be removed if it is not in the buffer.
We consider only meaningful policies, where q_{t+1} ∩ B_t = φ and
p_{t+1} ⊆ B_t + q_{t+1}, for 0 ≤ t ≤ L - 1. In this case, B_{t+1} is obtained
from B_t by

B_{t+1} = [B_t + q_{t+1}] - p_{t+1}
Definition
Let X be a trace and B_0 (where |B_0| ≤ C) an initial state. A
sequence of states B = B_0, B_1, . . . , B_L is a valid sequence if x_t ∈ B_t
for 1 ≤ t ≤ L. A policy pair P and Q is a valid pair for X and B_0 if
application of the pair results in a valid sequence.
Note that valid policy pairs are quite general in that any number of
pages may be moved into or out of the buffer. However, most of
our attention is directed toward demand paging, where

|p_t| ≤ 1 and |q_t| ≤ 1
x_t ∈ B_{t-1} ⇒ p_t = q_t = φ          (A1)
p_t ≠ φ ⇒ q_t ≠ φ and |B_{t-1}| = C

for all t, 1 ≤ t ≤ L. Under demand paging, single pages are loaded
when necessary until the buffer fills; subsequently, page swaps
occur only when necessary.
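The policy machinery can be exercised directly. The sketch below applies an O-policy and an I-policy according to B_{t+1} = [B_t + q_{t+1}] - p_{t+1}, checks that the pair is meaningful and valid, and tests the demand-paging conditions of Equation A1; the trace, the policies, and all names are invented for the example, and the capacity check is included only for convenience.

```python
# A sketch of applying an O-policy and an I-policy to a trace and checking the
# demand-paging conditions of Equation A1.  The sample trace and policies are made up.

def run_policies(trace, B0, P, Q, capacity):
    """Return the state sequence B_0, ..., B_L; raise if a pair is not meaningful."""
    states = [set(B0)]
    for x, p, q in zip(trace, P, Q):
        B = states[-1]
        if q & B:
            raise ValueError("meaningless: loading a page already in the buffer")
        if not p <= (B | q):
            raise ValueError("meaningless: removing a page not in the buffer")
        nxt = (B | q) - p
        if len(nxt) > capacity:
            raise ValueError("buffer capacity exceeded")
        states.append(nxt)
    return states

def is_valid(trace, states):
    """x_t must be in B_t for every t (1-indexed in the text, 0-indexed here)."""
    return all(x in B for x, B in zip(trace, states[1:]))

def is_demand(trace, states, P, Q, capacity):
    """The three conditions of Equation A1."""
    for x, B_prev, p, q in zip(trace, states, P, Q):
        if len(p) > 1 or len(q) > 1:
            return False
        if x in B_prev and (p or q):
            return False
        if p and not (q and len(B_prev) == capacity):
            return False
    return True

trace = ["a", "b", "a", "c", "b"]
P = [set(), set(), set(), {"a"}, set()]          # pages removed at each time
Q = [{"a"}, {"b"}, set(), {"c"}, set()]          # pages loaded at each time
states = run_policies(trace, set(), P, Q, capacity=2)
print(states)
print(is_valid(trace, states), is_demand(trace, states, P, Q, capacity=2))
```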
One measure of goodness for a policy pair P and Q is the total
number of pages loaded into the buffer, Σ_{t=1}^{L} |q_t|, under the policy
pair. The following theorem supports the usefulness of demand
paging.
Theorem 1
Let P and Q be a valid policy pair for X and B_0. There exists a
valid demand policy pair P^D and Q^D for X and B_0 such that

Σ_{t=1}^{L} |q_t^D| ≤ Σ_{t=1}^{L} |q_t|
Proof. P^D and Q^D will be constructed by forming a sequence of
valid policy pairs (P^0, Q^0), (P^1, Q^1), (P^2, Q^2), . . . , (P^K, Q^K), where
P^0 = P, Q^0 = Q, P^K = P^D, Q^K = Q^D, and Σ_{t=1}^{L} |q_t^j| ≤ Σ_{t=1}^{L} |q_t^{j-1}|
for 1 ≤ j ≤ K. Informally, P^j and Q^j are constructed from P^{j-1} and
Q^{j-1} by altering p_t^{j-1} and q_t^{j-1} to satisfy the demand paging con-
straints, where p_t^{j-1} and/or q_t^{j-1} are the first occurrences of non-
demand paging in P^{j-1} and Q^{j-1}. This is done by "sliding" offending
elements of p_t^{j-1} and/or q_t^{j-1} to a later time in P^j and Q^j. If a ∈ p_t^j
and a ∈ q_t^j ever occurs, then we trivially remove page a from both
p_t^j and q_t^j. Clearly, this does not disturb the validity of P^j and Q^j
and only decreases the value of Σ_{t=1}^{L} |q_t^j|.

To construct P^j and Q^j from P^{j-1} and Q^{j-1}, 1 ≤ j ≤ K, let t be the
smallest time such that p_t^{j-1} and/or q_t^{j-1} do not satisfy Equation A1.
Set P^j = P^{j-1} and Q^j = Q^{j-1}, except as noted below. Suppose that
x_t = a and that q_t^{j-1}, for t < L, does not satisfy Equation A1. If
a ∉ q_t^{j-1}, then set q_t^j = φ and q_{t+1}^j = q_{t+1}^{j-1} + q_t^{j-1}. (Note that "+"
is defined here since the two sets are disjoint.) If a ∈ q_t^{j-1}, then set
q_t^j = a, and q_{t+1}^j = q_{t+1}^{j-1} + [q_t^{j-1} - a]. If t = L, then set q_t^j = φ if a ∉ q_t^{j-1},
or q_t^j = a if a ∈ q_t^{j-1}. In all cases, note that Q^j is valid, since
q_t^j ∩ B_{t-1}^j = φ for 1 ≤ t ≤ L, and that Σ_{t=1}^{L} |q_t^j| ≤ Σ_{t=1}^{L} |q_t^{j-1}|.
Now suppose that p_t^{j-1}, for t < L, does not satisfy Equation A1.
We observe first that |q_t^j| ≤ 1 and q_t^j = a if a ∉ B_{t-1}^j. If q_t^j = φ or
|B_{t-1}^j| < C, then set p_t^j = φ and p_{t+1}^j = p_{t+1}^{j-1} + p_t^{j-1}. If q_t^j = a and
|B_{t-1}^j| = C, set p_t^j = b for some b ∈ p_t^{j-1} and p_{t+1}^j = p_{t+1}^{j-1} +
[p_t^{j-1} - b]. (Note that p_t^{j-1} ≠ φ, since |B_{t-1}^j| = C and q_t^{j-1} ≠ φ.)
For t = L, set p_t^j = b ∈ p_t^{j-1} if q_t^j = a and |B_{t-1}^j| = C, or p_t^j = φ
otherwise. In all cases, we observe that P^j is valid, since p_t^j ⊆ B_{t-1}^j
for 1 ≤ t ≤ L. Since P^j and Q^j satisfy demand paging at least up
through time t, the desired demand policies must eventually be
obtained. Thus the theorem is proved.
Before considering an optimum replacement algorithm we make
two observations. First, under demand paging, a valid policy pair
P and Q can be completely represented by specifying just the O-
policy P. This follows from Equation A1 because q_t ≠ φ can only
occur when x_t = a and a ∉ B_{t-1} (in which case we know that
q_t = a). Second, for demand policies P and Q, we can use |φ|_P as
an alternative criterion of goodness. To see this, let u be the smallest
integer such that |B_t| = C for t ≥ u. Then |φ|_P is given by the following
expression:

|φ|_P = L - Σ_{t=u+1}^{L} |q_t| = L + Σ_{t=1}^{u} |q_t| - Σ_{t=1}^{L} |q_t|    (A2)

Since u in Equation A2 is not a function of the policies, Σ_{t=1}^{u} |q_t| is
a constant and

|φ|_P ≥ |φ|_{P'} if and only if Σ_{t=1}^{L} |q_t| ≤ Σ_{t=1}^{L} |q'_t|    (A3)
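These relations can be checked numerically. In the sketch below, an LRU demand policy is used only because it is easy to generate (any demand policy would do); the trace and the names are invented. It tallies |φ|_P, the load counts, and the time u at which the buffer first fills, and prints both sides of Equation A2; the equivalence in Equation A3 then follows because the first u loads are common to every demand policy starting from the same initial state.

```python
# A numerical check of the bookkeeping behind Equations A2 and A3.

def lru_demand_run(trace, capacity):
    """Simulate demand paging with LRU replacement from an empty initial state."""
    buffer, recency, P, Q = set(), [], [], []
    for x in trace:
        if x in buffer:
            P.append(set()); Q.append(set())
        else:
            if len(buffer) == capacity:
                victim = recency.pop(0)          # least recently used page
                buffer.remove(victim)
                P.append({victim})
            else:
                P.append(set())
            buffer.add(x)
            Q.append({x})
        if x in recency:
            recency.remove(x)
        recency.append(x)
    return P, Q

trace = list("abcabdacbdeab")
C = 3
P, Q = lru_demand_run(trace, C)
L = len(trace)
phi_P = sum(1 for p in P if not p)                   # |phi|_P
cumulative, cum_loads = 0, []
for q in Q:
    cumulative += len(q)
    cum_loads.append(cumulative)
u = next(t + 1 for t, s in enumerate(cum_loads) if s >= C)   # first time |B_t| = C
print(phi_P, L - sum(len(q) for q in Q[u:]))                 # both sides of Equation A2
print(phi_P, L + sum(len(q) for q in Q[:u]) - sum(len(q) for q in Q))
```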
optimum replacement algorithm

For a given trace X and initial state B_0 let us define an optimum
replacement policy pair P and Q as a pair that is valid and minimizes
Σ_{t=1}^{L} |q_t| over the class of valid policies. From Theorem 1 there
always exists an optimum policy pair which is also a demand policy
pair. Since Equation A3 holds for all demand policies, we can find an
optimum demand policy pair if we can find a demand policy P^0 such
that |φ|_{P^0} ≥ |φ|_P, where P is any demand policy.
Definition
Let X be a trace, and let a ∈ A be a page. The forward distance
d(a, x_t) to page a from page x_t is the number of distinct pages
occurring in x_{t+1}, . . . , x_e, where e is the smallest integer satisfying
e > t and x_e = a. If no such e exists, then d(a, x_t) = ∞.
Definition
Let X be a trace and B_0 an initial state. A valid demand policy P^0,
called an OPT policy, for X and B_0 is defined as follows. For t = 1, 2,
. . . , L, whenever p_t ≠ φ is required, then p_t = a, where a ∈ B_{t-1}
and d(a, x_t) ≥ d(b, x_t) for every page b ∈ B_{t-1}.
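The two definitions transcribe directly into code. The sketch below illustrates the definition only (it is not the paper's one-pass stack technique); ties in the maximum are broken arbitrarily, and the helper names and the sample trace are invented.

```python
# A direct transcription of the forward distance and the OPT removal rule.

def forward_distance(trace, t, a):
    """d(a, x_t): distinct pages in x_{t+1}, ..., x_e, where x_e is the next
    reference to a after time t; float('inf') if a is not referenced again."""
    seen = set()
    for e in range(t, len(trace)):           # trace[e] is x_{e+1} in 1-based terms
        if trace[e] == a:
            return len(seen) + 1             # the page a itself is counted
        seen.add(trace[e])
    return float("inf")

def opt_demand_policy(trace, B0, capacity):
    """Return the O-policy of an OPT policy for the trace and initial state B0."""
    buffer, P = set(B0), []
    for t, x in enumerate(trace, start=1):
        if x in buffer:
            P.append(set())
        else:
            if len(buffer) == capacity:
                # Remove a page whose forward distance from x_t is maximal.
                victim = max(buffer, key=lambda b: forward_distance(trace, t, b))
                buffer.remove(victim)
                P.append({victim})
            else:
                P.append(set())
            buffer.add(x)
    return P

trace = list("abcabdacbd")
P = opt_demand_policy(trace, set(), capacity=3)
print(P, sum(1 for p in P if not p), "phi entries")
```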
which can also be written as follows:

Note that this is the same form as Equation A4, with T_0 replaced by
[T_0 + {a} - {c}] and a replaced by c. If d(c, x_{i_a+1}) < d(b, x_{i_a+1}),
then we have a situation identical to that in the statement of Lemma
1, where X now is x_{i_a+1}, . . . , x_L. Setting p'_k = p_k for 1 ≤ k ≤ i_a - 1
and p'_{i_a} = φ, we again consider Cases 1, 2, and 3. Since the "new"
X is strictly shorter than the original X, this situation can only occur
a finite number of times. Note that P' is valid as far as it is specified
and that p'_1, . . . , p'_{i_a} contains one more φ than p_1, . . . , p_{i_a}.
If d(c, x_{i_a+1}) > d(b, x_{i_a+1}), we set p'_k = p_k for 1 ≤ k ≤ i_a - 1
and p'_{i_a} = φ, and consider two more cases. First, if p_θ = b, where p_θ
is the first occurrence of b in P and θ < i_b, we set p'_k = p_k for
i_a + 1 ≤ k ≤ L and k ≠ θ, and p'_θ = c. Here B'_t = B_t for θ ≤ t ≤ L,
and as in Case 1, we see that |φ|_{P'} ≥ |φ|_P still holds. Second, if p_ℓ ≠
b for ℓ < i_b, we set p'_k = p_k for i_a + 1 ≤ k ≤ L and k ≠ i_b, and
p'_{i_b} = c. Again we have B'_t = B_t for i_b ≤ t ≤ L, but we note that
p_{i_b} = φ, whereas p'_{i_b} = c ≠ φ. However, since p_{i_a} ≠ φ and
p'_{i_a} = φ, the relation |φ|_{P'} ≥ |φ|_P still holds.
Case 3B. p_{i_a} = φ. Since q_{i_a} = a, we observe that |B_{i_a-1}| < C.
Let θ be the smallest integer such that p_θ ≠ φ. If no such integer
exists, then let θ = L + 1. We set p'_k = p_k for 1 ≤ k < i_a and con-
sider two cases. First, if i_b < θ, then we set p'_k = p_k for i_a + 1 ≤
k ≤ L. Note that Q' = Q except at times i_a and i_b. Since |B'_t| = |B_t|
for i_b ≤ t ≤ L, we see that P' is valid, and |φ|_{P'} = |φ|_P since P' =
P. Second, for the case i_b > θ, note that x_θ = c, where c ≠ a and
c ≠ b. We set p'_k = p_k for i_a + 1 ≤ k ≤ L and k ≠ θ, and p'_θ = φ.
If p_θ = b, then |B'_t| = |B_t| for θ ≤ t ≤ L, and |φ|_{P'} = |φ|_P + 1 ≥
|φ|_P. If p_θ = a, then the buffer states at times θ - 1 and θ are:

B'_{θ-1} = T_{θ-1} + {a}          B'_θ = T_{θ-1} + {a} + {c}
B_{θ-1} = T_{θ-1} + {a} + {b}     B_θ = T_{θ-1} + {b} + {c}

Rewriting the buffer states at time θ as

B'_θ = [T_{θ-1} + {c}] + {a}
B_θ = [T_{θ-1} + {c}] + {b}

we arrive at a case similar to Case 3A. As in Case 3A, P' contains
one more φ than P in the interval t = 1, . . . , θ. Therefore, we treat
this case in the same way, with the result |φ|_{P'} ≥ |φ|_P. Finally, if
p_θ = d, where d ≠ a and d ≠ b, the buffer states at time θ can be
written as

B'_θ = [T_{θ-1} + {a} + {c} - {d}] + {d}
B_θ = [T_{θ-1} + {a} + {c} - {d}] + {b}

which again can be treated as in Case 3A.
Note that the situation where i_b = θ cannot arise in Case 3B, since
b ∈ B_{i_b-1}. We have therefore successfully exhausted the possible
cases, and Lemma 1 is proved.
Theorem 2
Let X be a trace, B_0 an initial state, and P a valid demand policy
for X and B_0. If P^0 is any valid OPT policy for X and B_0, then
|φ|_{P^0} ≥ |φ|_P.
Proof. We recall first that every OPT policy for X and B_0 has
exactly the same number of φ's. To prove the theorem, we need
only find any OPT policy P^0 such that |φ|_{P^0} ≥ |φ|_P. To do this we
will construct a finite sequence of policies P^1, P^2, . . . , P^j, where P^j
is an OPT policy and |φ|_P ≤ |φ|_{P^1} ≤ . . . ≤ |φ|_{P^j}.
P^1 is constructed as follows. Let i be the smallest integer such that
p_i ≠ p^0_i, where p^0_i is an element of an OPT policy. Suppose that
p_i = a and p^0_i = b. (Neither p_i nor p^0_i can be φ, since both are
demand policies.) We observe that

B_i = T_i + {b}
B^0_i = T_i + {a}     for a, b ∉ T_i

where d(a, x_i) ≤ d(b, x_i). Since x_i ≠ a and x_i ≠ b, it follows that
d(a, x_{i+1}) ≤ d(b, x_{i+1}). Treating B_i as B_0, B^0_i as B'_0, and x_{i+1}, . . . , x_L
as X, we can use Lemma 1 to find a policy p'_{i+1}, . . . , p'_L that contains
at least as many φ's as p_{i+1}, . . . , p_L. We then define P^1 = p^1_1, . . . ,
p^1_L as

p^1_k = p_k,   1 ≤ k ≤ i - 1
p^1_k = b,     k = i
p^1_k = p'_k,  i + 1 ≤ k ≤ L

Note that P^1 is valid and that |φ|_P ≤ |φ|_{P^1}. Furthermore, p^1_k = p^0_k,
1 ≤ k ≤ θ_1, for some θ_1 ≥ i.
Policy P^2 is constructed from P^1 in a similar manner, with the results
that p^2_k = p^0_k, 1 ≤ k ≤ θ_2, where θ_2 > θ_1, and |φ|_{P^1} ≤ |φ|_{P^2}. Since X is
finite, construction of P^1, P^2, . . . must result in P^j, for finite j, where
p^j_k = p^0_k, 1 ≤ k ≤ L. It follows from |φ|_P ≤ |φ|_{P^1} ≤ . . . ≤ |φ|_{P^j}
that |φ|_P ≤ |φ|_{P^j}, where P^j is an OPT policy, and the theorem is
proved.
Combining the relation in Equation A3 for demand paging with
Theorems 1 and 2, we have the following theorem.
Theorem 3
(OPT minimizes page loading)
Let X be a trace, B_0 an initial state, and P^0 a valid OPT policy.
(Also, let Q^0 be the corresponding I-policy.) For any valid policy
pair P and Q,

Σ_{t=1}^{L} |q^0_t| ≤ Σ_{t=1}^{L} |q_t|

Thus we see that an OPT policy results in a minimum number of
pages being loaded into the buffer over the class of all valid policies.
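Theorem 3 can be spot-checked on a small trace. The sketch below counts total loads under the OPT removal rule and under LRU, which serves only as a convenient competing demand policy, for a few buffer capacities; by the theorem the OPT figure can never be the larger one. The trace and the function names are invented, and the check illustrates the theorem rather than proving it.

```python
# A spot check of Theorem 3: total pages loaded under OPT versus LRU.

def total_loads(trace, capacity, choose_victim):
    buffer, loads = set(), 0
    for t, x in enumerate(trace, start=1):
        if x not in buffer:
            loads += 1
            if len(buffer) == capacity:
                buffer.remove(choose_victim(trace, t, buffer))
            buffer.add(x)
    return loads

def opt_victim(trace, t, buffer):
    def next_use(b):                          # index of the next reference to b
        for e in range(t, len(trace)):
            if trace[e] == b:
                return e
        return float("inf")
    return max(buffer, key=next_use)          # farthest next reference goes out

def lru_victim(trace, t, buffer):
    def last_use(b):                          # most recent reference to b before x_t
        for e in range(t - 2, -1, -1):
            if trace[e] == b:
                return e
        return -1
    return min(buffer, key=last_use)          # least recently used page goes out

trace = list("abcdabeabcdeab")
for C in (2, 3, 4):
    print(C, total_loads(trace, C, opt_victim), total_loads(trace, C, lru_victim))
```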
After giving preliminary Lemmas 2 and 3, we present a final theorem
concerning OPT policies.
Lemma 2
For a trace X, let the set B_C represent the first C distinct pages
referenced in X. For a buffer of capacity C, if P is a valid demand
policy for X and some B_0 ⊆ B_C, then P is a valid demand policy
for X and any B'_0 ⊆ B_C.
Proof. Let i be the smallest integer such that x_1, . . . , x_i contains
C distinct pages. If B_0 ⊆ B_C then, for any valid demand policy P,
we have B_i = B_C, since p_1 = p_2 = . . . = p_i = φ. For B'_0 ⊆ B_C this
also holds, so P is a valid demand policy for X and B'_0. (Note that
for different initial states B_0 ⊆ B_C, the Q policies will not be the
same.)
Lemma 3
For a trace X, let the set E_C represent the last C distinct pages
referenced in X. For a buffer of capacity C, if P is a valid demand
policy for X and B_0, there exists a valid demand policy P' with a
state sequence B_0, B'_1, B'_2, . . . , B'_L such that B'_L = E_C and |φ|_{P'} ≥
|φ|_P.
Proof. Let i be the smallest integer such that x_i, . . . , x_L contains
C distinct pages. Suppose, under policy P, that B_{i-1} contains n
elements of E_C, i.e., |B_{i-1} ∩ E_C| = n. It follows that at least C - n
pages will be loaded into the buffer following time i - 1. Setting
p'_k = p_k for 1 ≤ k ≤ i - 1, we will specify the remainder of P' in
such a way that exactly C - n pages are loaded into the buffer
following time i - 1. We observe that, since at most C distinct pages
are referenced following time i - 1, we never need remove a page b
from the buffer where b ∈ E_C. Thus, if a page must be removed at
time ℓ, for i ≤ ℓ ≤ L, there always exists a page c, where c ∉ E_C, in
the buffer, and we set p'_ℓ = c. If P' is constructed in this manner,
then Σ_{t=1}^{L} |q'_t| ≤ Σ_{t=1}^{L} |q_t|,
and from Equation A3 we have |φ|_{P'} ≥ |φ|_P. Furthermore, since
no page in E_C is ever removed from the buffer following time t = i
and |E_C| = C, we see that B'_L = E_C.
Theorem 4
(forward/backward OPT)
Let X = x_1, . . . , x_L be a trace and rX = x_L, . . . , x_1 its reverse.
If P^0 is an OPT policy for X and B_0 = φ, and rP^0 is an OPT policy
for rX and rB_0 = φ, then |φ|_{P^0} = |φ|_{rP^0}.
Proof. Let us assume that the theorem does not hold. Thus,
without loss of generality, suppose that |φ|_{rP^0} = |φ|_{P^0} + k, where k
is an integer and k > 0. If D distinct pages are referenced in X (and
in rX) and if D ≤ C, the buffer capacity, then we have an immediate
contradiction, since |φ|_{P^0} = |φ|_{rP^0} = L. We therefore assume
D > C.

Let us denote the state sequence under P^0 as B_0, B_1, . . . , B_L. From
Lemma 2 we can set B_0 = B_C without disturbing the validity of P^0.
From Lemma 3 we can alter P^0 such that B_L = E_C. Note that the
altered policy contains the same number of φ's as P^0, since P^0 is an
OPT policy. (We subsequently refer to the altered policy as P^0.)
Similarly, if rB_0, rB_1, . . . , rB_L is the state sequence under rP^0, we can
assume that rB_0 = rB_C and rB_L = rE_C.

Consider now the state sequence rB_L, rB_L, rB_{L-1}, . . . , rB_2, rB_1. Since
x_L ∈ rB_1, x_{L-1} ∈ rB_2, . . . , x_2 ∈ rB_{L-1}, x_1 ∈ rB_L, we see that this
sequence is a valid (not necessarily demand) sequence for the trace X.
Let us denote the corresponding valid policy pair as P' and Q'. We
observe first that, since rE_C = B_C, we have rB_L = B_C = B_0. Thus P'
and Q' (as well as P^0) are valid policies for X and B_0. Next we
observe that rB_L = rB_{L-1} + {rq^0_L} - {rp^0_L} can be written as
rB_{L-1} = rB_L + {rp^0_L} - {rq^0_L}. But we also have rB_{L-1} = rB_L +
{q'_2} - {p'_2}, which yields q'_2 = rp^0_L and p'_2 = rq^0_L, since rp^0_L ∩ rq^0_L =
φ. Similarly, since rB_{L-2} = rB_{L-1} + {rq^0_{L-1}} - {rp^0_{L-1}}, we have
q'_3 = rp^0_{L-1} and p'_3 = rq^0_{L-1}. Continuing in this manner we can
show that

q'_t = rp^0_{L+2-t}
p'_t = rq^0_{L+2-t}     for 2 ≤ t ≤ L     (A5)
Now, since x_L ∈ rB_1 (recall that rB_0 = rB_C), it follows that
rp^0_1 = rq^0_1 = φ. Similarly, since x_1 ∈ B_0 (recall that B_0 = B_C), it
follows that p'_1 = q'_1 = φ. We can then trivially assume that p'_1 =
rq^0_1 and q'_1 = rp^0_1. The significance of this is that, using Equation A5,
we have established a one-to-one correspondence between P' and
rQ^0, and between Q' and rP^0. In particular, |φ|_{P'} = |φ|_{rQ^0} and
|φ|_{Q'} = |φ|_{rP^0}. We now observe that |φ|_{rQ^0} = |φ|_{rP^0}, since |rB_0| =
|rB_1| = . . . = |rB_L| = C. In other words, rp^0_t = φ if and only if
rq^0_t = φ.
Recall that P' and Q' are not necessarily demand policies. From
Theorem 1 we can find a demand policy pair P'' and Q'' such that

Σ_{t=1}^{L} |q''_t| ≤ Σ_{t=1}^{L} |q'_t|

From Equation A5 and the discussion that follows it, we know that

Σ_{t=1}^{L} |q'_t| = Σ_{t=1}^{L} |rp^0_t| = L - |φ|_{rP^0}

P'' and Q'' are demand policies, and since |B''_0| = |B''_1| = . . . = |B''_L| = C,
we have |p''_t| = |q''_t| for 1 ≤ t ≤ L. Combining these results yields

|φ|_{P''} = L - Σ_{t=1}^{L} |q''_t| ≥ L - Σ_{t=1}^{L} |q'_t| = |φ|_{rP^0}

But then we have |φ|_{P''} ≥ |φ|_{rP^0} = |φ|_{P^0} + k. Since P^0
was given as an OPT policy, we have from Theorem 2 a contradiction
with |φ|_{P''} > |φ|_{P^0} for the demand policy P''. Thus our original
assumption is false, and it must be the case that |φ|_{rP^0} = |φ|_{P^0}.
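The forward/backward property can be exercised directly. With an empty initial state the number of φ entries and the number of pages loaded determine each other, so equal φ-counts mean equal load counts; the sketch below therefore compares loads under the OPT rule on a few random traces and on their reverses. The page set, the traces, and the names are invented, and the check is an illustration rather than a proof.

```python
# A spot check of Theorem 4: OPT load counts on a trace and its reverse agree.
import random

def opt_loads(trace, capacity):
    """Pages loaded under demand paging with OPT replacement, starting empty."""
    buffer, loads = set(), 0
    for t, x in enumerate(trace, start=1):
        if x in buffer:
            continue
        loads += 1
        if len(buffer) == capacity:
            def next_use(b):
                for e in range(t, len(trace)):
                    if trace[e] == b:
                        return e
                return float("inf")
            buffer.remove(max(buffer, key=next_use))
        buffer.add(x)
    return loads

random.seed(1)
for _ in range(5):
    trace = [random.choice("abcdefg") for _ in range(40)]
    for C in (2, 3, 4):
        assert opt_loads(trace, C) == opt_loads(list(reversed(trace)), C)
print("forward and backward OPT load counts agree on all sampled traces")
```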