Workload Characterization
Performance Professionals
The Computer Measurement Group, commonly called CMG, is a not for profit, worldwide organization of data processing professionals committed to the
measurement and management of computer systems. CMG members are primarily concerned with performance evaluation of existing systems to maximize
performance (e.g., response time, throughput) and with capacity management where planned enhancements to existing systems or the design of new
systems are evaluated to find the necessary resources required to provide adequate performance at a reasonable cost.
This paper was originally published in the Proceedings of the Computer Measurement Group’s 1986 International Conference.
Copyright 1986 by The Computer Measurement Group, Inc. All Rights Reserved. Published by The Computer Measurement Group, Inc. (CMG), a non-profit
Illinois membership corporation. Permission to reprint in whole or in any part may be granted for educational and scientific purposes upon written application to
the Editor, CMG Headquarters, 151 Fries Mill Road, Suite 104, Turnersville, NJ 08012.
BY DOWNLOADING THIS PUBLICATION, YOU ACKNOWLEDGE THAT YOU HAVE READ, UNDERSTOOD AND AGREE TO BE BOUND BY THE
FOLLOWING TERMS AND CONDITIONS:
License: CMG hereby grants you a nonexclusive, nontransferable right to download this publication from the CMG Web site for personal use on a single
computer owned, leased or otherwise controlled by you. In the event that the computer becomes dysfunctional, such that you are unable to access the
publication, you may transfer the publication to another single computer, provided that it is removed from the computer from which it is transferred and its use
on the replacement computer otherwise complies with the terms of this Copyright Notice and License.
Copyright: No part of this publication or electronic file may be reproduced or transmitted in any form to anyone else, including transmittal by e-mail, by file
transfer protocol (FTP), or by being made part of a network-accessible system, without the prior written permission of CMG. You may not merge, adapt,
translate, modify, rent, lease, sell, sublicense, assign or otherwise transfer the publication, or remove any proprietary notice or label appearing on the
publication.
Disclaimer; Limitation of Liability: The ideas and concepts set forth in this publication are solely those of the respective authors, and not of CMG, and CMG
does not endorse, approve, guarantee or otherwise certify any such ideas or concepts in any application or usage. CMG assumes no responsibility or liability
in connection with the use or misuse of the publication or electronic file. CMG makes no warranty or representation that the electronic file will be free from
errors, viruses, worms or other elements or codes that manifest contaminating or destructive properties, and it expressly disclaims liability arising from such
errors, elements or codes.
General: CMG reserves the right to terminate this Agreement immediately upon discovery of violation of any of its terms.
Learn the basics and latest aspects of IT Service Management at CMG's Annual Conference - www.cmg.org/conference
WORKLOAD CHARACTERIZATION
Gary M. King
International Business Machines Corporation
P.O. Box 390, Poughkeepsie, NY 12602
Buy the Latest Conference Proceedings and Find Latest Computer Performance Management 'How To' for All Platforms at www.cmg.org
ABSTRACT
Workload characterization is an important early step in managing the performance of a system. A knowledge of the
resource requirements of each component of the workload assists in both the current tuning and the future capacity
planning of a system. The workload characterization process includes: defining the major components of the
workload, identifying the items which characterize resource usage, gathering measurement data and generating
the characterization. The process is not an exact science and sometimes must allow for some "artful" judgements.

Packages which provide "automatic" characterizations by analyzing SMF/RMF or other data are often "inherited" by
new systems programmers and are also available commercially. An appreciation of the usefulness and accuracy
of these tools can only be obtained by "getting your feet wet" with some of the same data and techniques used by
these tools. This paper intends to take you ankle deep into the process by presenting an example characterization
of a TSO/Batch workload.

The three major resources of a central computing system are the processor, storage and I/O. A workload
characterization should quantify the use of each of these resources by each component of the workload. Typical items
that characterize the use of these resources include: CPU time per transaction, working set size, paging and
database I/O.

The performance of a workload component is determined by its ability to obtain the resources it requires.
Therefore, in addition to the quantification of resource demand, the factors that influence the rate at which
resources can be consumed must also be understood as part of a complete workload characterization. Examples of
influencing factors are: CPU queuing delay (function of relative dispatching priorities), paging delay (function
of paging configuration and total paging rate) and database I/O time (function of device types and I/O rates).
SOURCES OF DATA
Generally, a workload characterization should concentrate on data obtained during the peak load hour(s) of the prime
production shift. Tuning and planning for the workload active during this time is of most concern; using daily
or weekly averages can mask what the system is expected to deliver during peak load times. To gather the data,
RMF should be set to report at 30 or 60 minute intervals, interval recording for SMF type 30 records should be set
to 30 minutes,

THE SCIENCE AND ART OF

The characterization of resource consumption of each workload component is summarized in the following chart.
The data and techniques used to fill in each row in the chart will be discussed in detail.

CHARACTERIZATION EXAMPLE ...
             TRIV     NON-TRIV   BATCH
TRANS/SEC    18.90    1.51       .069
Join over 14,000 peers - subscribe to free CMG publication, MeasureIT(tm), at www.cmg.org/subscribe
SEPARATING TRIVIAL FROM NON-TRIVIAL
IN RMF MONITOR I WORKLOAD ACTIVITY
NON-TRIVIAL TRANS. TRANSITION
THROUGH TRIVIAL'S DOMAIN.
ENDED TRANSACTIONS REFLECT

DOMAIN   NUMBER OF   ENDED          AVG TRANS TIME
NUMBER   SWAPS       TRANSACTIONS   HHH.MM.SS.TTT
001      38          81             000.00.19.582
001      86          27             000.07.24.131
CPU TIME PER TRANSACTION
- CALCULATE TCB+SRB TIME FOR EACH
  WORKLOAD COMPONENT.
- CALCULATE TOTAL CPU TIME CONSUMED
- CALCULATE "OTHER" CPU TIME
- DISTRIBUTE "OTHER" TIME AMONG
  THE WORKLOAD COMPONENTS
  - NOT BAD ==> BY TCB+SRB TIME RATIO
  - BEST ==> BY I/O ACTIVITY RATIO

All TSO transactions "begin" as trivial TSO transactions, that is, all TSO transactions start in the first period.
Non-trivial transactions are recognized only after consuming more resource than is allowed under the definition
of a trivial transaction, at which time they are switched to the second period of the TSO performance group. The
service units accumulated by non-trivial transactions while they were considered trivial are accounted for in
the first period, thereby inflating what is reported in the trivial service statistics. For an accurate
depiction of TSO, some of the service must be shifted from trivial to non-trivial transactions.
Therefore, each non-trivial transaction has about 400+80=480 service units which must be removed from
trivial's statistics and added to its own statistics. The 480 service units per non-trivial transaction that
must be shifted are composed of four different components of service: IOC, CPU, MSO and SRB. It is assumed that
non-trivial transactions consume the different types of service in the same proportion throughout the life of the

SUM OF NON-TRIVIAL DOMAINS' SERVICE
                   % OF TOTAL
IOC = 1940665      19.9
CPU = 4128773      42.4
MSO = 3347236      34.3
SRB = 330409       3.4
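
The proportional split described above can be sketched in a few lines of Python. This is only an illustration under the stated assumption that the 480 service units follow the same IOC/CPU/MSO/SRB mix as the non-trivial domains' totals; the variable names are not from the paper.

```python
# Service totals for the non-trivial domains, taken from the chart above.
service = {"IOC": 1940665, "CPU": 4128773, "MSO": 3347236, "SRB": 330409}

# Service units to shift per non-trivial transaction (400 + 80, from the text).
SHIFT_PER_TRANS = 480

total = sum(service.values())

# Fraction of total service contributed by each service type.
share = {name: units / total for name, units in service.items()}

# Assumed split of the 480 service units, using the same proportions.
shift = {name: SHIFT_PER_TRANS * frac for name, frac in share.items()}

for name in service:
    print(f"{name}: {100 * share[name]:.1f}% of total, "
          f"{shift[name]:.1f} SU shifted per non-trivial transaction")
```

Multiplying each per-transaction figure by the number of non-trivial transactions in the interval would give the amount to move from the trivial statistics to the non-trivial statistics for that service type.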
TIME = S.U. / COEFF / SRM 3084 SU/SEC

CPU COEFF = 11.0
SRB COEFF = 10.0
SU/SEC = 346.1

TRIVIAL TSO TCB+SRB TIME
= (1020317/11+234691/10)/346.1
= 335.8 SECONDS
NON-TRIVIAL TSO TCB+SRB TIME
= (4680719/11+374669/10)/346.1
= 1337.7 SECONDS
BATCH TCB+SRB TIME
= (3450590/11+225073/10)/346.1
= 971.4 SECONDS

Once the service units are distributed properly among the workload components, the TCB and SRB times can be
calculated by converting the appropriate service units into time. A service unit to time conversion factor exists
for each CPU type. Recent versions of RMF provide the factor in the upper right hand corner of the WORKLOAD
ACTIVITY REPORT (see earlier example report). Otherwise, the "Initialization and Tuning Guide" provides a table
of the conversion factors.

TCB time can be found by taking the CPU service units, dividing by the CPU service definition coefficient and
dividing the result by the appropriate CPU service units per second conversion factor. SRB time is found in the
same manner using SRB service units. The service definition coefficients are given at the top of the WORKLOAD
ACTIVITY REPORT (see earlier example report). For our example, the TCB+SRB time was found to be 335.8 seconds
for trivial TSO, 1337.7 seconds for non-trivial TSO and 971.4 seconds for batch. These times represent the sum
of the time spent for all transactions of each type during the measurement interval.

TOTAL CPU TIME CONSUMED

RMF MONITOR I CPU ACTIVITY
AVERAGE CPU UTILIZATION EQUALS
100 - AVG WAIT TIME PERCENTAGE.

TOTAL CPU TIME CONSUMED EQUALS
AVERAGE CPU UTILIZATION TIMES
RMF INTERVAL TIME IN SECONDS
TIMES NUMBER OF CPUS.

AVG CPU BUSY = 100 - 30.23
             = 69.77%
TOT CPU TIME = .6977 * 1800 * 4
             = 5024.4 SECONDS

CPU ACTIVITY
INTERVAL 29.59.990

CPU NUMBER       WAIT TIME PERCENTAGE
0                21.84
1                26.23
2                32.82
3                40.04
TOTAL/AVERAGE    30.23

An average CPU busy of 69.77% can be interpreted as follows: on average, each CPU in the system was busy
69.77 hundredths out of each second. The total CPU time consumed during the interval is equal to the number of
CPUs times the average utilization times the number of seconds in the interval. Our example shows the total CPU
time consumed to be 5024.4 seconds during the measurement interval.

CALCULATE "OTHER" CPU TIME

"OTHER" ACCOUNTS FOR
- UNCAPTURED TIME
- TCB+SRB OF SYSTEM ADDR SPC

OTHER = TOTAL CPU TIME MINUS
WORKLOAD COMPONENT TCB+SRB TIME.

OTHER CPU TIME
= TOTAL - TRIV - NONTRIV - BATCH
= 5024.4 - 335.8 - 1337.7 - 971.4
= 2379.5 SECONDS

The difference between the total CPU time consumed and the TCB+SRB time attributable to the workload components
during the measurement interval is the hidden or "other" CPU time. This time represents two major categories of
CPU consumption. The largest category is "uncaptured" time: time not charged to any address space's TCB or SRB
statistics. The uncaptured time includes CPU consumed by various system services that cannot easily be charged
to the requesting address space, for example, I/O interrupt handling. The second category of "other" CPU time
is the TCB+SRB time accounted to system address spaces such as JES and VTAM. These address spaces are performing
services on behalf of the major workload components. In the example, the total "other" time is 2379.5 seconds.
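
The CPU time arithmetic of this section can be retraced with a short Python sketch. The function and variable names are illustrative assumptions; the inputs are the figures quoted in the example. Note that multiplying the rounded figures shown (.6977 * 1800 * 4) gives roughly 5023 seconds, so the sketch carries the paper's reported 5024.4 seconds forward as a given rather than recomputing it.

```python
# Coefficients and conversion factor quoted in the example.
CPU_COEFF = 11.0     # CPU service definition coefficient
SRB_COEFF = 10.0     # SRB service definition coefficient
SU_PER_SEC = 346.1   # service units per second for this CPU type

# (CPU service units, SRB service units) per workload component, from the example.
service_units = {
    "trivial TSO": (1020317, 234691),
    "non-trivial TSO": (4680719, 374669),
    "batch": (3450590, 225073),
}


def tcb_srb_seconds(cpu_su, srb_su):
    """Convert CPU and SRB service units into TCB+SRB seconds."""
    return (cpu_su / CPU_COEFF + srb_su / SRB_COEFF) / SU_PER_SEC


tcb_srb = {name: tcb_srb_seconds(*su) for name, su in service_units.items()}

# Average CPU busy from the per-CPU wait time percentages.
wait_pct = [21.84, 26.23, 32.82, 40.04]
avg_busy_pct = 100 - sum(wait_pct) / len(wait_pct)

# Total CPU seconds = utilization * interval seconds * number of CPUs.
INTERVAL_SECONDS = 1800
total_cpu_estimate = avg_busy_pct / 100 * INTERVAL_SECONDS * len(wait_pct)

# "Other" time, using the total reported in the paper and the rounded component times.
TOTAL_CPU_REPORTED = 5024.4
other = TOTAL_CPU_REPORTED - sum(round(t, 1) for t in tcb_srb.values())

for name, seconds in tcb_srb.items():
    print(f"{name}: {seconds:.1f} TCB+SRB seconds")
print(f"average CPU busy: {avg_busy_pct:.2f}%")
print(f"total CPU estimate from rounded inputs: {total_cpu_estimate:.1f} seconds")
print(f"other CPU time: {other:.1f} seconds")
```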
NOT BAD ==> BY TCB+SRB TIME RATIO

              TCB+SRB   % OF TOTAL
TRIVIAL       335.8     12.7
NON-TRIVIAL   1337.7    50.6
BATCH         971.4     36.7

              OTHER     % OF TOTAL
TOTAL         2379.5    100.0
TRIVIAL       302.2     12.7
NON-TRIVIAL   1204.0    50.6
BATCH         873.3     36.7

The "other" CPU time must be distributed among the workload components. My own studies have shown that the most
accurate distribution of this "other" time is made by the ratio of I/O activity among the workload components. For
example, if trivial TSO accounts for 20% of the I/O's in the system, it would be charged with 20% of the "other"
time. The I/O's for a workload component would include both paging and database (i.e. non-paging) I/O. For
this example, it is premature to distribute based on I/O since that part of the characterization has not yet been
addressed. Therefore, as a reasonable alternative, the "other" CPU time is distributed in the same proportion
as TCB+SRB time. An ambitious reader can try a distribution based on I/O at the conclusion of the example
characterization.

CONVERT COMPONENT CPU TIME
TO CPU TIME PER TRANSACTION

DIVIDE COMPONENT TIME BY NUMBER
OF TRANSACTIONS IN INTERVAL.

TRIVIAL TSO TCB+SRB CPU TIME/TRANS
= 335.8 / 34205
= .0098 SECS / TRANS
TRIVIAL TSO "OTHER" CPU TIME/TRANS
= 302.2 / 34205
= .0088 SECS / TRANS

ETC. FOR OTHER WORKLOAD COMPONENTS

The CPU time per transaction for each workload component can now be calculated by simply dividing the total time
for the component by the number of transactions completed during the measurement interval. TCB+SRB and "other" CPU
time are kept separate for use in analyzing the CPU queuing delay component of response time. CPU queuing
time is a function of the relative dispatching priorities of the workloads. In general, the "other" time across
all workload components runs at a higher priority than

RMF MONITOR I WORKLOAD ACTIVITY
RMF MONITOR I DEVICE ACTIVITY
RMF MON I PAGING/SWAPPING DS ACT.
SUM NON-PAGING DEVICE ACTIVITY
RATES ==> TOTAL DATABASE I/O
PER SECOND FOR SYSTEM

The I/O activity to the database volumes of the system must be distributed among the workload components. This
is accomplished by first determining the total non-paging I/O rate of the system and then distributing it via the
proportion of IOC service units among the workload components.

TOTAL DATABASE I/O PER SECOND

SUM TOTAL I/O'S AND SUBTRACT
PAGING I/O'S

SUM ACTIVITY RATE
ACROSS ALL LCU'S = 495 / SEC
SUM ACTIVITY RATE FOR
ALL PAGING DEVICES = 64 / SEC

TOTAL DATABASE I/O PER SECOND
= 495 - 64 = 431 / SEC

The I/O activity of the system is found in the RMF Monitor I DIRECT ACCESS DEVICE ACTIVITY report; a subset of an
example report is reproduced below. The DEVICE ACTIVITY RATE column lists the I/O's per second to each volume;
the rate for all the volumes on a LCU is summarized in the LCU row following each group of volumes. For example,
devices 260-267 make up LCU 018 and their total activity was 17.788 I/O's per second.

Summing all the LCUs' activity rates for the example system indicated a rate of 495 I/O's per second. However,
some of the volumes across some of the LCU's were used as paging devices. Since paging will be treated
separately in the characterization, the activity for these paging volumes must be removed from the total for the
system. Summing just the activity rate on the paging volumes yielded 64 paging I/O's per second. Thus, the
total database activity for the system was 495-64=431 I/O's per second.
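
The distribution of "other" CPU time by TCB+SRB ratio, the conversion to CPU time per transaction, and the database I/O subtraction described in this section can be sketched as follows. The names are illustrative assumptions. The percentages are rounded to one decimal place before being applied, which is an assumption that happens to reproduce the figures printed in the chart; only the trivial TSO transaction count (34205) is given in the example, so only that component is converted.

```python
# TCB+SRB seconds per workload component and total "other" time, from the example.
tcb_srb = {"trivial TSO": 335.8, "non-trivial TSO": 1337.7, "batch": 971.4}
OTHER_TOTAL = 2379.5

total_tcb_srb = sum(tcb_srb.values())

# Percentage of TCB+SRB time per component, rounded as printed in the chart.
pct = {name: round(100 * t / total_tcb_srb, 1) for name, t in tcb_srb.items()}

# "Other" time charged to each component in the same proportion.
other = {name: OTHER_TOTAL * p / 100 for name, p in pct.items()}

# CPU time per transaction for trivial TSO (34205 transactions in the interval).
TRIVIAL_TRANSACTIONS = 34205
trivial_tcb_srb_per_trans = tcb_srb["trivial TSO"] / TRIVIAL_TRANSACTIONS
trivial_other_per_trans = round(other["trivial TSO"], 1) / TRIVIAL_TRANSACTIONS

# Total database I/O rate: all LCU activity minus paging device activity.
total_database_io = 495 - 64

for name in tcb_srb:
    print(f"{name}: {pct[name]:.1f}% of TCB+SRB, {other[name]:.1f} s of other time")
print(f"trivial TSO TCB+SRB per transaction: {trivial_tcb_srb_per_trans:.4f} s")
print(f"trivial TSO other per transaction: {trivial_other_per_trans:.4f} s")
print(f"total database I/O per second: {total_database_io}")
```

Swapping `pct` for percentages derived from I/O activity would give the distribution the text calls the most accurate, once the I/O part of the characterization is complete.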
NUM    SERIAL   LCU   RATE     TIME
260    C45IOI   018   0.333    12
261    TSOO06   018   10.173   25
267    LARG05   018   0.000    0
LCU             018   17.788   22

USING MSO AND CPU, CALCULATE
AVERAGE REAL STORAGE FRAMES
PER TRANSACTION

AVG REAL STG FRAMES PER TRANSACTION
= 50
by its domain. Logically swapped users are recognized by an "LO" in the CL column. By averaging the RSF (Real
Storage Frames) for all logically swapped users for each workload component, the average logical swap size can be
found.

DIVIDE NUMBER OF SWAPS BY
ENDED TRANSACTIONS
                      TOTAL     TOTAL
TERMINAL        CT    36917     3382
INPUT/OUTPUT    RT    20.51     1.88
WAIT            %     98.7%     9.2%

LONG            CT    55        46
WAIT            RT    0.03      0.03
                %     0.1%      83.6%

DETECTED        CT    302       278
WAIT            RT    0.17      0.15
                %     0.8%      92.1%

TOTAL           CT    37410     3842
                RT    20.78     2.13
                %     100.0%    10.3%

AUX SWAPS PER TRANSACTION

APPLY AUX SWAP % TO WKLD COMPONENTS
- TRIVIAL AND NON-TRIVIAL TSO
  HAVE ONE TERMINAL WAIT SWAP
  AND REST ARE NON-TERMINAL WAIT

- TAKEN DIRECTLY FROM RMF
  MONITOR I SWAP PLACEMENT
  AVG PAGES PER SWAP IN
- ASSUMED TO BE THE SAME FOR
  ALL COMPONENTS

LONGER WAY
- USE RMF MONITOR II TO DERIVE
  AVERAGE FOR EACH WORKLOAD
  COMPONENT
- ASD REPORT: DMN, RSF=0, WSIN

Given that a swap did occur to auxiliary storage, how many pages were included in the swap group? The easiest way
to approximate this is to use the AVG PAGES PER SWAP IN number from the SWAP PLACEMENT report and assume it
applies equally among all workload components. A better, but longer, method makes use of the RMF Monitor II ASD
report. An example report appeared earlier in this paper. The WSIN column gives the size of the swap group.
Any address space that has 0 "RSF" (Real Storage Frames) has been swapped to auxiliary, thus its WSIN reflects its
current swap group size. By averaging the WSIN of all address spaces in each workload component (identified by
domain) that have "RSF" of 0, an accurate value for pages per auxiliary swap can be derived.
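
The "longer way" described above amounts to a filtered average per domain. The Python sketch below illustrates it on made-up rows: the job names, domain numbers, and RSF/WSIN values are hypothetical and are not taken from any report in this paper, and the tuple layout is an assumption about how ASD report rows might be transcribed.

```python
from collections import defaultdict

# Hypothetical ASD report rows: (jobname, domain, RSF, WSIN). Illustrative only.
asd_rows = [
    ("USERA", 1, 0, 40),
    ("USERB", 1, 120, 35),   # still holds real storage frames, so it is skipped
    ("USERC", 1, 0, 60),
    ("JOBX", 3, 0, 90),
    ("JOBY", 3, 0, 110),
]


def avg_pages_per_aux_swap(rows):
    """Average WSIN per domain over address spaces whose RSF is 0."""
    wsin_by_domain = defaultdict(list)
    for _jobname, domain, rsf, wsin in rows:
        if rsf == 0:  # RSF of 0 means the address space is swapped to auxiliary
            wsin_by_domain[domain].append(wsin)
    return {domain: sum(v) / len(v) for domain, v in wsin_by_domain.items()}


pages_per_swap = avg_pages_per_aux_swap(asd_rows)
print(pages_per_swap)
```

Taking several Monitor II samples during the peak period and pooling the rows before averaging would presumably give a steadier value than a single snapshot.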
DISTRIBUTE LOCAL PAGE-INS/SEC
BETWEEN TSO AND BATCH VIA
PERCENTAGES IN RMF MON III

TOTAL LOCAL PAGE-INS/SEC = 35.12

                  -- % DELAYED FOR --
JOBNAME    DMN    LOCL    VIO
LAURIE     3      5       0
WAYNE      3      5       0
STEVE      4      4       1
job does 30 page-ins. From the RMF Monitor I WORKLOAD ACTIVITY REPORT, the weighted-average elapsed time across
all batch domains is 159 seconds while the weighted-average non-trivial TSO transaction lasts about 12.8
seconds. If a batch job does 30 page-ins in 159 seconds, then we will assume the average non-trivial TSO
transaction will do (12.8/159)*30 or 2.4 page-ins in 12.8 seconds.

Once the page-ins per non-trivial transaction has been estimated, the page-ins per trivial transaction can be
calculated. Non-trivial TSO completes 1.51 transactions per second. If each one does 2.4 page-ins, then
non-trivial TSO accounts for 1.51*2.4=3.63 page-ins per second. Since TSO was assigned 33.05 page-ins per second,

field along with all the swap page-outs. These local page-outs can be isolated by subtracting the swap page-in
rate (126.06 from the example report given earlier) from the swap page-out rate (128.13 from the report below).
The total local page-out rate of the system is 33.10+2.07=35.17 pages per second. The page-outs per
transaction can be found by calculating the ratio of page-outs to page-ins (35.17/35.12=1.001) and multiplying
the page-ins per transaction by this ratio. The ratio will usually be very close to 1 since to page-in a page,
it must have been written out sometime in the past.

PAGING ACTIVITY
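
The page-in scaling and the page-out ratio described above can be sketched in Python as follows. The variable names are illustrative assumptions and the inputs are the figures quoted in the text; the sketch stops where the surviving text stops and does not attempt the trivial TSO calculation.

```python
# Elapsed-time scaling of page-ins from batch to non-trivial TSO.
BATCH_PAGE_INS_PER_JOB = 30      # page-ins per batch job
BATCH_ELAPSED_SECONDS = 159.0    # weighted-average batch elapsed time
NONTRIV_ELAPSED_SECONDS = 12.8   # weighted-average non-trivial TSO transaction time

nontriv_page_ins = (NONTRIV_ELAPSED_SECONDS / BATCH_ELAPSED_SECONDS) * BATCH_PAGE_INS_PER_JOB

# Page-ins per second attributable to non-trivial TSO.
NONTRIV_TRANS_PER_SEC = 1.51
nontriv_page_ins_per_sec = NONTRIV_TRANS_PER_SEC * nontriv_page_ins

# Local page-outs hidden among the swap page-outs, then the system-wide local page-out rate.
SWAP_PAGE_OUT_RATE = 128.13
SWAP_PAGE_IN_RATE = 126.06
hidden_local_page_outs = SWAP_PAGE_OUT_RATE - SWAP_PAGE_IN_RATE
local_page_out_rate = 33.10 + hidden_local_page_outs

# Ratio of page-outs to page-ins, applied to the per-transaction page-ins.
LOCAL_PAGE_IN_RATE = 35.12
out_in_ratio = local_page_out_rate / LOCAL_PAGE_IN_RATE
nontriv_page_outs = nontriv_page_ins * out_in_ratio

print(f"page-ins per non-trivial transaction: {nontriv_page_ins:.1f}")
print(f"non-trivial TSO page-ins per second: {nontriv_page_ins_per_sec:.2f}")
print(f"local page-out rate: {local_page_out_rate:.2f} per second")
print(f"page-out to page-in ratio: {out_in_ratio:.3f}")
print(f"page-outs per non-trivial transaction: {nontriv_page_outs:.1f}")
```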
transactions do not cause VIO paging. Therefore, all TSO
VIO paging is charged to non-trivial transactions.
VIO PAGE-IN DELAY TIME
= SUM STOR,VIO / VIO PINS/SEC
VIO PAGE-OUTS PER TRANSACTION
SWAP-IN DELAY TIME
RMF MONITOR I PAGING ACTIVITY
[Figure: paging delay time curves; vertical axis: MS DELAY]
S = SWAP-IN DELAY
P = PAGE-IN DELAY
V = VIO PAGE-IN DELAY

Database I/O delay time curves can be generated using only RMF Monitor I data. An average response time
weighted by activity rate can be calculated from the DIRECT ACCESS DEVICE ACTIVITY REPORT. Plotting this time
against the total database I/O rate produces the curve. The delay time curves allow the resource requirements of
the workload components to be placed "in-context" with the ability of the system to deliver the resource in a
responsive fashion. The sensitivities of a component to fluctuations in load can be understood.

CPU QUEUING

execution will "encounter" a CPU utilization composed of higher AND equal priority work; lower priority work will
not interfere. From the IPS for the example workload, the relative dispatching priorities placed the workload
components in the following order: trivial TSO, non-trivial TSO, batch. Therefore, the priority scheme for
the example workload would be: 1) all "other" CPU time across all workload components; 2) TCB+SRB time for
trivial TSO; 3) TCB+SRB time for non-trivial TSO; 4) TCB+SRB time for batch.

PRIORITY SCHEME
1) ANYBODY'S "OTHER" CPU TIME
2) TRIVIAL'S TCB+SRB TIME
3) NON-TRIVIAL'S TCB+SRB TIME
4) BATCH'S TCB+SRB TIME

                TIME     %CPU   SERVER%
OTHER           2379.5   33.0   33.0
TRIV T+S        335.8    4.7    37.7
NON-TRIV T+S    1337.7   18.6   56.3
BATCH T+S       971.4    13.5   69.8

SERV TIME (ST) = CPU TIME / TRANS
QUEUE TIME = F(#SERVERS,SERVER%) * ST
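
The paper does not say which multiserver formula stands behind F(#SERVERS,SERVER%). As one possible reading, the sketch below assumes the standard M/M/c (Erlang C) result, where mean queue time equals the probability of queuing divided by c * (1 - utilization), times the service time. The function names are illustrative, and the choice of Erlang C is an assumption rather than the author's stated method.

```python
from math import factorial


def erlang_c(servers, utilization):
    """Probability that arriving work has to queue in an M/M/c system."""
    offered_load = servers * utilization
    top = offered_load ** servers / factorial(servers) / (1 - utilization)
    bottom = sum(offered_load ** k / factorial(k) for k in range(servers)) + top
    return top / bottom


def queue_factor(servers, utilization):
    """Assumed form of F(#SERVERS, SERVER%): mean queue time per unit of service time."""
    return erlang_c(servers, utilization) / (servers * (1 - utilization))


# Trivial TSO TCB+SRB work queues behind "other" plus its own priority level:
# 4 CPUs, cumulative server utilization 37.7%, service time .0098 seconds.
SERVERS = 4
trivial_queue_time = queue_factor(SERVERS, 0.377) * 0.0098

# The same service time at batch's cumulative utilization (69.8%), to show the sensitivity.
queue_time_at_batch_level = queue_factor(SERVERS, 0.698) * 0.0098

print(f"trivial TSO CPU queue time: {trivial_queue_time:.5f} s")
print(f"same service time at 69.8% utilization: {queue_time_at_batch_level:.5f} s")
```

Evaluating `queue_factor` over a range of utilizations gives a delay curve for each priority level, which is one way to see how sensitive a component is to growth in the work at or above its priority.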
ANALYTIC MULTISERVER
QUEUING FORMULA

APPROXIMATE PRIORITY
DISPATCHING BY QUEUING FOR
SERVER UTILIZATION FROM
WORK >= PRIORITY OF WORKLOAD
COMPONENT.

A multiserver queuing formula can be obtained from any queuing theory text. In this manner, the sensitivities
of the CPU requirements of each component and their relationships to each other can be understood.

SUMMARY
Find a CMG regional meeting near you at www.cmg.org/regions