0% found this document useful (0 votes)

161 views24 pages

Db2mon Education Internal

Uploaded by

kanmanda1

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

161 views24 pages

Db2mon Education Internal

Uploaded by

kanmanda1

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 24

db2mon

Kostas Rakopoulos
kostasr@ca.ibm.com

May 2020
Why did we create another monitoring tool?
●
Table functions are the recommended approach to monitoring (v9.7+)
●
Snapshots are deprecated
– New metrics are only being added to table functions
●
Table functions mean we need to write SQL queries to view metrics
– Some built-in options like MONREPORT.DBSUMMARY – but not enough
●
Most metrics are cumulative (since database activation)
– No way reset?!
– Need a way to view metrics for a given period of time
●
DSM uses table functions. Why not use DSM?
– It’s not practical for command line (scripts) usage or debugging customer issues remotely

2
Before db2mon there was psmon
●
New pureScale metrics in v9.8 were only available through table functions
– Some in db2pd but it was messy
●
Created psmon to report table function metrics specific to debugging pS
performance problems
●
Sufficient overlap between debugging pS and non-pS OLTP performance
problems
– psmon started to gain traction for analyzing OLTP issues in general
●
Eventually more metrics were added for debugging performance issues beyond
just OLTP
– Renamed psmon to db2mon to reflect this

3
How does it work?
●
Set of reporting queries that run on delta tables
●
Delta tables?
– Remember: No way to reset table functions stats
– We take before & after samples and store the delta in tables
●
Online mode (runs on Db2 server being analyzed)
– Use DGTTs to store before, after and delta metrics
– Report queries are run against DGTTs
●
Offline mode (only EXPORT command is run on Db2 server being analyzed)
– Metrics exported to IXF files
– On another system: Run scripts to IMPORT, compute deltas and run report queries

4
Design
●
Goals
– Lightweight – cannot impact performance of system being monitored
– Clean – do not create objects or otherwise pollute the system being monitored
– Backward compatibility – work with all supported versions of Db2
– Flexible – easily modifiable (such as adding new reporting queries)
●
Perl script which generates SQL to collect data and create a report
– User specifies Db2 version and generated SQL is compatible for that version
– Also generates script for simple online mode execution (db2mon.sh)
●
Creates (and removes) buffer pool and temp tablespace for DGTTs to isolate db2mon SQL
from other activities

5
Use Cases
●
Understand a customer’s workload/environment
– Saves you from having to ask basic questions about
●
Application: read/write ratio, types of statements, number of connections
●
Instance: topology, db2set, dbm cfg, db cfg
●
System: OS type, CPUs, memory
●
Find system bottlenecks
●
Find opportunities to tune the system (e.g., during a health check)
●
As part of a customer’s monitoring infrastructure
– Identify “What changed?” with before & after db2mon data

6
List of Reports - Lots of data!
Point in Time (SQL, Lock waits, Utilities) Table stats pureScale stats
START#EXSQL: Currently executing SQL at start of capture (non-zero metrics only TBL#ROWMC: Various table level metrics CF#GBPIO: Group bufferpool IO statistics by tablespace
START#LOCKW: Current lock waits at start of capture TBL#DATSH: Data sharing metrics CF#GBPHR: Group bufferpool data and index hit ratios
START#EXUTL: Currently executing utilities at start of capture CF#GBPIV: Group bufferpool invalid page statistics
END#EXSQL: Currently executing SQL at end of capture (non-zero metrics only
END#LOCKW: Current lock waits at end of capture
DB#SIZE: Size of database
TSP#SIZE: Tablespace properties
Tbsp stats CF#GBPDP: Tablespace data page prefetching statistics for group bufferpool
CF#GBPIP: Tablespace index page prefetching statistics for group bufferpool
END#EXUTL: Currently executing utilities at end of capture TSP#USAGE: Tablespace usage over monitoring interval CF#GBPFL: Count of group bufferpool full conditions
BPL#STATS: Bufferpool statistics by tablespace PAG#RCM: Page reclaim metrics for index and data pages
DB#THRUP: Throughput metrics at database level TSP#PRFST: Tablespace prefetching statistics PAG#RCSMP: Page reclaim metrics for SMP pages
DB#CLACT: Client activity (active connections have at least 1 stmt/s) TSP#BPMAP: Tablespace to bufferpool mapping CF#RTTIM: Round-trip CF command execution counts and average response times
DB#TIMEB: Time breakdown at database level (wait + processing) CF#CMDCT: Aggregate CF command execution counts
DB#WAITT: Wait times at database level BPL#SIZES: Bufferpool sizes
DB#PROCT: Processing times at database level DB level BP stats
BPL#HITRA: Bufferpool data and index hit ratios
CF#CMDTM: CF-side command execution counts and average response times
CF#CMDTO: CF-side total command execution counts
DB#SORT: Sort metrics at database level BPL#READS: Bufferpool read statistics (overall) CF#SYSRE: CF system resource information
SQL#TOPEXECT: Top SQL statements by execution time SQL stats BPL#RDSYNC: Bufferpool read statistics (synchronous reads)
BPL#RDASYNC: Bufferpool read statistics (asynchronous reads)
CF#SIZE: CF structure size information
SQL#TOPEXECP: Top SQL statements by execution time, aggregated by PLANID BPL#WRITE: Bufferpool write statistics (overall) BLU#PAGDI: Partial early aggregation / distincts
PKG#EXECT: Time spent executing by package BPL#WRSYNC: Bufferpool write statistics (synchronous writes)
SQL#TOPWAITT: Wait time breakdown for top SQL statements by execution time
SQL#TOPWAITW: Top SQL statements by time spent waiting
BPL#WRASYNC: Bufferpool write statistics (asynchronous writes) BLU: PAE/Distincts
SQL#TOPIOSTA: IO statistics per stmt - top statements by execution time
SQL#TOPROWS: Row level statistics per stmt - top statements by execution time
CON#WAITT: Wait times at connection level
CON#STATS: Various metrics at connection level
Connection level
SQL#TOPSORT: Sort statistics per stmt - top statements by execution time CON#PAGRW: Physical and logical page reads and writes at connection level
INF#EXPLN: Statement & plan identifiers - top statements by execution time

DB#SYSRE: Database system resource usage information OS, CPUs,

WLB#SLIST: Workload balancing server list
Workload Balancing
CFG#REGVA: DB2 registry variable settings
DB#LOGWR: Database log write times Memory
DB#LOGRE: Database log read times Logging
CFG#DB: Database configuration settings
CFG#DBM: Database manager configuration settings Config (db2set, db, dbm)
DB#LOGST: Other database log statistics INS#INFO: Instance information
Tbsp I/O
TSP#DSKIO: Disk read and write I/O times
TSP#DSKIOSYNC: Disk read and write I/O times (synchronous)
DB#MEMST: Database memory set information @ end
DB#MEMPL: Memory pool information @ end
Memory
TSP#DSKIOASYNC: Disk read and write I/O times (asynchronous)
DB#SEQIN: Sequences information
DB#EXTBM: External table metrics External Table Sequences
Index stats coming soon
LTC#WAITT: Latch wait metrics Latches
DB#DLCKS: Deadlocks, lock timeouts and lock escalations
7
Deadlocks
Must Read Reports
●
There are lot of reports and we don’t usually read them all
●
But there are some reports that we should always look at
●
Must read reports
– Database Throughput
– Client Activity
– Database Time Breakdown
– Database Wait Time Breakdown
– Top SQL Statements by Execution Time
– SQL Statement Wait Time Breakdown

8
Database Throughput

Monitor interval Read/Write Ratio

in seconds Includes
temporary tables
SEL_P_S / (UID_P_S + SEL_P_S)

================================================
DB#THRUP: Throughput metrics at database level
================================================

TS_DELTA MEMBER ACT_PER_S CMT_PER_S RB_PER_S DDLCK_PER_S SEL_P_S UID_P_S ROWS_INS_P_S ROWS_UPD_P_S ROWS_RET_P_S ROWS_MOD_P_S PKG_CACHE_INS_P_S P_RD_PER_S
----------- ------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ----------------- ------------
33 0 148589.4 24633.7 24.3 0.0 72455.2 25277.4 15663.8 19337.9 123311.8 80166.6 1.0 10148.6

Unless ACT_PER_S / Compare with Unless RB_PER_S / Unless < 10

CMT_PER_S >= 5 application CMT_PER_S << 1
throughput claims
Too frequent commits Too many query compiles?
High rollback frequency
Auto-commit enabled?

9
Client Activity
Connection Ratio of request
considered active if it time vs. client wait
executes at least 1 for active clients
statement per second

=======================================================================
DB#CLACT: Client activity (active connections have at least 1 stmt/s)
=======================================================================

MEMBER TOTAL_CLIENTS TOTAL_CIWT TOTAL_RQST TOT_CIWT_RQ_RATIO ACTIVE_CLIENTS ACTIVE_RQ_PER_S ACTIVE_CIWT ACTIVE_RQST ACTIVE_CIWT_RQ_RATIO
------ ------------- -------------------- -------------------- ----------------- -------------- --------------- -------------------- -------------------- --------------------
0 268 1641303 7196998 0.22 267 45697.36 1608219 7196998 0.22

Prefer to look at
Ideally close to zero
active clients as
many connections
Investigate if > 10
may just be idle

10
Database Time Breakdown

Ideally 100% in Times reported as

section and/or percentages (PCT_*)
column so easier to interpret
processing

================================================================
DB#TIMEB: Time breakdown at database level (wait + processing)
================================================================

MEMBER TOTAL_RQST_TM PCT_COMPILE PCT_SECTION PCT_SORT PCT_COL PCT_COL_SYNOP PCT_COMMIT PCT_RBACK PCT_CONN PCT_RTN_USR_CODE PCT_BACKUP PCT_IDX_BLD PCT_RUNSTATS PCT_REORG PCT_LOAD
------ ------------- ----------- ----------- -------- ------- ------------- ---------- --------- -------- ---------------- ---------- ----------- ------------ --------- --------
0 7193923 0.00 80.78 0.41 0.00 0.00 17.18 0.00 0.00 1.10 0.00 0.00 0.00 0.00 0.00

Unless < 10 Unless < 30

Too many query compiles Investigate further

11
Database Wait Time Breakdown

Times reported as Very useful high level wait Remember these metrics are
Look at
percentages (PCT_*) percentages for latching, locking, aggregated at the database level
PCT_RQST_WAIT first
so easier to interpret buffer pool physical read, LOB meaning individual statements may
reads/write, etc. be less or more severely impacted

========================================
DB#WAITT: Wait times at database level
========================================

MEMBER TOTAL_RQST_TM TOTAL_WAIT_TM PCT_RQST_WAIT PCT_LOCK PCT_GLB_LOCK PCT_LTCH PCT_LG_DSK PCT_LG_BUF PCT_RCLM PCT_CF PCT_PFTCH PCT_DIAG PCT_POOL_R PCT_DIR_R PCT_DIR_W PCT_FCM PCT_TCPIP
------ ------------- ------------- ------------- -------- ------------ -------- ---------- ---------- -------- ------- --------- -------- ---------- --------- --------- ------- ---------
0 272833 265769 97.41 0.00 0.00 0.35 0.04 0.00 0.00 0.00 0.00 2.05 0.00 0.03 0.02 93.11 0.03
1 7193923 2107537 29.29 0.24 0.00 0.37 16.49 0.00 0.00 0.00 0.00 0.00 11.74 0.18 0.00 0.00 0.23

For member 1, logging and buffer pool

For wait time hierarchy see Db2 reads are contributing to nearly 30% wait High PCT_FCM is common in
docs section: combined environments with a lot of TQ operators
(e.g., DPF, BLU, intra parallel)
"Time-spent monitor element Based on this would want to check log write
hierarchy" times and statements with high pool read So while this 93% wait in FCM seems
times bad, it is normally only a symptom of
some other latency in processing and
But neither one seems to be a major not a problem itself 12
bottleneck at this time
Top SQL Statements by Execution Time
Why COORD exec time? Why not just plain exec time?
Sorted by total coord Other reports sort by wait time so you can easily
exec time find statements dominated by wait time Because of DPF/MPP environments where the work is done via a
coord agent. With plain exec time we don’t get the total query
execution time (as seen by the coord agent).

====================================================
SQL#TOPEXECT: Top SQL statements by execution time
====================================================

MEMBER NUM_EXEC COORD_STMT_EXEC_TIME AVG_COORD_EXEC_TIME PCT_COORD_STMT_EXEC_TIME TOTAL_CPU_TIME AVG_CPU_TIME PCT_WAIT_TIME AVG_SECT_TIME AVG_COL_TIME STMT_TEXT
------ ----------- -------------------- ------------------- ------------------------ -------------------- -------------------- ------------- ------------- ------------ -----------------------------------
0 40134 3535865 88.10 28.56 3149789 78 3.19 0.03 0.00 CALL BROKERVOLUME_F1(?, ?, ?, ?, ?)
0 40139 3529657 87.93 28.51 3526253901 87851 0.02 87.93 0.00 SELECT ARRAY_AGG(B_NAME), ARRAY_AGG(
0 147325 632104 4.29 5.10 69830522 473 0.58 0.01 0.00 CALL MARKETWATCH_F1(?, ?, ?, ?, ?, ?
0 88301 570747 6.46 4.61 575044790 6512 0.00 6.46 0.00 SET :HV00014 :HI00014 = ( SELECT 100
0 78118 538322 6.89 4.34 3502808 44 97.87 0.01 0.00 CALL TRADERESULT_F6(?, ?, ?, ?, ?
0 155401 428013 2.75 3.45 9905143 63 0.25 0.04 0.00 CALL TRADESTATUS_F1(?, ?, ?, ?, ?
0 155409 402792 2.59 3.25 391322528 2518 2.81 2.59 0.00 SELECT ARRAY_AGG(T_ID), ARRAY_AGG
0 19683 258882 13.15 2.09 1492058 75 0.11 0.03 0.00 CALL TRADELOOKUP_F2(?, ?, ?, ?, ?, ?
0 19429 207988 10.70 1.68 1489319 76 0.15 0.04 0.00 CALL TRADELOOKUP_F3(?, ?, ?, ?, ?, ?
0 19682 205605 10.44 1.66 14673734 745 94.66 10.44 0.00 SELECT ARRAY_AGG(T_ID), ARRAY_AGG
0 19422 169865 8.74 1.37 16100432 828 92.28 8.74 0.00 SELECT ARRAY_AGG(T_ID), ARRAY_AGG
0 5428 112541 20.73 0.90 794506 146 97.23 0.04 0.00 CALL TRADEUPDATE_F2(?, ?, ?, ?, ?
0 19671 105273 5.35 0.85 1328777 67 0.08 0.04 0.00 CALL TRADELOOKUP_F1(?, ?, ?, ?, ?

Percent of total
Average coord CPU time in
coord exec time See which statements have
exec time microseconds
(across all queries) high percent (>30) wait time

13
SQL Statement Wait Time Breakdown
EXECUTABLE_ID for each
Look at PCT_WAIT
statement is available in later
first
sections of the db2mon report. Useful
for collecting access plan or
retrieving full statement text

============================================================================
SQL#TOPWAITT: Wait time breakdown for top SQL statements by execution time
============================================================================

MEMBER PCT_WAIT PCT_LG_DSK PCT_LG_BUF PCT_LOCK PCT_GLB_LOCK PCT_LTCH PCT_RCLM PCT_CF PCT_PFTCH PCT_DIAG PCT_POOL_R PCT_DIR_R PCT_DIR_W PCT_FCM STMT_TEXT
------ -------- ---------- ---------- -------- ------------ -------- -------- ------- --------- -------- ---------- --------- --------- ------- --------------------------------------------------------------
0 3.19 0.00 0.00 0.00 0.00 3.21 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 CALL BROKERVOLUME_F1(?, ?, ?, ?, ?)
0 0.02 0.00 0.00 0.00 0.00 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 SELECT ARRAY_AGG(B_NAME), ARRAY_AGG(CAST(VOLUME AS DECI
0 0.58 0.00 0.00 0.00 0.00 0.58 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 CALL MARKETWATCH_F1(?, ?, ?, ?, ?, ?, ?)
0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 SET :HV00014 :HI00014 = ( SELECT 100 * ( ( CAST
0 97.87 96.14 0.00 0.00 0.00 1.73 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 CALL TRADERESULT_F6(?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
0 0.25 0.00 0.00 0.00 0.00 0.25 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 CALL TRADESTATUS_F1(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,
0 2.81 0.00 0.00 0.00 0.00 0.33 0.00 0.00 0.00 0.00 2.47 0.00 0.00 0.00 SELECT ARRAY_AGG(T_ID), ARRAY_AGG(T_DTS), ARRAY_AGG(ST_NAME),
0 0.11 0.00 0.00 0.00 0.00 0.07 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 CALL TRADELOOKUP_F2(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,
0 0.15 0.00 0.00 0.00 0.00 0.11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 CALL TRADELOOKUP_F3(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,
0 94.66 0.00 0.00 0.00 0.00 0.02 0.00 0.00 0.00 0.00 94.63 0.00 0.00 0.00 SELECT ARRAY_AGG(T_ID), ARRAY_AGG(T_BID_PRICE), ARRAY_AGG
0 92.28 0.00 0.00 0.00 0.00 0.02 0.00 0.00 0.00 0.00 92.25 0.00 0.00 0.00 SELECT ARRAY_AGG(T_ID), ARRAY_AGG(T_CA_ID), A
0 97.23 88.90 0.00 0.00 0.00 8.33 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 CALL TRADEUPDATE_F2(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,
0 0.08 0.00 0.00 0.00 0.00 0.08 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 CALL TRADELOOKUP_F1(?, ?, ?, ?, ?, ?, ?, ?,

Log disk Pool read What Next?

PCT_WAIT > 30 dominates wait dominates wait
time for this time for this PCT_LG_DISK: Check log write times (in db2mon)
Typically worth investigating query query
PCT_POOL_R: Check pool read times, pool hit ratios
and access plan (in db2mon except for access plan) 14
Find Common Bottlenecks
Bottleneck DB, SQL and Connection Level Things to Check Possible Next Steps

PCT_LG_DISK Log write time in DB#LOGWR Investigate log disk file system and storage
Slow Log Writes subsystem
PCT_LTCH Hot latches in LTC#WAITT db2latchtrace, stacks, check access plan
Latch Contention Are particular statements affected? Bad plan?

Disk I/O PCT_POOL_R BP sizes in BPL#SIZES Increase BP size, inline LOBs, check access
(Buffer pool, PCT_DIR_R BP hit ratios in BPL#HITRA plan, check underlying storage
LOBs) PCT_DIR_W Are particular statements affected? Bad plan?

Expensive AVG_COORD_EXEC_TIME Rows read / rows returned ratio in Check access plan, CDE perf trace (if BLU),
Queries (High PCT_COORD_STMT_EXEC_TIME SQL#TOPROWS stacks, check for spilling
CPU, Sorting, AVG_CPU_TIME Sort stats in SQL#TOPSORT
Spilling) PCT_WAIT_TIME (could be low!) High average logical reads in SQL#TOPIOSTA

PCT_LOCK Point in time lock waits in START#LOCKW and Tune application (e.g., isolation level)
Lock Contention END#LOCKW Check access plan
PCT_RCLM Find what tables (or indexes) are causing reclaims Enable append mode, tune index, table
Page Reclaims High PCT_LTCH with non-zero from PAG#RCM and PAG#RCSMP partitioning, EHL
PCT_RCLM
PCT_CF CF round trip command times in CF#RTTIM Increase CF workers
CF side command times in CF#CMDTM Reduce CF traffic (EHL)
CF Wait Need RDMA if using TCPIP transport

15
Find Common Bottlenecks (Examples)
========================================
Wait times at database level - percents
======================================== What’s “high” for a log
MEMBER TOTAL_RQST_TIME_MS TOTAL_WAIT_TIME_MS RQST_WAIT LOCK GLB_LOCK LATCH LOG_DSK RECLAIM CF POOL_R DIR_R DIR_W write time?
FCM TCPIP DIAG
------ ------------------ ------------------ --------- ------- -------- ------- ------- ------- ------- ------- ------- ------- ------- ------- -------
0 1171250 818956 69.92 1.23 0.85 10.24 36.63 7.30 2.98 0.33 2.35 6.66 0.63 1.37 0.00
1 1178600 1121127 95.12 1.09 0.36 16.66 46.83 5.72 9.16 0.34 3.17 8.75 Higher than 1
1.06 2.21 0.00

Slow log writes . . .

==========================
Database log write times
millisecond can be an
issue but it depends.
==========================
If you don’t see a high
MEMBER NUM_LOG_WRITE_IO LOG_WRITE_IO_PER_S LOG_WRITE_MB_PER_S LOG_WRITE_TIME percent wait for log disk
LOG_WRITE_TIME_PER_IO_MS NUM_LOG_BUFFER_FULL

0 64975 214.4389 0.8349 179470 5.7621

I/O then it might not be
------ -------------------- ------------------ ------------------ -------------------- ------------------------ --------------------
0
1 59409 196.0693 0.7656 191576 6.2246 your main bottleneck.
0

Latch contention
========================================
What next?
DB#WAITT: Wait times at database level
========================================
Based on latch name of top few
latches you might be able to
MEMBER TOTAL_RQST_TM TOTAL_WAIT_TM PCT_RQST_WAIT PCT_LOCK PCT_GLB_LOCK PCT_LTCH PCT_LG_DSK PCT_LG_BUF PCT_RCLM PCT_CF PCT_PFTCH PCT_DIAG PCT_POOL_R PCT_DIR_R PCT_DIR_W PCT_FCM PCT_TCPIP
------ ------------- ------------- ------------- -------- ------------ -------- ---------- ---------- -------- ------- --------- -------- ---------- --------- --------- ------- ---------
0 4850728 4598430 94.79 8.36 0.00 84.17 1.00 0.00 0.00 0.00 0.00 0.00 1.17 0.00 guess but if not, get stacks. Might
0.00 0.00 0.06

. . . also help to look at what

===============================
statements have high latch wait.
LTC#WAITT: Latch wait metrics
===============================
In this case (memlatch) we need
MEMBER
------
LATCH_NAME TOT_EXT_LATCH_WAIT_TIME_MS TOT_EXT_LATCH_WAITS TIME_PER_LATCH_WAIT_MS
------------------------------------------------------------ -------------------------- -------------------- ----------------------
to get stacks since it could be
0 SQLO_LT_SMemPool__MemLatchType__latch 3069940 519880 5.90 anything.
0 SQLO_LT_SQLB_BPD__bpdLatch_SX 65870 209 315.16
0 SQLO_LT_SMemSet__MemLatchType__latch 27250 25702 1.06
0 SQLO_LT_SQLB_DIRTY_LIST_SET__walkLatch 1860 424 4.38
0 SQLO_LT_SQLB_DIRTY_LIST_SET__appendLatch 367 63 5.82 16
0 SQLO_LT_stmmSort__statsLatch 336 112 3.00
0 SQLO_LT_SQLP_LHSH__hshlatch 118 3607 0.03
Find Common Bottlenecks (More examples)
Slow disk reads PCT_POOL_R >> 30 Why high PCT_LTCH?

It’s a symptom of the slow disk reads. We must

============================================================================
SQL#TOPWAITT: Wait time breakdown for top SQL statements by execution time
hold a BPD latch while reading from disk.
============================================================================

MEMBER PCT_WAIT PCT_LG_DSK PCT_LG_BUF PCT_LOCK PCT_GLB_LOCK PCT_LTCH PCT_RCLM PCT_CF PCT_PFTCH PCT_DIAG PCT_POOL_R PCT_DIR_R PCT_DIR_W PCT_FCM STMT_TEXT
------ -------- ---------- ---------- -------- ------------ -------- -------- ------- --------- -------- ---------- --------- --------- ------- ----------------------------------------
0 99.95 0.00 0.00 0.00 0.00 99.54 0.00 0.00 0.00 0.00 0.45 0.00 0.00 0.00 Insert into NEW_ORDER values (3132, 6, 1
0 100.00 0.00 0.00 0.00 0.00 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Insert into ORDER_LINE values (3116, 6,
0 100.00 0.00 0.00 0.00 0.00 93.75 0.00 0.00 0.00 0.00 6.19 0.00 0.00 0.00 Insert into ORDER_LINE values (3136, 3,
0 100.00 0.00 0.00 0.00 0.00 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Select D_NAME, D_STREET_1, D_STREET_2, D
0 99.66 0.00 0.00 0.00 0.00 74.91 0.00 0.00 0.00 0.00 25.08 0.00 0.00 0.00 Insert into ORDERS values (3136, 1007, 3
0 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0.00 0.00 0.00 Select C_LAST, C_CREDIT, C_DISCOUNT, C_B
0 99.22 0.00 0.00 0.00 0.00 77.51 0.00 0.00 0.00 0.00 21.70 0.00 0.00 0.00 Insert into ORDER_LINE values (3116, 6,
0 99.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 99.02 0.00 0.00 0.00 Select S_QUANTITY, S_DIST_01, S_DIST_02,
0 98.90 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 98.90 0.00 0.00 0.00 Update ORDER_LINE set OL_DELIVERY_D = '2
0 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0.00 0.00 0.00 Select S_QUANTITY, S_DIST_01, S_DIST_02,
0 98.64 0.00 0.00 0.00 0.00 98.64 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Insert into ORDER_LINE values (3122, 3,
0 98.55 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 98.55 0.00 0.00 0.00 Insert into ORDERS values (3132, 503, 6,

What’s “high” for a disk read time? What next?

==========================================
TSP#DSKIO: Disk read and write I/O times
==========================================
Under 5 ms would be ideal Check the buffer pool size.
MEMBER
------
TBSP_NAME NUM_READS AVG_READ_TIME DIRECT_READ_REQS 15 ms or higher is problematic
AVG_DRCT_READ_TIME NUM_WRITES AVG_WRITE_TIME DIRECT_WRITE_REQS AVG_DRCT_WRITE_TIME
-------------------- -------------------- ------------- -------------------- ------------------ -------------------- -------------- -------------------- -------------------
0 TBS_STKX 2612 23.92 0 - 0 - Check the file system and disk
0 -
0
0
TBS_STK
TEMPSPACE1
1878
0
23.88
-
0 -
But it depends – how large is the wait time
348 0.00
434
2
85.70
0.00
subsystem for issues. Check things
0
348
-
0.00
0
0
TBS_SML
TBS_NOX
439
204
11.79
25.32 relative to statement execution time?
0
0
-
-
29
151
69.17
50.55
like disk cache and queue depth.
0
0
-
-
0 TBS_OLX 274 29.20 0 - 66 55.69 0 -
0
0
TBS_OX
TBS_CSTX
220
265
26.15
36.94
0 -
In this case it’s pretty severe (> 99%)
0 -
93
0
114.00
- Determine what the disk
0
0
-
-
0
0
TBS_CST
TBS_OL
137
88
42.78
52.50
0
0
-
-
114
113
50.51
71.78
requirements for the workload are
0
0
-
-
and compare that to what the disk 17
subsystem is capable of.
Find Common Bottlenecks (More examples)
PCT_COORD_STMT_EXEC_TIME > 30

Expensive queries Top CPU consumer (TOTAL_CPU_TIME)

====================================================
AVG_COORD_EXEC_TIME / (AVG_CPU_TIME / 1000) close to 1
SQL#TOPEXECT: Top SQL statements by execution time which means this query is CPU bound
====================================================

===============================================================================
SQL#TOPROWS: Row level statistics per stmt - top statements by execution time
===============================================================================

MEMBER NUM_EXEC AVG_ROWS_MOD AVG_ROWS_READ AVG_ROWS_RET COL_SYNOPSIS_ROWS_INSERTED STMT_TEXT

------ ----------- -------------- -------------- -------------- -------------------------- ----------------------------------------------------------------------------------------------
0 10622 0.0 100000.0 1.0 0 Select I_NAME, I_PRICE, I_DATA from ITEM where I_ID = ?
0 10623 0.0 1.0 1.0 0 Select S_QUANTITY, S_DIST_01, S_DIST_02, S_DIST_03, S_DIST_04, S_DIST_05, S_DIST_06, S_DIST_07
0 99 0.0 201.2 1.0 0 Select Count(Distinct S_I_ID) from STOCK, ORDER_LINE where (S_W_ID = ? and S_QUANTITY < ? and
0 693 0.0 3000.0 3.4 0 Select C_ID, C_FIRST from CUSTOMER where (C_W_ID = ? and C_D_ID = ? and C_LAST = ?) ORDER BY

What Next?
AVG_ROWS_READ / AVG_ROWS_RET >> 1
AVG_D_LRD >> AVG_I_LRD
AVG_I_LRD is zero
=========================================================================
SQL#TOPIOSTA: IO statistics per stmt - top statements by execution time
=========================================================================
Probably no index! Check the plan
MEMBER NUM_EXEC AVG_D_LRD AVG_D_PRD AVG_I_LRD AVG_I_PRD AVG_TD_PRD AVG_TI_PRD AVG_COL_LRD AVG_COL_PRD AVG_DIR_R_RQS AVG_DIR_W_RQS STMT_TEXT
------ ----------- ------------------ ------------ ------------------ ------------ ------------ ------------ ------------------ ------------ ------------- ------------- ----------------
0 10622 2212.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 Select I_NAME,
0 10623 1.8 0.8 4.1 0.1 0.0 0.0 0.0 0.0 0.0 0.0 Select S_QUANTITY 18
0 99 286.6 85.4 831.2 17.5 0.0 0.0 0.0 0.0 0.0 0.0 Select Count(Dist
0 693 3000.0 0.0 25.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 Select C_ID,
A Word on Access Plans
●
Sometimes an application can be configured or do things such that the plans
the application gets do not match what you get via CLP
– Rare, but when it happens it can be very annoying and difficult to debug
●
The EXPLAIN_FROM_SECTION stored procedure is the most accurate way
to retrieve access plans (will be exactly what the application got)
●
Uses the EXECUTABLE_ID of the statement (from
MON_GET_PKG_CACHE_STMT)
– Need statement to still be in package cache
●
db2mon provides a report (INF#EXPLN) with the EXECUTABLE_IDs for the
top SQL statements

19
Looking for a specific metric?
●
There are a lot of metrics reported by db2mon
– Too many to list
– And growing as we slowly fill in gaps where the tool is lacking
●
grep db2mon.pl for the metric name
●
If you don’t find it open an issue or pull request to have it
included

20
Pain Points
●
Long running queries
– Most metrics are not rolled up until the statement completes
– Need to collect MON_GET_PKG_CACHE_STMT after query completes
●
Sometimes the query never completes!
– Would need to rely on “in-flight” metrics from active SQL monitors
– Normally end up having to dump stacks
●
No metrics for BLU evaluators
●
No sort or join spill metrics
●
No WLM metrics

21
Requirements
●
All versions of Db2 LUW are supported starting from v9.7
●
User temp tablespace is required for DGTTs
– Created by db2mon.sh
●
Database monitoring switches must be BASE or EXTENDED
– MON_REQ_METRICS *****************************
db2mon tries to warn
about missing prereqs
Checking db2mon prerequisites
at the top of the report
– MON_ACT_METRICS *****************************

– MON_OBJ_METRICS db2mon Prerequisite

-------------------------
Status
-----------------------
User temp tablespace? OK (1 defined)
●
Perl is required to run db2mon.pl MON_REQ_METRICS correct?
MON_ACT_METRICS correct?
OK (currently BASE)
OK (currently BASE)
MON_OBJ_METRICS correct? OK (currently EXTENDED)
– This can be done on any machine
– Copy the generated files to the system to be monitored

22
Where can I find it?
●
Stable version part of the Db2 source tree starting in v11.1
●
Can find it on customer systems on sqllib/samples/perf
●
Experimental version on git
– https://github.ibm.com/DB2-Performance/db2mon
– Everyone is welcome to open issues and or submit PRs!
●
Documentation is available
– Db2 Knowledge Center
●
“Collecting and reporting performance monitor data with db2mon”
– In the git repo above

23
Introducing Db2 Performance Bootcamp
●
Designed to help people gain experience debugging performance
problems
– Through exercises that simulate common performance problems
●
Work in progress (only 3 exercises so far)
●
Creates a VM, installs Db2, builds a database and deploys a workload
●
Great place to practice with db2mon
●
https://github.ibm.com/DB2-Performance/bootcamp
– Feel free to open issues or contribute through pull requests

Handling Performance PMRs
No ratings yet
Handling Performance PMRs
40 pages
Farmers Query Monitor: IBM Software Group
No ratings yet
Farmers Query Monitor: IBM Software Group
36 pages
Advanced Monitoring
No ratings yet
Advanced Monitoring
41 pages
SQL Server Health Checklist
No ratings yet
SQL Server Health Checklist
26 pages
Db2Monitoring v1.0.2
No ratings yet
Db2Monitoring v1.0.2
42 pages
ISASBP Query Optimization 0611
No ratings yet
ISASBP Query Optimization 0611
52 pages
DB2 LUW 10 Star Schema and MCP
No ratings yet
DB2 LUW 10 Star Schema and MCP
46 pages
DB2 SQL Tuning Best Practices
No ratings yet
DB2 SQL Tuning Best Practices
22 pages
DB2 LUW V11 Certification Training - Part 2
No ratings yet
DB2 LUW V11 Certification Training - Part 2
30 pages
Best Practices DB2 BLU Acceleration Für SAP - Olaf Depper
No ratings yet
Best Practices DB2 BLU Acceleration Für SAP - Olaf Depper
67 pages
DB2 Basics for CS Students
No ratings yet
DB2 Basics for CS Students
31 pages
Performance Tuning Guide: DB2 UDB V7.1
No ratings yet
Performance Tuning Guide: DB2 UDB V7.1
418 pages
DB2 Course: Functions, SQL, and Data Management
No ratings yet
DB2 Course: Functions, SQL, and Data Management
137 pages
AU14C04-Codepages and DB2
No ratings yet
AU14C04-Codepages and DB2
33 pages
Database System Concepts Guide
No ratings yet
Database System Concepts Guide
45 pages
Optimize SQL with db2batch Tool
No ratings yet
Optimize SQL with db2batch Tool
13 pages
Practical DB2 UDB Performance: John Maenpaa
0% (1)
Practical DB2 UDB Performance: John Maenpaa
51 pages
DB2 SQL Optimization Guide
No ratings yet
DB2 SQL Optimization Guide
70 pages
DB2 Administration Guide Implementation
No ratings yet
DB2 Administration Guide Implementation
487 pages
Enterprise Database Security & Monitoring: Alfred Horng IBM Software Group
No ratings yet
Enterprise Database Security & Monitoring: Alfred Horng IBM Software Group
19 pages
Advanced SQL Tuning with IBM Data Studio
No ratings yet
Advanced SQL Tuning with IBM Data Studio
59 pages
SQL Update For Db2
No ratings yet
SQL Update For Db2
38 pages
DB2 Architecture
No ratings yet
DB2 Architecture
25 pages
DB2 Tech Talk PureData Systems Presentation PDF
No ratings yet
DB2 Tech Talk PureData Systems Presentation PDF
65 pages
Understanding Table Queues
No ratings yet
Understanding Table Queues
56 pages
DB2 Security
No ratings yet
DB2 Security
290 pages
Advanced Query Tuning Using IBM Data Studio
No ratings yet
Advanced Query Tuning Using IBM Data Studio
64 pages
DB2 9 Fundamentals Exam 730 Prep, Part 1
No ratings yet
DB2 9 Fundamentals Exam 730 Prep, Part 1
60 pages
10 - SQL - Tips Sheryl M Larsen
No ratings yet
10 - SQL - Tips Sheryl M Larsen
4 pages
04 - Plan Cache and Index Advice S4-4
100% (1)
04 - Plan Cache and Index Advice S4-4
63 pages
PoT IM 06 1 027 14-Workbook
No ratings yet
PoT IM 06 1 027 14-Workbook
217 pages
Ember Basics - Monitoring - Troubleshooting - Virtual
No ratings yet
Ember Basics - Monitoring - Troubleshooting - Virtual
59 pages
db2 Built-In Routines and Views 115
No ratings yet
db2 Built-In Routines and Views 115
1,670 pages
DB2BP System Performance 0813
No ratings yet
DB2BP System Performance 0813
80 pages
VLDB Management Challenges
No ratings yet
VLDB Management Challenges
6 pages
Academic Workshop Course Book
100% (1)
Academic Workshop Course Book
329 pages
Database Performance Tuning and Query Optimization: Discussion Focus
No ratings yet
Database Performance Tuning and Query Optimization: Discussion Focus
6 pages
Is CDC 1 Architecture
No ratings yet
Is CDC 1 Architecture
58 pages
Set Up Linux On IBM System Z For Production: Front Cover
No ratings yet
Set Up Linux On IBM System Z For Production: Front Cover
190 pages
SQL For The System Administrator
No ratings yet
SQL For The System Administrator
71 pages
Utilities
No ratings yet
Utilities
334 pages
db2 Dev Routines 115 PDF
No ratings yet
db2 Dev Routines 115 PDF
244 pages
Toad For DB2 Installation Guide
No ratings yet
Toad For DB2 Installation Guide
74 pages
DB2 Databases: Database Architecture
No ratings yet
DB2 Databases: Database Architecture
15 pages
Db2 Optimizer Cascon 2014
No ratings yet
Db2 Optimizer Cascon 2014
4 pages
DB2 11.1 New Features Highlight
67% (3)
DB2 11.1 New Features Highlight
64 pages
SQL Tips and Best Practices
0% (1)
SQL Tips and Best Practices
70 pages
Best Practice For DB2 On AIX 61-Sg247821
No ratings yet
Best Practice For DB2 On AIX 61-Sg247821
426 pages
What Version and Fixpack of DB2?
No ratings yet
What Version and Fixpack of DB2?
6 pages
PoT - Im.06.1.027.14 Presentation
No ratings yet
PoT - Im.06.1.027.14 Presentation
196 pages
DB2 - IBM's Relational DBMS
No ratings yet
DB2 - IBM's Relational DBMS
167 pages
DB2
No ratings yet
DB2
180 pages
Monitoring Your DB2 LUW Database With Just SQL
No ratings yet
Monitoring Your DB2 LUW Database With Just SQL
56 pages
AWR Analysis Part-1 PDF
100% (1)
AWR Analysis Part-1 PDF
24 pages
Handling Basic Oracle Database Issues
100% (1)
Handling Basic Oracle Database Issues
12 pages
Performance Tuning in Oracle 10g Feel The Power !: Kyle Hailey
No ratings yet
Performance Tuning in Oracle 10g Feel The Power !: Kyle Hailey
135 pages
Who Is Active v10 00
No ratings yet
Who Is Active v10 00
91 pages
Schema Hotsos
No ratings yet
Schema Hotsos
119 pages
Optimize Database Monitoring with ORATOP
No ratings yet
Optimize Database Monitoring with ORATOP
6 pages
Oracle 9I-Performance Tuning: (Rev1.1) - January, 2009 - by Ergem PEKER
No ratings yet
Oracle 9I-Performance Tuning: (Rev1.1) - January, 2009 - by Ergem PEKER
26 pages
Photo Scanner App Document
100% (1)
Photo Scanner App Document
99 pages
100 of 500
No ratings yet
100 of 500
3 pages
(Case Study) Whatsapp Forensics: Decrypt Encrypted Whatsapp Database Files With Salvationdata'S Free Forensic Tool
100% (1)
(Case Study) Whatsapp Forensics: Decrypt Encrypted Whatsapp Database Files With Salvationdata'S Free Forensic Tool
5 pages
Linux Admin Command Guide
No ratings yet
Linux Admin Command Guide
5 pages
Big Data & DWH-BI Architect Profile
No ratings yet
Big Data & DWH-BI Architect Profile
8 pages
The History and Future of Java Programming Language
No ratings yet
The History and Future of Java Programming Language
4 pages
ANSYS Macro Programming With The
No ratings yet
ANSYS Macro Programming With The
5 pages
Information Gathering and Processing in Retailing: Retail Management: A Strategic Approach
No ratings yet
Information Gathering and Processing in Retailing: Retail Management: A Strategic Approach
28 pages
Understanding Web Profiles
No ratings yet
Understanding Web Profiles
6 pages
CASE STUDY REPORT (Railway Management System)
No ratings yet
CASE STUDY REPORT (Railway Management System)
18 pages
Internship Report
No ratings yet
Internship Report
10 pages
Red Hat Advanced Cluster Security For Kubernetes
No ratings yet
Red Hat Advanced Cluster Security For Kubernetes
3 pages
IT350 Kennedy Unit6
No ratings yet
IT350 Kennedy Unit6
7 pages
MYID Kiosk User Guide V1.0
No ratings yet
MYID Kiosk User Guide V1.0
34 pages
PostgreSQL Solaris PKG Installation
No ratings yet
PostgreSQL Solaris PKG Installation
8 pages
How The CBO Works: Jonathan Lewis WWW - Jlcomp.demon - Co.uk
No ratings yet
How The CBO Works: Jonathan Lewis WWW - Jlcomp.demon - Co.uk
37 pages
Satender Kumar's CV: MCA, .Net Skills
No ratings yet
Satender Kumar's CV: MCA, .Net Skills
2 pages
BW465 EN Col20 CO A4
No ratings yet
BW465 EN Col20 CO A4
31 pages
Spooling
No ratings yet
Spooling
3 pages
System Integration & Architecture Guide
No ratings yet
System Integration & Architecture Guide
1 page
QNX Admin Book
No ratings yet
QNX Admin Book
230 pages
Tools and Techniques of SSAD
88% (8)
Tools and Techniques of SSAD
37 pages
Printix Security Privacy Guide 2024 v1
No ratings yet
Printix Security Privacy Guide 2024 v1
12 pages
Netflix Clone: Stripe & Firebase Setup
No ratings yet
Netflix Clone: Stripe & Firebase Setup
7 pages
Mobile Parcel Management (User Manual)
No ratings yet
Mobile Parcel Management (User Manual)
46 pages
Tutorial On Installing Packages - Python Packaging User Guide Documentation
No ratings yet
Tutorial On Installing Packages - Python Packaging User Guide Documentation
6 pages
2.1 - Service Models (XaaS)
No ratings yet
2.1 - Service Models (XaaS)
22 pages
Safeguards Scsem Network Assessment
No ratings yet
Safeguards Scsem Network Assessment
11 pages
PM Debug Info
No ratings yet
PM Debug Info
275 pages
Web Dev Assignment Guide
No ratings yet
Web Dev Assignment Guide
6 pages

Db2mon Education Internal

Uploaded by

Db2mon Education Internal

Uploaded by

db2mon

DB#SYSRE: Database system resource usage information OS, CPUs,

Monitor interval Read/Write Ratio

Unless ACT_PER_S / Compare with Unless RB_PER_S / Unless < 10

Ideally 100% in Times reported as

Unless < 10 Unless < 30

Too many query compiles Investigate further

For member 1, logging and buffer pool

Log disk Pool read What Next?

Slow log writes . . .

0 64975 214.4389 0.8349 179470 5.7621

. . . also help to look at what

It’s a symptom of the slow disk reads. We must

What’s “high” for a disk read time? What next?

Expensive queries Top CPU consumer (TOTAL_CPU_TIME)

MEMBER NUM_EXEC AVG_ROWS_MOD AVG_ROWS_READ AVG_ROWS_RET COL_SYNOPSIS_ROWS_INSERTED STMT_TEXT

– MON_OBJ_METRICS db2mon Prerequisite

You might also like