IMS Performance - Getting The Most Out Of Your Monitoring Technology:
Isolating And Solving Common Issues
Ed Woods
IBM Corporation

Session 9808
Tuesday, August 9th
9:30 – 10:30 AM

© 2011 IBM Corporation


Agenda
Understanding the workload
– IMS as part of a bigger picture
Real Time IMS monitoring examples
– Typical steps in problem analysis
Historical data collection considerations
Alerting and corrective actions
Integrated monitoring and management



IMS Is Part Of A Much Bigger Picture

[Figure: platforms (z/OS, Linux on z, z/VM, Linux/UNIX/Windows) and components (Web Server, Application Server, middleware and application layers, WebSphere, MQ, CICS, DB2, IMS).]

IMS works as a central component of many critical applications
Application connectivity and flow may take many forms
Understanding the flow helps drive monitoring requirements



Understanding The Flow Of IMS Processing
What Are The Potential Bottlenecks?

[Figure: flow of a message through IMS, annotated with potential bottlenecks.
Inbound paths: CICS threads, IMS Connect, APPC, OTMA, Telnet (connection bottlenecks, network delays).
IMS: Message In, IMS Control Region (queues and scheduling), IMS Message Region and BMP Regions (application init and execution), Message Out (network delays).
IMS DLI and IRLM: database and buffer pool (BP) I/O delays, lock conflicts.
DB2 Subsystem: threads (connection bottlenecks), database and BP I/O delays, lock conflicts.]



Monitoring Information
Real Time versus Historical versus Alerts

A complete monitoring approach will commonly require elements of each of the following:
– Real time performance and availability
• Current resource utilization, availability, and status
– Historical performance and availability
• Detailed historical performance and availability information
• Interval historical information for trending and analysis
– Alerts and Automation
• Alert notification of critical performance and availability issues
• Notification of alerts (visual or via other means)
• Automated corrective action (where appropriate)



Creating A Consolidated Monitoring Strategy To Analyze IMS Processing And Bottlenecks
Managing and analyzing IMS performance depends upon an understanding of the flow of the workload
– What is the workload?
– What is the flow of the workload?
– Where are the potential workload bottlenecks?
– If the workload is bottlenecked, to what extent?
Build a monitoring strategy to focus on key metrics
– Transaction response time – with application grouping
– Transaction rate information at various levels
• IMS transaction response time correlated with transaction rate
• Transaction enqueue/dequeue rate at various levels
– Enqueue/dequeue rate at the system level, OTMA level, Fast Path level
– Bottleneck analysis (wait states for the system and by workload group)
– Transaction queue depth
• Queuing at the system level and the transaction level
• Queuing at other levels (FP BALG, MSC link, etc.)
– Dependent region processing (region occupancy)



Examples Of Typical IMS Performance And
Availability Challenges
Poor IMS response time, transaction queuing, and/or bottlenecks
– IMS transactions queued
– IMS scheduling delays
– IMS application performance/system bottlenecks
IMS connection bottlenecks
– CICS/DBCTL connection bottlenecks
– Network delays
– Delays related to IMS Connect, OTMA, APPC, etc.
IMS database and subsystem delays
– IMS database delays
• High I/O, poor BP performance and IMS lock conflicts
External subsystem (DB2) delays – elongate IMS application time
– DB2 thread connection issues
– DB2 SQL delays
– DB2 database I/O delays and BP performance
– DB2 lock conflicts



Understanding The Workload
Response Time Analysis

Response Time Analysis (RTA) provides critical information on workload flow, issues, and outliers
RTA does several things
– Captures detailed response time data from IMS and stores it in user-definable groups
• Consider grouping related workload for analysis purposes
– RTA measures queuing and service times within IMS
• Input queue time, Processing time, Output queue time
– Groups work in conjunction with Bottleneck Analysis
RTA group considerations
– Focus user-defined groups on key workload
• Loved ones and problem children



Use Response Time Analysis To Understand Transaction Performance And To Identify Potential Issues

Analyze transaction response time over various time intervals
– Input queue time
– Processing time
– Output queue time

Where is the issue?

RTA will show transaction response time for workload groups, broken down by component, and various time intervals. Identify the transactions with the longest response times.



Monitor The Flow Of The Workload
Use Response Time Analysis To Identify Problems And Outliers
Use RTA to understand the flow of the workload
– Response Time Analysis broken out by component
– Monitor message counts and rates


If RTA Indicates An Elongation Of Response Time
Look At Transaction Rates And Transaction Queuing
The IMS Health workspace focuses on many key rate metrics
– Enqueue/dequeue rates
– CPU rates
– Real time indicators at the system level of transaction rates and queuing
– Enqueue/dequeue rates by category for the system
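
To make the relationship between enqueue rate, dequeue rate, and queuing concrete, here is a minimal Python sketch. It assumes two snapshots of cumulative message counters taken a known number of seconds apart. The dictionary keys and the sample numbers are hypothetical and do not come from the IMS Health workspace.

```python
def rates(prev, curr, interval_seconds):
    """Turn two cumulative counter snapshots into per-second rates.

    prev and curr are dicts with hypothetical keys "enqueued" and
    "dequeued" holding cumulative message counts.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    enq = (curr["enqueued"] - prev["enqueued"]) / interval_seconds
    deq = (curr["dequeued"] - prev["dequeued"]) / interval_seconds
    return {
        "enqueue_rate": enq,
        "dequeue_rate": deq,
        # Positive growth means work arrives faster than it is processed.
        "queue_growth": enq - deq,
    }


before = {"enqueued": 10_000, "dequeued": 9_950}
after = {"enqueued": 10_600, "dequeued": 10_250}
r = rates(before, after, interval_seconds=60)
```

A sustained positive `queue_growth` alongside longer response times suggests that transactions are queuing, which is the cue to drill down into transaction and region detail.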



Further Analysis – Are Transactions Queued?
Drill Down For More Detail

From the navigation tree go to Transaction Summary
Look at Transactions by state and click the link for drill down detail



IMS Dependent Region Display
Understanding Scheduling And Processing Delays
High region occupancy may be an indication of application delays. It may result in higher response time, scheduling delays, and transaction queues.

Callouts on the region display:
– What transaction, PSB, and how many calls?
– How busy is the region?
– Transaction elapsed time
– Input queue time



Where Is The Bottleneck?
Use Bottleneck Analysis To Identify Waits By Category
Bottleneck analysis will help identify workload bottlenecks

Bottleneck analysis does a detailed analysis of IMS workload and determines where the workload is spending its time. Delay percentages are broken out for short term and long term intervals.

% delay by category
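
One way to picture short term and long term delay percentages is as two sliding windows over periodic state samples. The sketch below is an assumption about the general technique, not a description of how the product is implemented. The class name, window sizes, and wait state labels are illustrative.

```python
from collections import Counter, deque


class WaitSampler:
    """Keep short-term and long-term windows of sampled wait states."""

    def __init__(self, short_window, long_window):
        # deque(maxlen=...) silently discards the oldest sample when full.
        self.short = deque(maxlen=short_window)
        self.long = deque(maxlen=long_window)

    def record(self, state):
        """Record the state observed at one sampling tick."""
        self.short.append(state)
        self.long.append(state)

    @staticmethod
    def _percentages(window):
        total = len(window)
        if total == 0:
            return {}
        return {state: 100.0 * n / total for state, n in Counter(window).items()}

    def short_term(self):
        return self._percentages(self.short)

    def long_term(self):
        return self._percentages(self.long)


sampler = WaitSampler(short_window=4, long_window=8)
for state in ["Using CPU", "Using CPU", "Database I/O", "SQL Call",
              "SQL Call", "SQL Call", "Wait for MPP", "SQL Call"]:
    sampler.record(state)
```

Comparing the two windows shows whether a delay is new (high in the short term only) or chronic (high in both).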



Monitor IMS Connect Processing
Track Transaction Level Response Time

IMS Connect monitoring provides detailed transaction level response time information.

Note – Detailed IMS Connect monitoring requires IMS Connect Extensions.


Understanding The Impact Of The Network On IMS Response Time

Including network monitoring detail provides a more complete analysis of IMS response time
– OMEGAMON XE For Mainframe Networks: network time for IMS transactions
– OMEGAMON XE For IMS: IMS host response time including queue and processing time for the transaction



IMS I/O Bottlenecks And Contention
Monitor I/O delays and bottlenecks
– Database I/O
– IMS dataset I/O
– LGMSG SHMSG I/O
Monitor BP usage and hit ratios
Bottleneck analysis shows I/O delays
Database information (including HALDB and Fast Path support)



IMS Lock Analysis Information In The Tivoli Portal

More detailed analysis of lock holders/waiters, and full support for both IRLM and PI locking in the TEP

Lock owner/waiters

Drill into application detail



Where Is The Bottleneck?
Use Bottleneck To Analyze Where The Workload May Be Bottlenecked

[Screen capture: two overlapping KI2PSDX2 "Bottlenecks Analysis for Group ATM" panels for IMSA (10/09/05 13:31), listing short term and long term wait percentages by wait reason. Elapsed time 17:24 MN; samples taken: 281 (short), 2026 (long); sampling interval: 5 tenths-sec. Legible category totals (short term % / long term %): Using CPU 15.0 / 16.2; Scheduling Waits 7.9 / 10.9; IMS Activity 10.0 / 9.3; MVS Waits 33.2 / 32.0; ESS Waits 26.5 / 23.8.]

Bottleneck Analysis breaks workload into components (for example):
– Using CPU/Waiting for CPU
– Scheduling Waits
– IMS Iwaits
– Database Waits
– z/OS system waits
– Waits for DB2 or MQ

Use Bottleneck Analysis to determine where to look next

External subsystem waits
IMS Historical Performance And Availability Analysis
Categories Of History Data Collection

Interval summary (with some detail): EPILOG Historical
– Historical analysis of response, bottlenecks and IMS resources
– Stored in VSAM Epilog Data Store (EDS) by group and time interval
Detail records: TRF Historical
– Detailed transaction & database data – individual transactions
– Detailed performance analysis & chargeback
Recent detail: Near Term Historical
– Detail on recent transaction execution
Interval snapshot trending: Tivoli Enterprise Portal Historical
– Tivoli Data Warehouse history
– Use for trending analysis



Near Term History Of IMS Transactions

Manage near term history collection

Near term history with drill down for more detail



Use History To Track And Trend Key IMS Performance Indicators

Use the Tivoli Portal to collect performance history data for such things as IMS Bottlenecks, OTMA, Response time analysis, IMS system statistics, IMS transaction status



IMS Historical Performance Analysis Workspace
Plot chart analysis of key IMS performance metrics

Plot charts of history by time interval. Use for trend analysis.
– Transactions by status
– IMS Bottlenecks
– Response time and processing rate
– Enqueue/dequeue rates


Use Chart Functions For Statistical Analysis
Are We Trending The Wrong Way?

Baseline analysis and arithmetic functions

Area plot charts provide a different perspective of history


Benefits Of An Integrated Alert Management Methodology
Improved ability to manage increasingly complex composite applications
– Enables an integrated approach to the management of subsystems, platforms, and application components
Reduce time to problem resolution
– Identify potential issues more rapidly
Improved event management and problem isolation
– More meaningful and useful problem alerts
Improved event correlation and management
– Eliminate the “noise” and focus on key issues
Superior performance analysis capabilities
– Monitor and manage based upon actual information, not anecdotal data



Alert Example Using The Tivoli Enterprise Portal
To Integrate Essential Performance Information And Manage Alerts
Tivoli Enterprise Portal (the TEP) enables integrated alert and automation capabilities
– IMS as part of the bigger picture
– IMS is an essential component of many mission critical applications
– Performance and availability management requires an integrated approach
– Icons indicate an alert



Situations – Usage And Benefits
Highlight Performance And Availability Issues

Flyover pop-up shows the name of the ‘situation’ alert
Click to see alert detail



Categories Of Typical Situation Alerts
Types Of Alerts
– Availability
• Application availability
• Essential infrastructure availability
• Subsystem availability
– Performance
• Subsystem performance
• Application performance
• Identification of performance issues
– Resource
• Subsystem resource utilization
• Application resource utilization



Alert Notification
Types And Options

Visual View – Custom Views – Enterprise View
– Red/Yellow indicators and icons in Tivoli Enterprise Portal or TBSM displays
Console messages
– Example - Issuing messages and commands to the z/OS console
– Use this as a mechanism to feed other automation
Paging and emails
– Issue commands to feed paging systems
– Use 3rd party tools such as Postie to issue emails from the command prompt
– Console messages may be used to feed email systems
SNMP traps and alerts
– Issue SNMP traps from the command prompt using situations or policies
Netcool/OMNIbus events
– OMNIbus acts as an event correlation engine
– May receive events via traps or the EIF interface
Alerts to 3rd party (non-IBM) tools



Application Performance Example
Situations To Monitor Response Time

Using boolean logic allows the alert to be application sensitive.

A single situation can handle multiple application groups, if needed.

Note – this is the RTA group name

Consider alerting on R0 versus R1 response time. R0 only considers input queue and processing time, and excludes output queue time.

Consider using the persistence option to filter out outliers
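
The combination of group-sensitive boolean logic and persistence can be sketched in Python as follows. This is not situation editor syntax. The group names, thresholds, and class are hypothetical, and the definition of R1 as R0 plus output queue time is an assumption, since the slide only states what R0 excludes.

```python
def r0(input_queue, processing):
    """R0 as described on the slide: input queue time plus processing time."""
    return input_queue + processing


def r1(input_queue, processing, output_queue):
    """Assumption: R1 is R0 plus output queue time."""
    return r0(input_queue, processing) + output_queue


# Hypothetical per-group R0 thresholds in seconds.
THRESHOLDS = {"ATM": 1.0, "WEB": 2.0}


def condition(group, r0_seconds):
    """True when the group has a threshold and its R0 value exceeds it."""
    return group in THRESHOLDS and r0_seconds > THRESHOLDS[group]


class PersistentSituation:
    """Raise an alert only after the condition holds for N consecutive samples."""

    def __init__(self, persistence):
        self.persistence = persistence
        self.streak = {}  # consecutive true evaluations per group

    def evaluate(self, group, r0_seconds):
        if condition(group, r0_seconds):
            self.streak[group] = self.streak.get(group, 0) + 1
        else:
            self.streak[group] = 0
        return self.streak[group] >= self.persistence


situation = PersistentSituation(persistence=3)
atm_samples = [0.5, 1.5, 0.5, 1.5, 1.5, 1.5]
alerts = [situation.evaluate("ATM", value) for value in atm_samples]
```

With a persistence of 3, the single spike in the second sample does not raise an alert. Only the run of three consecutive slow samples at the end does.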



Application Performance Example
Monitoring Transaction Level Queuing

Monitor the queuing and status of the PART transaction.

If PART is queued or the queue depth is beyond a certain level, generate an alert.
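
The alert logic for a single transaction can be written as a simple filter. In this Python sketch the row layout, the `"QUEUED"` status value, and the depth limit of 25 are all hypothetical. They stand in for whatever attributes and thresholds the real situation uses.

```python
def queue_alerts(rows, tran="PART", max_queue_depth=25):
    """Return the rows for one transaction that should raise an alert.

    Each row is a dict with hypothetical keys "tran", "status", and
    "queue_depth". A row alerts when the transaction is queued OR its
    queue depth is beyond the limit.
    """
    return [
        row for row in rows
        if row["tran"] == tran
        and (row["status"] == "QUEUED" or row["queue_depth"] > max_queue_depth)
    ]


rows = [
    {"tran": "PART", "status": "ACTIVE", "queue_depth": 3},
    {"tran": "PART", "status": "QUEUED", "queue_depth": 12},
    {"tran": "PART", "status": "ACTIVE", "queue_depth": 40},
    {"tran": "INV1", "status": "QUEUED", "queue_depth": 90},
]
flagged = queue_alerts(rows)
```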



Subsystem Performance Example
Monitor Dependent Region Processing

Region occupancy measures how busy the message region is.

Create situations to monitor region occupancy by region type and/or region name.
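
As a rough illustration, region occupancy can be treated as the percentage of an interval that a region spends processing work. That definition, the region names, and the per-type limits in this Python sketch are assumptions for the example, not values taken from the product.

```python
def region_occupancy(busy_seconds, interval_seconds):
    """Percentage of the interval the region spent processing work."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return 100.0 * busy_seconds / interval_seconds


# Hypothetical occupancy limits (percent) by region type.
OCCUPANCY_LIMITS = {"MPP": 80.0, "IFP": 90.0}


def busy_regions(regions, interval_seconds):
    """Return names of regions whose occupancy exceeds the limit for their type.

    regions is a list of (name, region_type, busy_seconds) tuples.
    """
    flagged = []
    for name, region_type, busy in regions:
        limit = OCCUPANCY_LIMITS.get(region_type)
        if limit is not None and region_occupancy(busy, interval_seconds) > limit:
            flagged.append(name)
    return flagged


regions = [("MPP01", "MPP", 54.0), ("MPP02", "MPP", 30.0), ("IFP01", "IFP", 45.0)]
flagged = busy_regions(regions, interval_seconds=60)
```

Keeping a separate limit per region type mirrors the suggestion to build situations by region type and/or region name.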



Subsystem Performance Example
Monitoring Queuing At The Subsystem Level

This situation will alert on transaction queue depth for the subsystem.

Note – this is a subsystem level number. For more granular queue alerts you may use other situation examples.



Application Availability Example
Alert On Critical Transactions In A Stopped Status

Alerts may be set at the transaction level for status.

Logic may be added for time of day and day of week.

Various transaction statuses may be alerted on.



Create Situation Alerts When Certain Bottleneck Analysis Wait Percentages Exceed A Threshold

You may create situation alerts incorporating IMS wait reasons and percentages as part of the situation logic

For example:
Alert if DB wait time > n%
Alert if DB2 wait time > n%
Alert if Sched wait > n%
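
The "alert if wait > n%" pattern amounts to comparing each observed wait percentage with its own limit. In the Python sketch below, the category names and the values chosen for n are hypothetical placeholders.

```python
# Hypothetical limits: wait category -> maximum acceptable percentage (n).
WAIT_THRESHOLDS = {
    "Database Waits": 20.0,
    "DB2 Waits": 25.0,
    "Scheduling Waits": 10.0,
}


def wait_alerts(wait_percentages, thresholds=WAIT_THRESHOLDS):
    """Return (category, observed, limit) for each category over its limit."""
    return [
        (category, wait_percentages[category], limit)
        for category, limit in thresholds.items()
        if wait_percentages.get(category, 0.0) > limit
    ]


observed = {"Database Waits": 5.0, "DB2 Waits": 26.5, "Scheduling Waits": 7.9}
alerts = wait_alerts(observed)
```

Categories that are missing from the observed data are treated as 0% here, so they never alert. A real deployment might prefer to flag missing data separately.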



Create An Integrated View Of The Enterprise
Ease Problem Notification/Isolation

[Figure: the IMS processing flow diagram shown earlier (CICS threads, IMS Connect, APPC, OTMA, Telnet, IMS control region queues and scheduling, message and BMP regions, IMS DLI, IRLM, DB2 subsystem), annotated with the monitors that cover it: OMEGAMON CICS, OMEGAMON Mainframe Networks, OMEGAMON IMS, and OMEGAMON DB2.]



IBM solutions that integrate via the Tivoli Enterprise Portal
– z/OS Management Console: z/OS Health check
– OMEGAMON XE on z/OS: z/OS & USS
– IBM Tivoli NetView for z/OS V5.3: NetView for z/OS
– OMEGAMON XE for Mainframe Networks: Network
– OMEGAMON XE for DB2 PE/PM: DB2
– OMEGAMON XE for CICS: CICS
– OMEGAMON XE for IMS: IMS
– OMEGAMON XE for Storage: Storage
– OMEGAMON XE for Messaging: WebSphere MQ
– ITCAM for WAS: WebSphere Appl Server
– OMEGAMON XE on z/VM and Linux: z/VM & Linux on z
– IBM Tivoli Monitoring (ITM) & ITCAM: Distributed Monitoring
– SA for z/OS: Automation
– Advanced Audit for DFSMShsm: DFSMS Audit
– Advanced Catalog Management for z/OS: Catalog Management
– Tivoli Decision Support for z/OS: SMF trend analysis reports
Use OMEGAMON And The Tivoli Enterprise Portal To Consolidate Performance Analysis - Example

In the integrated performance view pull together detailed performance information for multiple components
– Integrated graphic overview
– OMEGAMON Mainframe Networks
– OMEGAMON CICS
– OMEGAMON DB2
– OMEGAMON IMS



Summary
It’s always important to begin with an understanding of the workload
Have monitoring in place for key resources
Consider History options along with real time
Alerting can be important
Integrated monitoring and management enables the ‘Big Picture’ view



Check Out My Blog
http://tivoliwithaz.blogspot.com

Visit my blog on IBM Tivoli performance and availability management of System z. Lots of information on OMEGAMON, Automation, and many things Tivoli…



Thank You!!
