Optimizing Query Performance: Hanoi University of Technology
Optimizing Query Performance: Hanoi University of Technology
Optimizing Query Performance: Hanoi University of Technology
Query Performance
Vu Tuyet Trinh
trinhvt-fit@mail.hut.edu.vn
Hanoi University of Technology
1
Overview of Query Processing
Transact-SQL Parsing
Transact-SQL ParsingProcess
Process
Standardization
StandardizationProcess
Process
Query
QueryOptimization
Optimization
Compilation
Compilation
Database Results
Results
DatabaseAccess
AccessRoutines
Routines Set
Set
Microsoft
Outline
Microsoft
Query Optimizer Issues
Function
Cost-Based Optimization
How It Works
Phases
Caching the Execution Plan
Setting a Cost Limit
Microsoft
Cost-Based Optimization
Microsoft
Query Optimization Phases
Query Analysis
Identifies the search and join criteria of the query. By limiting
the search, the query optimizer minimizes the number of rows
that are
processed
Index Selection
Determines whether an index or indexes exist. Then, there is
an assessment of the usefulness of the index or indexes.
Usefulness of an index is determined by how many rows will
be returned
Assesses the usefulness of the index or indexes
Join Selection
Evaluates which join strategy to use by considering a number
of factors: selectivity, density, memory required to process the
query.
Microsoft
Caching the Execution Plan
Storing a Execution Plan in Memory
One copy for all serial executions
Another copy for all parallel executions
Using an Execution Context
An existing execution plan is reused, if one exists
A new execution plan is generated, if one does not exist
Recompiling Execution Plans
Changes can cause execution plan to be inefficient or invalid
For example, a large number of new rows added
ALTER TABLE/VIEW
UPDATE STATISTICS
Dropping an INDEX that is used
Explicit sp_recompile
Microsoft
Setting a Cost Limit
Microsoft
Obtaining Execution Plan Information
Microsoft
STATISTICS Statements Output
Statement
Statement Output
OutputSample
Sample
STATISTICS
STATISTICS SQL
SQL Server
Server Execution
Execution Times:
Times:
TIME
TIME CPU
CPU time = 0 ms, elapsed time
time = 0 ms, elapsed time == 22 ms.
ms.
STATISTICS
STATISTICS Rows
Rows Executes
Executes StmtText
StmtText StmtId…
StmtId…
PROFILE
PROFILE -----------------------------------------------
-----------------------------------------------
47
47 11 SELECT
SELECT ** FROM
FROM [charge]
[charge] 16
16
WHERE
WHERE (([charge_amt]>=@1)
(([charge_amt]>=@1)
..
..
..
STATISTICS
STATISTICSIO
IO Table
Table 'member'.
'member'. Scan
Scan count
count 1,
1,
logical
logical reads
reads 23,
23, physical
physical reads
reads 0,
0,
read-ahead reads
read-ahead reads 0.0.
Microsoft
STATISTICS TIME obtains information about
the number of milliseconds required to parse,
compile, and execute each statement.
Microsoft
SHOWPLAN_ALL and
SHOWPLAN_TEXT Output
Structure of the SHOWPLAN Statement Output
Returns information as a set of rows
Forms a hierarchical tree
Represents steps taken by the query optimizer
Shows estimated values of how a query was optimized, not the actual
execution plan
Details of the Execution Steps
* Which indexes are used with which tables
* The join order of the tables
* The chosen update mode
* Worktables and other strategies
Explore:
What is the difference Between SHOWPLAN_TEXT and
SHOWPLAN_ALL Output
Microsoft
Graphically Viewing the Execution Plan
Microsoft
Elements of the Graphical Execution
Plan
Steps Are Units of Work to Process a Query
Sequence of Steps Is the Order in Which the Steps Are
Processed
Logical Operators Describe Relational Algebraic
Operation Used to Process a Statement
Physical Operators Describe Physical Implementation
Algorithm Used to Process a Statement
Microsoft
A partial list of physical operators
Filter Sort
Hash Match
Table Scan
Index Scan
Merge Join
Index Seek
Microsoft
Reading Graphical Execution Plan Output
Query Plan Sequence
Sequence of
of Steps
Steps
Member.corp_no
Cost 9%
Argument:
OBJECT: ([credit].[dbo].[member].[fname]),
SEEK: ([member],[firstname] >=‘Rb’ AND
[member],[firstname] <‘T’) ORDERED
Microsoft
Using the Bookmark Lookup Operation
Analyzing the Query Plan
Typically used after all steps have been processed
Retrieving Rows
Row identifiers to find the corresponding row in a
heap.
Clustering Keys to find the corresponding row in a
clustered index
Observing the Details
A bookmark label used to find the row in the table or
clustered index
The table name or clustered index name from which the
row is found
The WITH PREFETCH clause, if the query optimizer
determines that readahead is the best way to find
bookmarks in the table or clustered index
Determining When the Bookmark Lookup Operator is Used
Queries containing the IN clause or the OR operator
Microsoft
Using an Index to Cover a Query
Microsoft
Introduction to Indexes That Cover a
Query
Indexes That Cover Queries Retrieve Data Quickly
Only Nonclustered Indexes Cover Queries
Indexes Must Contain All Columns Referenced in the
Query
No Data Page Access Is Required
Indexed Views Can Pre-Aggregate Data
Microsoft
Locating Data by Using Indexes That
Cover a Query
Example of Single Page Navigation
Example of Partial Scan Navigation
Example of Full Scan Navigation
Microsoft
Example of Single Page Navigation
SELECT
SELECT lastname,
lastname, firstname
firstname
FROM member
FROM member
WHERE
WHERE lastname
lastname == 'Hall'
'Hall'
Index Pages Akhtar Sarah
Lang Eric
… … Non-Leaf
… … Level
Akhtar Sarah Lang Eric
… … … …
Ganio Jon … …
… … … …
Leaf Level
(Key Value)
Akhtar Sarah Ganio Jon Lang Eric
Barr … Hall Don Martin …
Barr … … Hart Sherri … Martin … …
Borm … Jones Amy Martin …
Buhl … Jones Beverly Moris …
Data Pages
Microsoft
SQL Server goes through the following steps to
retrieve the information:
1. Traverses the index tree comparing the last name
Hall to the key values.
2. Continues to traverse the index until it reaches the
first page of the leaf level containing the key value
Hall.
3. Returns the qualifying rows without accessing the
data pages, because the lastname and firstname key
values are contained in the leaf level.
Microsoft
Example of Partial Scan Navigation
USE
USE credit
credit
SELECT
SELECT lastname,
lastname, firstname
firstname
FROM member
FROM member
Index Pages Akhtar … WHERE lastname BETWEEN 'Funk' AND 'Lang'
WHERE lastname BETWEEN 'Funk' AND 'Lang'
Jordan … Non-Leaf
… … Level
Akhtar … Jordan …
Chai … Lang …
Dunn … Morgan …
Ganio … Smith …
Leaf Level
(Key Value)
Akhtar … Chai … Dunn … Ganio … Jordan … Lang … Morgan … Smith …
Barr … Con … Dunn … Hall … Kim … Martin … Nash … Smith …
Barr … Con … Fine … Hart … Kim … Martin … Nay … Smith …
Borm … Cox … Fort … Jones … Koch … Martin … Ota … Smith …
Buhl … Dale … Funk … Jones … Koch … Moris … Rudd … Smith …
Data Pages
Microsoft
SQL Server goes through the following steps to
retrieve the information:
1. Traverses the index tree.
2. Starts reading leaf-level pages at the page that
contains the first occurrence of the last name
Funk. Data in the leaf level is sorted in
ascending order.
3. Reads the range of leaf-level pages through to
the last name of Lang. At this time, the partial
scan is completed.
4. Returns the qualifying rows without accessing
the data pages, because the leaf level is
scanned for last names between Funk and Lang
Microsoft
Example of Full Scan Navigation
USE
USE credit
credit
SELECT
SELECT lastname,
lastname, firstname
firstname
FROM member
FROM member
Index Pages Akhtar
…
Martin Non-Leaf
Level
Akhtar Martin
Ganio Smith
… …
Leaf Level
(Key Value)
Akhtar … Chai … Dunn … Ganio … Jordan … Lang … Morgan … Smith …
Barr … Con … Dunn … Hall … Kim … Martin … Nash … Smith …
Barr … Con … Fine … Hart … … Kim … Martin … Nay … Smith …
Borm … Cox … Fort … Jones … Koch … Martin … Ota … Smith …
Buhl … Dale … Funk … Jones … Koch … Moris … Rudd … Smith …
Data Pages
Microsoft
To retrieve the information, SQL Server:
1. Traverses the index tree.
2. Reads the leaf-level pages, starting with the
first page, and scans through all of the leaf-
level pages until it reaches the last page in the
leaf-level.
3. Returns the qualifying rows without
accessing the data pages because the leaf-
level is scanned.
Microsoft
Identifying Whether an Index Can Be
Used to Cover a Query
All Necessary Data Must Be in the Index
A Composite Index Is Useful Even if the First Column Is
Not Referenced
A WHERE Is Not Necessary
A Nonclustered Index Can Be Used if It Requires Less
I/O Than a Clustered Index Containing a Column
Referenced in the WHERE Clause
Indexes Can Be Joined to Cover a Query
Microsoft
Determining Whether an Index Is Used
to Cover a Query
Observing the Execution Plan Output
Displays the phrase “Scanning a non-clustered index
entirely or only a range”
Comparing I/O
Nonclustered index
Total number of levels in the non–leaf level
Total number of pages that make up the leaf
level
Total number of rows per leaf-level page
Microsoft
Guidelines for Creating Indexes That
Cover a Query
Add Columns to Indexes. You may want to add columns to some
indexes that:
• Cover more than one query.
• Contribute toward covering some of your more common queries.
• Are referenced frequently.
• Do not significantly add to the key size.
Minimize Index Key Size, avoid specifying key values that are too
wide
Maintain Row-to-Key Size Ratio. If the size of the index key
increases relative to the row size, query performance may be
affected
Microsoft
Indexing Strategies
Microsoft
Evaluating I/O for Queries That Access
a Range of Data
Assume the following when comparing the
different methods:
There are 1 million rows, and 96 rows per page.
The total number of pages is 10,147.
There is no clustered index.
100,000 rows fall within the $20.00 to $30.00
range.
367 index rows fit on a nonclustered index
leaf page.
Microsoft
SELECT
SELECT charge_no
charge_no
FROM
FROM charge
charge
WHERE
WHERE charge_amt
charge_amt BETWEEN
BETWEEN 20
20 AND
AND 30
30
Access
Accessmethod
method Page
PageI/O
I/O
Table
Tablescan
scan 10,417
10,417
Clustered
Clusteredindex
indexon
onthe
thecharge_amt
charge_amtcolumn
column 1042
1042
Nonclustered
Nonclustered index
index on the charge_amt column 100,273
Each data page ison
readthe charge_amt
multiple times column 100,273
Composite
Compositeindex
indexon oncharge_amt,
charge_amt,charge_no
charge_no 273
273
columns
columns Covering Query
Microsoft
Indexing for Multiple Queries
Example
Example 11
USE
USE credit
credit
SELECT
SELECT charge_no,
charge_no, charge_dt,
charge_dt, charge_amt
charge_amt
FROM
FROM charge
charge
WHERE
WHERE statement_no
statement_no == 19000
19000 AND
AND member_no
member_no == 3852
3852
Example
Example 22
USE
USE credit
credit
SELECT
SELECT member_no,
member_no, charge_no,
charge_no, charge_amt
charge_amt
FROM
FROM charge
charge
WHERE
WHERE charge_dt
charge_dt between
between '07/30/1999'
'07/30/1999'
AND
AND '07/31/1999'
'07/31/1999' AND
AND member_no
member_no == 9331
9331
Microsoft
Compares the query performance of
Examples 1 and 2
Microsoft
Guidelines for Creating Indexes
Determine the Priorities of All of the Queries
Determine the Selectivity for Each Portion of the WHERE Clause of
Each Query
Determine Whether to Create an Index
Based on priority, selectivity, column width
Identify the Columns That Should Be Indexed
Determine the Best Column Order of Composite Indexes
Determine What Other Indexes Are Necessary
Test the Performance of the Queries
SET SHOWPLAN ON SET STATISCTICS IO ON SET
STATISTICS TIME ON
Microsoft
Overriding the Query Optimizer
Microsoft
Determining When to Override the
Query Optimizer
Limit Optimizer Hints
Leads Optimizer in a certain direction
Use only if Optimizer is not doing a good job
Explore Other Alternatives Before Overriding the Query
Optimizer by:
Updating statistics
Recompiling stored procedures
Reviewing the queries or search arguments
Evaluating the possibility of building different indexes
Microsoft
Using Hints and SET FORCEPLAN
Statement
Table Hints
Forces use of an Index
Each table hint can be specified only once, although you can
have multiple table hints
The WITH clause must be specified next to the table name
Join Hints
Forces what time of JOIN to use. E.g., MERGE-JOIN
Microsoft
Query Hints
Forces a query to use a particular aspect of the plan
Each query hint can be specified only once, although
you can have multiple query hints.
The OPTION clause must be specified with the
outermost query of the statement.
The query hint affects all operators in the statement.
If a UNION is involved in the main query, only the last
query involving a UNION operator can have the
OPTION clause.
SET FORCEPLAN Statement
Microsoft
Confirming Query Performance After
Overriding the Query Optimizer
Verify That Performance Improves
Document Reasons for Using Optimizer Hints
Retest Queries Regularly
Microsoft
Recommended Practices
Use
Use the
the Query
Query Governor
Governor to
to Prevent
Prevent Long-Running
Long-Running Queries
Queries from
from
Consuming
Consuming System
System Resources
Resources
Have
Have aa Thorough
Thorough Understanding
Understanding of
of the
the Data
Data and
and How
How Queries
Queries
Gain
Gain Access
Access to
to Data
Data
Create
Create Indexes
Indexes That
That Cover
Cover the
the Most
Most Frequently
Frequently Used
Used Queries
Queries
Establish
Establish Indexing
Indexing Strategies
Strategies for
for Individual
Individual and
and Multiple
Multiple Queries
Queries
Avoid
Avoid Overriding
Overriding the
the Query
Query Optimizer
Optimizer
Microsoft
Database Tuning Advisor
Microsoft
Workload and Results
.trc file
.sql script Table
Database
Tuning
Advisor
Databases
Microsoft
Demo
Microsoft
Database Maintenance Plans
Microsoft
Database Maintenance Plans
Microsoft
Database Maintenance Plans
Microsoft
Database Maintenance Workflow
Microsoft
Dynamic Management Views
Microsoft
SQL Profiler
Debug statements
Analyze performance
Stress testing
Microsoft
Monitoring Tools
Microsoft
Microsoft
Summary
Microsoft 53