Introduction to Orthogonal Range Searching
Orthogonal range searching is an extension of one-dimensional range searching to multiple
dimensions. It is useful for geometric applications and database indexing. The problem
involves querying a dataset for points that fall within a given d-dimensional range.
For example, a five-dimensional orthogonal range query could be:
“List all employees with salary between $50,000 and $75,000, age above 50, who
made more than $500,000 in sales in each of the last three years.”
Orthogonal range searching is often used for preprocessing queries to isolate relevant data
points before answering the query efficiently.
Canonical Interval Decomposition and Recursive Approach
To solve this efficiently, we use orthogonal range trees, which recursively decompose the
query interval into a hierarchical structure.
1. Recursive Structure:
o A d-dimensional range tree consists of nodes that store information based on
the first coordinate.
o Each node in the tree stores a secondary (d-1)-dimensional tree for the
remaining coordinates.
2. Query Execution:
o First, perform a 1D range search on the first coordinate, which takes O(logn).
o For each retrieved node, perform a (d-1)-dimensional range search
recursively.
o The final result is obtained by collecting points from all relevant subproblems.
Orthogonal range trees are a hierarchical data structure that enables efficient range
searching in multi-dimensional space.
They use recursive decomposition, with each tree node containing a secondary
range tree for the remaining dimensions.
Query time: O((logn)d+k) (output-sensitive).
Space complexity: O(n(logn)d−1).
This method is widely used for database indexing, geometric applications, and
multidimensional data queries.
Step 1: Understanding the Input
We are given a set of 9 points in 2D space.
Each point has two coordinates (x,y)(x, y)(x,y).
For example, from the image, the points appear to be:
(1,1),(1,5),(2,8),(3,3),(3,5),(4,6),(6,4),(7,6),(8,7),(7,9)
Our goal is to build a 2D orthogonal range tree, which consists of:
1. Primary Search Tree on X-coordinate (1D Range Tree)
2. Secondary Search Trees (Y-trees) for Each Node
Step 2: Constructing the Primary Search Tree on X-Coordinates
1. Sort the points by the first coordinate (X-values):
(1,1),(1,5),(2,8),(3,3),(3,5),(4,6),(6,4),(7,6),(7,9),(8,7)
Build a balanced BST on X-values (Median-based splitting):
o Choose the median X-value as the root.
o Left subtree contains points with smaller X-values.
o Right subtree contains points with larger X-values.
From the image, the X-tree structure is:
(6)
/ \
(3) (8)
/ \ /
(1) (5) (7)
/\ / \
(0) (2) (4) (7)
Each node in this tree stores all points in its subtree.
Step 3: Constructing the Secondary Search Trees (Y-Trees)
Each node in the X-tree stores a Y-tree, which is a BST of all points in its subtree sorted by
Y-values.
Example: Y-Tree at the Root (6)
The root (6) has all points in its subtree:
(1,1),(1,5),(2,8),(3,3),(3,5),(4,6),(6,4),(7,6),(7,9),(8,7)
Sorting by Y-values:
(1,1),(3,3),(6,4),(3,5),(1,5),(4,6),(7,6),(8,7),(2,8),(7,9)(1,1), (3,3), (6,4), (3,5), (1,5), (4,6),
(7,6), (8,7), (2,8), (7,9)(1,1),(3,3),(6,4),(3,5),(1,5),(4,6),(7,6),(8,7),(2,8),(7,9)
Constructing a balanced Y-tree:
(5)
/ \
(3) (7)
/ \ / \
(1) (4) (6) (8)
\
(9)
This means that for each node in the X-tree, we can quickly search for Y-values within a
range.
Step 4: Performing a Range Query
Now that the X-tree and Y-trees are built, we can perform range queries efficiently.
Example Query: Find all points in the range
[3≤x≤7]AND[4≤y≤7]
Step 1: Search in X-Tree
Traverse the X-tree to find all nodes covering 3≤x≤7
Relevant nodes: (3), (5), (6), (7).
Step 2: Search in Y-Trees
At each relevant X-node, search its Y-tree for points within 4≤y≤7
Extract results.
Final Query Complexity
Searching in X-tree: O(logn)
Searching in each Y-tree: O(logn)
Total query time: O(\log^2 n + k), where kkk is the number of points found.
Final Summary
1. Sort points by X-coordinates.
2. Build a balanced BST on X-values (Primary Tree).
3. For each node in the X-tree, build a Y-tree (Secondary Search Structure).
4. Perform range queries efficiently using both X and Y trees.
First, we sort them by their X-coordinates:
(1,1),(1,5),(2,8),(3,3),(3,5),(4,6),(6,4),(7,6),(7,9),(8,7)(1,1), (1,5), (2,8), (3,3), (3,5), (4,6),
(6,4), (7,6), (7,9), (8,7)(1,1),(1,5),(2,8),(3,3),(3,5),(4,6),(6,4),(7,6),(7,9),(8,7)
Step 2: Choose the Median as the Root
To ensure the tree remains balanced, we select the median of the sorted list as the root node.
The sorted list has 10 points (even count).
The median is the 6th element in the sorted list (0-based index 5).
The median X-coordinate is 6, so 6 is chosen as the root.
markdown
CopyEdit
(6)
Step 3: Recursively Build the Left and Right Subtrees
Now, we divide the remaining points into two halves:
Left subtree: Points with X<6X < 6X<6
Right subtree: Points with X>6X > 6X>6
Left Subtree Construction
Points with X<6X < 6X<6:
(1,1),(1,5),(2,8),(3,3),(3,5),(4,6)(1,1), (1,5), (2,8), (3,3), (3,5), (4,6)(1,1),(1,5),(2,8),(3,3),(3,5),
(4,6)
Median = 3 (middle of the sorted left half)
Root of left subtree = 3
markdown
CopyEdit
(6)
/
(3)
Now, split again:
Left of 3: (1,1),(1,5),(2,8)(1,1), (1,5), (2,8)(1,1),(1,5),(2,8) → Median = 1 (Root)
Right of 3: (3,5),(4,6)(3,5), (4,6)(3,5),(4,6) → Median = 5 (Root)
markdown
CopyEdit
(6)
/
(3)
/ \
(1) (5)
Right Subtree Construction
Points with X>6X > 6X>6:
(7,6),(7,9),(8,7)(7,6), (7,9), (8,7)(7,6),(7,9),(8,7)
Median = 7 (middle of the sorted right half)
Root of right subtree = 7
markdown
CopyEdit
(6)
/ \
(3) (7)
Now, split again:
Left of 7: (7,6)(7,6)(7,6) → No split needed.
Right of 7: (8,7)(8,7)(8,7) → No split needed.
Final X-Tree Structure
markdown
CopyEdit
(6)
/ \
(3) (7)
/ \ \
(1) (5) (8)
/\
(0) (2)
The tree is balanced.
Each node divides the X-values into two halves.
Each node stores all points in its subtree (used for Y-tree construction).
How Did We Choose 6 as the Root?
1. Sort all X-coordinates.
2. Pick the median X-value as the root to balance the tree.
3. Recursively divide left and right halves using the same median-based approach.
This method ensures the tree height is O(logn)O(\log n)O(logn), making range queries
efficient.