DEVANG
DEVANG PATEL INSTITUTEOF
PATEL INSTITUTE OFADVANCE
ADVANCETECHNOLOGY
TECHNOLOGY AND
AND RESEARCH
RESEARCH
Chapter : 7
Subject Faculties: Parallel Database Systems
Prof. Kashyap Patel
Assistant Professor,
Department of Computer Engineering.
Devang Patel Institute of Advance Technology And Research
Charotar University of Science and Technology
Outline:
• Parallel Architectures
• Data Placement
• Parallel Query Processing
• Load Balancing
• Fault-Tolerance
• Database Clusters
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
The Database Problem:
• Large volume of data use disk and large main memory
• I/O bottleneck (or memory access bottleneck)
• Speed(disk) << speed(RAM) << speed(microprocessor)
• Predictions
• Moore’s law: processor speed growth (with multicore): 50 % per year
• DRAM capacity growth : 4 × every three years
• Disk throughput : 2 × in the last ten years
• Conclusion : the I/O bottleneck worsens
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
The Solution:
• Increase the I/O bandwidth
• Data partitioning
• Parallel data access
• Origins (1980's): database machines
• Hardware-oriented bad cost-performance failure
• Notable exception : ICL's CAFS Intelligent Search Processor
• 1990's: same solution but using standard hardware components integrated in a
multiprocessor
• Software-oriented
• Standard essential to exploit continuing technology improvements
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Data Server Architecture:
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Objectives of Data Servers:
• Avoid the shortcomings of the traditional DBMS approach
• Centralization of data and application management
• General-purpose OS (not DB-oriented)
• By separating the functions between
• Application server (or host computer)
• Data server (or database computer or back-end computer)
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Parallel Data Processing:
• Three ways of exploiting high-performance multiprocessor systems:
Automatically detect parallelism in sequential programs (e.g., Fortran, OPS5)
Augment an existing language with parallel constructs (e.g., C*, Fortran90)
Offer a new language in which parallelism can be expressed or automatically inferred
• Critique
Hard to develop parallelizing compilers, limited resulting speed-up
Enables the programmer to express parallel computations but too low-level
Can combine the advantages of both (1) and (2)
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Data-based Parallelism:
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Parallel DBMS:
• Loose definition: a DBMS implemented on a tighly coupled multiprocessor
• Alternative extremes
• Straighforward porting of relational DBMS (the software vendor edge)
• New hardware/software combination (the computer manufacturer edge)
• Naturally extends to distributed databases with one server per site
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Parallel DBMS – Objectives:
• Much better cost / performance than mainframe solution
• High-performance through parallelism
• High throughput with inter-query parallelism
• Low response time with intra-operation parallelism
• High availability and reliability by exploiting data replication
• Extensibility with the ideal goals
• Linear speed-up
• Linear scale-up
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Parallel DBMS – Functional Architecture :
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Parallel DBMS Functions:
• Session manager
• Host interface
• Transaction monitoring for OLTP
• Request manager
• Compilation and optimization
• Data directory management
• Semantic data control
• Execution control
• Data manager
• Execution of DB operations
• Transaction management support
• Data management
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Parallel System Architectures:
• Multiprocessor architecture alternatives
• Shared memory (SM)
• Shared disk (SD)
• Shared nothing (SN)
• Hybrid architectures
• Non-Uniform Memory Architecture (NUMA)
• Cluster
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Shared-Memory:
DBMS on symmetric multiprocessors (SMP)
Prototypes: XPRS, Volcano, DBS3
+ Simplicity, load balancing, fast communication
- Network cost, low extensibility
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Shared-Disk:
Origins : DEC's VAXcluster, IBM's IMS/VS Data Sharing
Used first by Oracle with its Distributed Lock Manager
Now used by most DBMS vendors
+ network cost, extensibility, migration from uniprocessor
- complexity, potential performance problem for cache coherency
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Shared-Nothing:
Used by Teradata, IBM, Sybase, Microsoft for OLAP
Prototypes: Gamma, Bubba, Grace, Prisma, EDS
+ Extensibility, availability
- Complexity, difficult load balancing
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Hybrid Architectures:
• Various possible combinations of the three basic architectures are possible to obtain
different trade-offs between cost, performance, extensibility, availability, etc.
• Hybrid architectures try to obtain the advantages of different architectures:
• efficiency and simplicity of shared-memory
• extensibility and cost of either shared disk or shared nothing
• 2 main kinds: NUMA and cluster
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
NUMA:
• Shared-Memory vs. Distributed Memory
• Mixes two different aspects : addressing and memory
• Addressing: single address space vs multiple address spaces
• Physical memory: central vs distributed
• NUMA = single address space on distributed physical memory
• Eases application portability
• Extensibility
• The most successful NUMA is Cache Coherent NUMA (CC-NUMA)
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
CC-NUMA:
• Principle
• Main memory distributed as with shared-nothing
• However, any processor has access to all other processors’ memories
• Similar to shared-disk, different processors can access the same data in a conflicting
update mode, so global cache consistency protocols are needed.
• Cache consistency done in hardware through a special consistent cache interconnect
• Remote memory access very efficient, only a few times (typically between 2 and 3 times) the cost
of local access
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Cluster:
• Combines good load balancing of SM with extensibility of SN
• Server nodes: off-the-shelf components
• From simple PC components to more powerful SMP
• Yields the best cost/performance ratio
• In its cheapest form,
• Fast standard interconnect (e.g., Myrinet and Infiniband) with high bandwidth
(Gigabits/sec) and low latency
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
SN cluster vs SD cluster:
• SN cluster can yield best cost/performance and extensibility
• But adding or replacing cluster nodes requires disk and data reorganization
• SD cluster avoids such reorganization but requires disks to be globally accessible by the
cluster nodes
• Network-attached storage (NAS)
• distributed file system protocol such as NFS, relatively slow and not appropriate for database
management
• Storage-area network (SAN)
• Block-based protocol thus making it easier to manage cache consistency, efficient, but costlier
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Parallel DBMS Techniques:
• Data placement
• Physical placement of the DB onto multiple nodes
• Static vs. Dynamic
• Parallel data processing
• Select is easy
• Join (and all other non-select operations) is more difficult
• Parallel query optimization
• Choice of the best parallel execution plans
• Automatic parallelization of the queries and load balancing
• Transaction management
• Similar to distributed transaction management
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Data Partitioning:
• Each relation is divided in n partitions (subrelations), where n is a function of relation size
and access frequency
• Implementation
• Round-robin
• Maps i-th element to node i mod n
• Simple but only exact-match queries
• B-tree index
• Supports range queries but large index
• Hash function
• Only exact-match queries but small index
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Partitioning Schemes:
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Placement Directory:
• Performs two functions
• F1 (relname, placement attval) = lognode-id
• F2 (lognode-id) = phynode-id
• In either case, the data structure for f1 and f2 should be available when needed at each
node
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Join Processing:
• Three basic algorithms for intra-operator parallelism
• Parallel nested loop join: no special assumption
• Parallel associative join: one relation is declustered on join attribute and equi-join
• Parallel hash join: equi-join
• They also apply to other complex operators such as duplicate elimination, union,
intersection, etc. with minor adaptation
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Parallel Nested Loop Join:
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Parallel Associative Join:
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Parallel Hash Join:
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Parallel Query Optimization:
• The objective is to select the “best” parallel execution plan for a query using the following
components
• Search space
• Models alternative execution plans as operator trees
• Left-deep vs. Right-deep vs. Bushy trees
• Search strategy
• Dynamic programming for small search space
• Randomized for large search space
• Cost model (abstraction of execution system)
• Physical schema info. (partitioning, indexes, etc.)
• Statistics and cost functions
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Load Balancing:
• Problems arise for intra-operator parallelism with skewed data distributions
• attribute data skew (AVS)
• tuple placement skew (TPS)
• selectivity skew (SS)
• redistribution skew (RS)
• join product skew (JPS)
• Solutions
• sophisticated parallel algorithms that deal with skew
• dynamic processor allocation (at execution time)
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Load Balancing in a DB Cluster:
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
The Exadata Database Machine:
• New machine from Oracle with Sun
• Objectives
• OLTP, OLAP, mixed workloads
• Oracle Real Application Cluster
• 8+ servers bi-pro Xeon, 72 GB RAM
• Exadata storage server : intelligent cache
• 14+ cells, each with
• 2 processors, 24 Go RAM
• 385 GB of Flash memory (read is 10* faster than disk)
• 12+ SATA disks of 2 To or 12 SAS disks of 600 GB
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Exadata Architecture:
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH
Thank you
DEVANG PATEL INSTITUTE OF ADVANCE TECHNOLOGY AND RESEARCH