Unit IV: Database Management Systems
1. Introduction to DBMS
- A Database Management System (DBMS) is software for storing, retrieving, and managing data in
databases. - File System Organizations: Sequential, Pointer, Indexed, Direct. - Purpose: Manage
large amounts of data, reduce redundancy, maintain data integrity. - Data Models: Hierarchical,
Network, Relational, Object-oriented. - Components: DB Engine, DB Schema, Query Processor,
Storage Manager. - Relational Algebra: Set of operations to manipulate relations (Select, Project,
Join, Union, etc.).
2. Logical Database Design
- Relational DBMS follows Codd’s 12 rules. - ER Model: Represents entities and relationships. -
Extended ER: Adds generalization, specialization, aggregation. - Functional Dependency (FD):
Relation between attributes. - Trivial FD: RHS is a subset of LHS. - Transitive FD: A → B and B →
C implies A → C. - Partial FD: Attribute depends only on part of the primary key. - Multivalued FD:
Attribute depends on whole key, independently of other attributes. - Normal Forms: * 1NF – Atomic
values * 2NF – No partial dependency * 3NF – No transitive dependency * BCNF – Each
determinant is a candidate key * 4NF – No multivalued dependency * 5NF – No join dependency -
Domain Key Normal Form: Every constraint is a logical consequence of domain and key
constraints. - Denormalization: Process of combining tables to improve performance.
3. SQL & Query Optimization
- SQL Standards: ANSI SQL. - Data Types: INT, VARCHAR, DATE, FLOAT, etc. - Database
Objects: Tables, Views, Indexes, Sequences. - Commands: * DDL – CREATE, ALTER, DROP *
DML – SELECT, INSERT, UPDATE, DELETE * DCL – GRANT, REVOKE * TCL – COMMIT,
ROLLBACK, SAVEPOINT - Embedded SQL: SQL inside programming languages. - Static SQL:
Queries known at compile time. - Dynamic SQL: Queries constructed at runtime. - Query
Optimization: Improves performance of queries. - Heuristics: Rule-based (e.g., push selection
before join). - Cost-Based: Chooses plan with lowest estimated cost.
4. Transaction Processing and Concurrency Control
- Transaction: A unit of work performed on DB. - Properties: ACID (Atomicity, Consistency,
Isolation, Durability). - Serializability: Ensures correctness in concurrent transactions. - Concurrency
Control: Maintains isolation. - Locking: * Shared and Exclusive Locks * Two-Phase Locking (2PL):
Growing and shrinking phase - Two-Phase Commit Protocol: Ensures atomic distributed
transactions. - Deadlock: Occurs when two transactions wait for each other’s lock.
5. Trends in Database Technology
- RAID: Uses multiple disks for redundancy and speed (RAID 0, 1, 5). - File Organization: * Heap:
Unordered * Sequential: Sorted * Hash: Uses hash function * Clustered: Related data together -
Record Structures: Fixed-length vs Variable-length. - Indexing: * Primary/Secondary, Dense/Sparse
- Ordered Indices: Useful for range queries. - B Tree: Keys in both internal and leaf nodes. - B+
Tree: Keys and data only in leaf nodes, preferred in DBMS. - Hashing: * Static: Fixed buckets *
Dynamic: Directory grows with data - OODBMS: Supports objects, encapsulation, inheritance. -
ORDBMS: Hybrid of RDBMS and OODBMS, supports UDTs and complex types.