Database Management – DRAT-S Exam
Notes
1 ER Model (Entity-Relationship Model)
1️⃣
The ER Model is a high-level data model used in the conceptual design phase of databases.
Key Concepts:
- Entity: A real-world object (e.g., Student, Course)
- Attribute: Property of an entity (e.g., Student_Name)
- Entity Set: Collection of similar entities (e.g., all Students)
- Relationship: Association between entities (e.g., Enrolls)
- Cardinality: 1:1, 1:N, M:N relationships
Example Diagram:
[STUDENT]──(ENROLLS)──[COURSE]
| |
[Student_ID] [Course_ID]
2️⃣Relational Model
Represents data using tables (relations).
Key Concepts:
- Relation = Table
- Tuple = Row
- Attribute = Column
- Primary Key: Uniquely identifies tuples
- Foreign Key: References primary key of another relation
Example:
STUDENT(Student_ID PK, Name, Dept)
COURSE(Course_ID PK, Title)
ENROLLS(Student_ID FK, Course_ID FK)
3️⃣SQL (Structured Query Language)
DDL (Data Definition Language):
CREATE TABLE Student (
Student_ID INT PRIMARY KEY,
Name VARCHAR(50)
);
DML (Data Manipulation Language):
INSERT INTO Student VALUES (101, 'John');
SELECT * FROM Student;
UPDATE Student SET Name='Jane' WHERE Student_ID=101;
JOIN Types:
- INNER JOIN
- LEFT/RIGHT JOIN
- FULL OUTER JOIN
4️⃣Integrity Constraints
Rules to ensure accuracy and consistency of data.
Types:
- Primary Key Constraint
- Foreign Key Constraint
- NOT NULL
- UNIQUE
- CHECK (condition)
Example:
CREATE TABLE Orders (
Order_ID INT PRIMARY KEY,
Quantity INT CHECK (Quantity > 0)
);
5️⃣Indexing
Improves the speed of data retrieval operations.
Types:
- Primary Index: Based on primary key
- Secondary Index: On non-key fields
- Clustered Index: Sorts the table itself
- Non-clustered Index: Creates a separate structure
Example:
CREATE INDEX idx_student_name ON Student(Name);
6️⃣Data Transformation
Used in data preprocessing for machine learning and analytics.
Includes:
- Normalization
- Sampling
- Compression
📊 Normalization
Reduces redundancy and dependency by organizing fields and table relations.
Forms:
- 1NF: Atomic values
- 2NF: No partial dependency
- 3NF: No transitive dependency
Example:
Split table with repeating groups into separate related tables.
🧪 Sampling
Selects a subset of data for faster processing.
Types:
- Random Sampling
- Stratified Sampling
- Systematic Sampling
Used in big datasets where full analysis is resource-heavy.
🔽 Compression
Reduces storage space and I/O time.
Techniques:
- Run-Length Encoding (RLE)
- Dictionary Encoding
- Delta Encoding
Trade-off: Space vs computation time.