[go: up one dir, main page]

0% found this document useful (0 votes)
7 views24 pages

DBMS Course Project

The document outlines various projects related to Database Management Systems (DBMS), including data migration, query optimization, relational synthesis, data integration, query synthesis from natural language, query simulation, and transaction management. Each project includes objectives, steps, challenges, and examples, emphasizing the importance of data integrity, performance improvement, and user accessibility. The document serves as a comprehensive guide for implementing and understanding key DBMS concepts and practices.

Uploaded by

tanvik652
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
7 views24 pages

DBMS Course Project

The document outlines various projects related to Database Management Systems (DBMS), including data migration, query optimization, relational synthesis, data integration, query synthesis from natural language, query simulation, and transaction management. Each project includes objectives, steps, challenges, and examples, emphasizing the importance of data integrity, performance improvement, and user accessibility. The document serves as a comprehensive guide for implementing and understanding key DBMS concepts and practices.

Uploaded by

tanvik652
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 24

DBMS Course Project

Dr.Ranjeetsingh Suryawanshi
Assistant Professor
Vishwakarma Institute of Technology, Pune
Data Migration 1
• Data Migration Projects in DBMS involve transferring data between
different storage systems, formats, or computer systems.
• Data migration is the process of transferring data from one system to
another while ensuring data integrity, quality, security, and minimal
downtime.
• Types of Data Migration Projects
• Database Migration
• Moving data from one database to another (e.g., Oracle → PostgreSQL).
• Storage Migration
• Moving data from one physical storage to another (e.g., HDD to SSD or cloud
storage).
• Application Migration
• Moving data between different application platforms (e.g., CRM A to CRM B).
• Cloud Migration
• Moving on-premises databases to cloud services (e.g., AWS RDS, Azure SQL).
Data Migration
• Title: Migrating Library Management Data from MS Access to MySQL

• Steps:

• Analyze the current Access schema.

• Design the target schema in MySQL.

• Export data to CSV using Access.

• Use LOAD DATA INFILE in MySQL to import data.

• Validate foreign key constraints, relationships.

• Run sample queries to verify integrity.


Data Migration
• Cross-Platform Old Database Migration in DBMS refer to migrating legacy data from older
database systems or platforms (e.g., FoxPro, MS Access, dBase, IBM DB2) to modern
RDBMS or cloud-based platforms (e.g., MySQL, PostgreSQL, SQL Server, Oracle, MongoDB,
or cloud DBMS like AWS RDS).
• To migrate and modernize an existing legacy database system by:
• Preserving data integrity, Upgrading to a scalable, secure platform, Ensuring compatibility
with modern applications.
Legacy DB Modern Target DB
FoxPro/dBase MySQL/PostgreSQL

• Typical Use MS Access


Oracle 9i
SQL Server
Oracle 19c / PostgreSQL
Cases IBM DB2 Azure SQL / Snowflake

File-based DB Cloud-native RDBMS


• Steps in a Cross-Platform Migration Data Migration
Project • 3. Tool Selection & Data Extraction
Tools for old DBs: DBF Viewer/Converter,
• 1. Legacy System Audit
Microsoft Access Export, ODBC drivers (for
• Identify old database engine connectivity)
(e.g., .dbf FoxPro files). Export to CSV, SQL dumps, or via ETL tools.
• Understand table structure, • 4. Data Transformation
relationships, constraints. • Cleanse and normalize data.
• Review any stored procedures, • Resolve encoding issues (e.g., ASCII to UTF-
views, or triggers. 8).
• Apply transformations: date formats, NULL
• 2. Schema Mapping handling, etc.
• Map legacy data types to target • 5. Data Loading into New DB
DB data types. • Use:
• Example: Memo in Access → • LOAD DATA INFILE (MySQL),
• COPY (PostgreSQL),
TEXT in MySQL • ETL tools (Talend, Pentaho)
• Identify unsupported features • Python scripts using Pandas + SQLAlchemy
and plan replacements. • 6. Data Validation
• Row count match
• Resolve differences in indexing • Data type check
and constraints. • Referential integrity verification
Data Migration
• Challenges • Challenges
• Lack of documentation on old • Data loss or corruption
systems.
• Downtime or performance
• Missing primary keys or
normalization issues. issues
• Binary/encoded formats or • Schema incompatibility
special character sets. • Data security and
• Procedural logic compliance
(triggers/functions)
incompatible. • Large data volumes or
• Performance tuning required complex relationships
after migration.
• Referential integrity
verification
Query Optimization 2
• A Query Optimization Project in DBMS focuses on analyzing and
improving SQL queries to enhance performance, reduce execution
time, and lower resource usage especially important for large datasets or
high-traffic applications.
• Objectives:
• Improve response time of complex SQL queries.
• Reduce CPU, memory, and disk I/O consumption.
• Compare performance before and after optimization using metrics like
execution time and cost.
• Project Scope:
• Implement unoptimized and optimized queries on a large sample
database (e.g., employees, e-commerce, university).
• Use tools like EXPLAIN PLAN, SQL Profiler, ANALYZE, or Query
Execution Plans.
• Optimize queries using indexing, rewriting, joins, subqueries, and
partitioning.
Query Optimization Query Unoptimized Optimized Index
Improvement
• Performance Analysis (Baseline) ID Time Time Used
Q1 12.5 sec 1.4 sec 88.8% Yes
• Measure: Q2 8.3 sec 2.6 sec 68.6% No
• Execution time
• Cost in EXPLAIN PLAN
• Number of rows scanned
• CPU and I/O stats (optional)
Technique Description
Indexes Create indexes on filter/search columns
Join optimization Use appropriate join types and ordering
Avoid SELECT* Use only required columns
Query restructuring Convert subqueries to joins
Partitioning Use horizontal partitioning for large tables
Materialized Views Pre-compute expensive joins or aggregates
Use of ANALYZE/STATS Ensure query planner has up-to-date data
Relational Synthesis
• What is Relational Synthesis? 3
• Process of generating normalized relational schemas from:
• ER diagrams
• Functional dependencies (FDs)
• Views or queries
• Legacy flat files or data sources
• Higher-level abstractions (e.g., user inputs, form data)
• Types of Relational Synthesis Projects
• 1. ER-to-Relational Schema Conversion
• Convert ER diagrams into relational tables using mapping rules.
• 2. View/Query-Based Synthesis
• Generate base relations from a set of user-defined views or SQL queries (used in data integration
and query rewriting).
• 3. Dependency-Preserving Decomposition
• Synthesize a normalized schema (e.g., 3NF or BCNF) from functional dependencies.
• 4. Data Integration & Source Mapping
• Create a unified relational schema from multiple heterogeneous data sources.
Relational Synthesis
• Example: "Automatic Relational Schema Generator from ER Diagrams"
• Input: Entity-Relationship diagrams (text-based or graphical)
• Output: SQL DDL statements (CREATE TABLE ...)
• Features:
• Mapping rules for entity, relationship, multivalued attributes
• Normalization check (up to 3NF)
• "Functional Dependency-Based Schema Normalization Tool"
• Input: Set of attributes and functional dependencies
• Output: Decomposed schema in 3NF or BCNF
• Features:
• Attribute closure calculator
• Candidate key detection
• Dependency-preserving decomposition
Relational Synthesis
• FD-Based Synthesis
• Input: R(A, B, C, D, E)
• F = { A → B, B → C, A → D, D → E }
• Find a decomposition into 3NF relations that preserves FDs and allows
lossless join.
• Output:
• R1(A, B, D)
• R2(B, C)
• R3(D, E)
• "Reverse Engineering Legacy Flat Files to Relational Schemas"
• Input: CSV or Excel files
• Output: Normalized relational schema
• Feature: Column data profiling, inference of relationships
Data integration
• It involves combining data from multiple sources into a unified and
4
consistent format, typically stored in a relational schema.
• Objectives
• Combine multiple data sources into a centralized RDBMS.
• Maintain data consistency and integrity.
• Enable cross-source querying and analytics.
• Clean and transform data to remove duplication or conflicts.
• 1. “University Centralized Database Integration”
• Goal: Integrate data from various departmental systems (admissions,
academics, library) into a unified student information system.
• Sources:
•MS Access for admissions
•Excel sheets for library records
•MySQL database for academics
• Target: Centralized MySQL/PostgreSQL relational schema
Data integration
• 2. “E-Commerce Multi-Store Data Integration”
4
• Goal: Integrate sales and product data from different platforms (e.g.,
Shopify, Amazon, Local POS) into a relational warehouse.
• Sources:
•API JSON exports (Shopify)
•CSV downloads (Amazon)
•MySQL POS database
• Target: Product, Sales, Customer schema in RDBMS
• Challenges
• Data quality issues (missing/duplicate records)
• Conflicting keys or naming conventions
• Real-time vs batch integration
• Schema mapping inconsistencies
Query Synthesis from Natural Language 5
• It is the process of automatically converting natural language (NL)
input (e.g., “Show all students in Computer Science”) into a formal
SQL query that retrieves data from a relational database.
• Objectives
• Takes natural language input from the user,
• Understands the intent and structure of the question,
• Maps it to the underlying relational database schema,
• Generates and executes an accurate SQL query.
• Use Cases
• Querying databases without knowing SQL
• AI-powered BI dashboards
• Voice-based database querying
• Chatbots with database backend
• Accessibility support for non-technical users
Query Synthesis from Natural Language 5
• 1. NL2SQL Engine for Student Database
• Goal: Convert questions like “List all students in Computer department” into
SQL queries.
• Schema Example:
• students(student_id, name, dept, email)
• courses(course_id, name, credits)
• enrollments(student_id, course_id, grade)
• Input (NL):
• "What are the names of students who have enrolled in a database course?"
• Output (SQL):
•SELECT s.name
•FROM students s
•JOIN enrollments e ON s.student_id = e.student_id
•JOIN courses c ON e.course_id = c.course_id
•WHERE c.name LIKE '%database%';
Query Synthesis from Natural Language 5
• 2. Voice Query Interface for Sales Database
• Use speech-to-text + NL to SQL
• Useful for sales managers who want to “ask the data”
• E.g., “Total sales last month by region”

• 3. Intelligent Chatbot for Employee Records


• Chatbot backend connects to an HR database
• Questions like:
• “How many employees joined after 2020?”
• “Who is the manager of Rahul Sharma?”
Query Synthesis from Natural Language 5
• Challenges
• Ambiguity in user input
• Synonyms and plural handling (“students” vs “student”)
• Complex joins across multiple tables
• Handling GROUP BY, ORDER BY, and aggregation (SUM, AVG)
• Schema generalization (dynamic DB schemas)
Query Simulation 6
• Query simulation refers to creating an environment where users can
simulate the execution of SQL queries without affecting actual
production data. It’s used for:
•Learning and training
•Query performance prediction
•Understanding how SQL logic works
•Visualizing query execution
•Practicing complex operations (joins, subqueries, groupings)
• Objective
• To build a Query Simulation System that:
• Accepts SQL queries from users
• Executes them on simulated/sample data
• Shows step-by-step results or execution plans
Query Simulation 6
• 1. Interactive SQL Query Simulator for Education
• Goal: Help students learn how SQL works by showing query
breakdown.
• Features:
• Enter SELECT, JOIN, GROUP BY queries
• Simulate step-by-step:
• Table scan
• Filter application
• Join results
• Grouping and aggregation
• Highlight intermediate outputs
AI in RDBMS
• Smart Index Recommender
7
• Goal: Analyze a set of queries and recommend indexes
• Input:
• Query logs
• Schema
• Output:
• Suggested indexes with reasoning
• Performance comparison
• Optimizer-Aware Query Rewriter (with AI Integration)
• Goal: Use a simple rule-based or AI approach (like GPT or BERT) to rewrite
slow queries based on schema knowledge
• Example:
• Input Query: SELECT * FROM orders WHERE customer_id IN (SELECT
customer_id FROM customers WHERE region = 'East')
• Optimized: Use JOIN with index instead of subquery
Transaction Management 8
• Atomicity: All operations in a transaction succeed or none do.
• Consistency: The database moves from one valid state to another.
• Isolation: Concurrent transactions do not interfere.
• Durability: Once committed, the transaction is permanent.
• Objectives
• Demonstrates transaction execution
• Implements rollback, commit, concurrent access
• Simulates real-life scenarios (e.g., banking, inventory)
• Ensures data integrity during failures or conflicts
• Examples:
• 1. Banking System with Transaction Control
• 2. Hotel Booking System with Rollback Support-concurrent room
bookings
• 3. Transaction Deadlock Detection and Recovery
Relation Normalization 9
• Normalization is the process of organizing data in a database to
reduce redundancy and improve data integrity by dividing large
tables into smaller, well-structured tables and defining relationships
between them.
• The main normal forms are:
• 1NF: Eliminate repeating groups and ensure atomic values
• 2NF: Remove partial dependencies (only for composite keys)
• 3NF: Remove transitive dependencies
• BCNF, 4NF, 5NF: Advanced refinements
• Examples:
• 1. College Database Normalization
• 2. E-Commerce Order Database Normalization
• 3. Hospital Patient Record Normalization
Mapping of Relational Algebra to SQL project 10
• This project explores how Relational Algebra (RA) operations are
translated into equivalent SQL queries. It helps students and developers:
• Understand the theoretical foundation of SQL
• Map RA expressions (π, σ, ×, ⋈, ∪, −) to real-world SQL
• Build an interactive or demonstrative system for learning/teaching
• Objectives
• Demonstrate how each Relational Algebra operation maps to SQL.
• Build a system that:
• Accepts relational algebra queries
• Converts them to SQL
• Executes them on sample databases
• Relational Algebra to SQL Translator Tool
• Input RA:
• π name (σ age > 18 (Students))
• Output SQL:
• SELECT name FROM Students WHERE age > 18;
Embedded SQL 11
• Embedded SQL allows you to integrate SQL statements directly into a host programming language
like C, C. It enables you to:
• Execute SQL inside programs
• Combine database operations with procedural logic
• Perform advanced query-based data manipulation within applications
• Example : Student Record Management System
• Language: C/C++ with Embedded SQL Database: PostgreSQL / Oracle / MySQL
• Features:
• Add / Update / Delete student records
• Search by student ID or name
• Store attendance and marks
• Show GPA, average scores, etc.
• Learning Outcome:
• Embedded SELECT, INSERT, UPDATE
• Cursor-based loops in embedded SQL
• Dynamic SQL execution

You might also like