Department of Computer Science & Engineering
Academic Year: 2024-2025 EVEN
20 - 2 Mark Questions with Answers
Faculty Name : Dr. RINESH.S
Subject Code & Name : CCS341 DATA WAREHOUSING
Year / Semester : III/VI
Degree & Branch : B.E CSE A and B SEC
Unit I: Introduction to Data Warehouse
1. Define a data warehouse.
A data warehouse is a centralized repository that stores historical data for analysis and
reporting.
2. What are the key features of a data warehouse?
Subject-oriented, integrated, non-volatile, and time-variant.
3. What is the difference between operational databases and data warehouses?
Operational databases handle daily transactions, while data warehouses focus on
analytical queries.
4. Explain the term 'time-variant' in a data warehouse.
Data warehouses store data with timestamps, enabling trend analysis over time.
5. What are the three layers of data warehouse architecture?
Bottom Tier (Data Sources), Middle Tier (OLAP Server), Top Tier (Query Tools).
6. What is metadata in a data warehouse?
Metadata describes the structure, source, and meaning of data in a warehouse.
7. What are the main components of a data warehouse?
Data Sources, ETL Processes, Data Storage, Metadata, and Query Tools.
8. Differentiate data warehouse and data mart.
A data warehouse is enterprise-wide, while a data mart focuses on a specific
department or function.
9. What is an autonomous data warehouse?
It is a self-managing data warehouse that automates optimization, scaling, and
backups.
10. What is a modern data warehouse?
A cloud-based solution integrating structured and unstructured data for analytics.
11. Why is integration important in a data warehouse?
It ensures consistent and unified data from various sources.
Form No. AC 5a Rev.No. 00 Effective Date: 05/07/2023
12. What are the challenges in building a data warehouse?
Data integration, scalability, performance optimization, and user adoption.
13. What are OLTP and OLAP?
OLTP (Online Transaction Processing) is for day-to-day operations, while OLAP
(Online Analytical Processing) is for analysis.
14. What is the purpose of ETL in a data warehouse?
ETL extracts, transforms, and loads data from various sources into a data warehouse.
15. What is the role of a data warehouse administrator?
They manage the warehouse's performance, security, and maintenance.
16. What is a data mart?
A small, subject-specific data repository within a data warehouse.
17. Explain subject-oriented data.
Data organized around key subjects like customers, sales, or products.
18. What is non-volatile data in a warehouse?
Data remains unchanged once stored and is only appended with new updates.
19. What is the role of a query tool in a data warehouse?
It allows users to extract and visualize data for analysis.
20. Explain the concept of a data warehouse lifecycle.
It involves data extraction, transformation, loading, querying, and maintenance.
Unit II: ETL and OLAP Technology
1. Define ETL.
ETL stands for Extract, Transform, Load – the process of preparing data for analysis.
2. What is ELT, and how does it differ from ETL?
ELT performs transformations after loading data into the target system, unlike ETL.
3. What are the main steps of the ETL process?
Extraction, transformation, and loading.
4. Define OLAP.
OLAP (Online Analytical Processing) enables users to perform multidimensional
analysis on data.
5. What are OLAP operations?
Drill-down, roll-up, slice, dice, and pivot.
6. Differentiate OLAP and OLTP.
OLAP is for analysis, and OLTP is for transactional processing.
7. What is MOLAP?
MOLAP (Multidimensional OLAP) uses pre-computed data cubes for analysis.
8. What is ROLAP?
ROLAP (Relational OLAP) stores data in relational databases.
Form No. AC 5a Rev.No. 00 Effective Date: 05/07/2023
9. What is HOLAP?
HOLAP (Hybrid OLAP) combines MOLAP and ROLAP features.
10. What is an OLAP cube?
A multidimensional structure that allows data to be modeled and viewed in multiple
dimensions.
11. What are the characteristics of OLAP?
Fast analysis, multidimensional data, and intuitive visualization.
12. What is the difference between static and dynamic ETL?
Static ETL processes fixed data, while dynamic ETL adapts to changing data sources.
13. What is a staging area in ETL?
A temporary storage area for data transformation before loading.
14. What are the benefits of OLAP?
Improved decision-making, faster query performance, and better data visualization.
15. What is a delivery process in ETL?
The step where transformed data is loaded into the target system.
16. What is slicing in OLAP?
Selecting a specific dimension for focused analysis.
17. What is dicing in OLAP?
Creating a smaller subset of data by selecting specific attributes from dimensions.
18. What is the role of ETL tools?
Automating the process of extracting, transforming, and loading data.
19. What are the challenges of the ETL process?
Data quality, scalability, and handling large volumes of data.
20. What is the role of metadata in OLAP?
Describes the structure, dimensions, and measures of OLAP cubes.
Unit III: Metadata, Data Mart, and Partition Strategy
1. What is metadata in data warehousing?
Metadata is data that provides information about other data, describing structure,
source, and usage.
2. What are the types of metadata?
Technical, business, and operational metadata.
3. What is the role of metadata in a data warehouse?
It helps in data integration, consistency, and user understanding of data.
4. What is a metadata repository?
A centralized location to store and manage metadata.
5. What are the challenges in metadata management?
Integration, consistency, security, and maintenance.
Form No. AC 5a Rev.No. 00 Effective Date: 05/07/2023
6. What is a data mart?
A subset of a data warehouse tailored for a specific business line or department.
7. What are the advantages of data marts?
Faster query performance, cost-effectiveness, and focused analysis.
8. What is the cost of data marts?
The cost includes hardware, software, and ongoing maintenance.
9. What is partitioning in data warehousing?
Partitioning divides large datasets into smaller, manageable segments.
10. What are the types of partitioning?
Horizontal, vertical, and range-based partitioning.
11. What is vertical partitioning?
Splitting a table into smaller tables by columns.
12. What is horizontal partitioning?
Dividing a table into smaller tables by rows.
13. What is row splitting in partitioning?
Breaking a large dataset into smaller subsets based on specific rows.
14. What is normalization in data warehousing?
Organizing data to reduce redundancy and improve efficiency.
15. What are the benefits of partitioning?
Improved query performance, scalability, and easier data management.
16. What are the challenges in partitioning?
Complexity in design, maintenance, and increased storage requirements.
17. What is a metadata-driven ETL process?
An ETL process that uses metadata to control data transformations and workflows.
18. Why is metadata critical for ETL?
It ensures data consistency and traceability throughout the ETL process.
19. What is cost-effective data mart design?
Creating data marts with minimal resources while maximizing performance.
20. What is the role of metadata in data marts?
Metadata provides details on data lineage, structure, and usage.
Unit IV: Dimensional Modeling and Schema
1. What is dimensional modeling?
A technique to structure data for easy querying, using facts and dimensions.
2. What is a fact table?
A table containing numerical data used for analysis.
3. What is a dimension table?
A table containing descriptive attributes related to facts.
Form No. AC 5a Rev.No. 00 Effective Date: 05/07/2023
4. What is a star schema?
A schema with a central fact table surrounded by dimension tables.
5. What is a snowflake schema?
A schema with normalized dimension tables.
6. Differentiate star and snowflake schemas.
Star schema is denormalized, while the snowflake schema is normalized.
7. What is a fact constellation schema?
A schema with multiple fact tables sharing dimension tables.
8. What is a data cube?
A multidimensional representation of data for OLAP.
9. What are the operations on a data cube?
Roll-up, drill-down, slicing, dicing, and pivoting.
10. What is roll-up in a data cube?
Aggregating data along a dimension.
11. What is drill-down in a data cube?
Breaking data into finer granularity.
12. What is slicing in a data cube?
Selecting specific dimensions for focused analysis.
13. What is dicing in a data cube?
Creating a smaller cube by selecting attributes from dimensions.
14. What is a schema definition?
The logical structure of a database, defining tables, relationships, and attributes.
15. What is process architecture in data warehousing?
The structure for managing data flows and operations in a warehouse.
16. What is parallelism in a database?
Simultaneous execution of tasks to improve performance.
17. What are the types of database parallelism?
Inter-query, intra-query, and intra-operation parallelism.
18. What are dimensional tools in data warehousing?
Tools for designing and managing dimensions in a data warehouse.
19. What is schema normalization?
The process of organizing schema to reduce redundancy.
20. Why is dimensional modeling used in data warehouses?
To simplify data structure and enhance query performance.
Unit V: System and Process Managers
1. What is a system manager in a data warehouse?
A component managing configurations, events, scheduling, and backups.
Form No. AC 5a Rev.No. 00 Effective Date: 05/07/2023
2. What is a process manager?
A component managing ETL processes, queries, and performance tuning.
3. What is the role of a system configuration manager?
Configures and maintains hardware and software settings.
4. What is a system scheduling manager?
Schedules data extraction, transformation, and loading processes.
5. What is a system event manager?
Tracks and manages events occurring in the data warehouse system.
6. What is a load manager?
Handles the extraction and loading of data into the warehouse.
7. What is a query manager?
Optimizes and manages user queries for efficient execution.
8. What is tuning in a data warehouse?
Improving system performance by optimizing queries and configurations.
9. What is system backup recovery?
Ensuring data integrity through regular backups and restoration capabilities.
10. What is system testing in a data warehouse?
Verifying the warehouse's functionality, performance, and security.
11. What is warehouse testing?
Checking the accuracy of data transformations and load processes.
12. What are the challenges of backup and recovery in a data warehouse?
Managing large datasets and ensuring minimal downtime.
13. What is the role of a warehouse manager?
Oversees the overall operations of a data warehouse.
14. What is process tuning?
Adjusting processes to optimize data flows and reduce latency.
15. What is a data warehouse query log?
A log recording all user queries for auditing and optimization.
16. What are the key functions of a system event manager?
Monitoring, logging, and handling system events and errors.
17. What is the importance of query optimization?
Reduces execution time and improves resource utilization.
18. What is system scalability?
The ability to handle increased workloads without performance degradation.
19. What is the role of data warehouse security?
Protects sensitive data from unauthorized access and breaches.
20. What is a data warehouse disaster recovery plan?
A strategy to restore operations after major system failures.
Form No. AC 5a Rev.No. 00 Effective Date: 05/07/2023