Unit 1 - MCQ

The document consists of multiple-choice questions related to data warehousing concepts, including characteristics, architecture, ETL processes, and data modeling. Key topics covered include the purpose of data warehouses, the role of different layers in architecture, and definitions of terms like ETL and metadata. Correct answers are provided for each question, highlighting essential knowledge for understanding data warehousing.

Uploaded by

devarajdony2007

Multiple Choice Questions

Unit 1

Which of the following is NOT a characteristic of a data warehouse?
A. Integrated
B. Time-variant
C. Volatile
D. Subject-oriented
 ✅ Answer: C. Volatile
What is the main purpose of a data warehouse?
A. Perform real-time transactions
B. Store current data for operations
C. Support decision-making through historical data analysis
D. Replace operational databases
 ✅ Answer: C. Support decision-making through historical data analysis
The term “subject-oriented” in data warehousing means:
A. Data is organized around departments
B. Data is structured by business subjects like sales or finance
C. Data is categorized by system
D. Data is filtered in real-time
 ✅ Answer: B. Data is structured by business subjects like sales or finance
In a data warehouse, “time-variant” means:
A. Data changes constantly
B. Only recent data is stored
C. Historical data is stored for trend analysis
D. Data is deleted after a certain time
 ✅ Answer: C. Historical data is stored for trend analysis
What is the role of the bottom tier in a 3-tier data warehouse architecture?
A. Reporting and dashboards
B. Storing integrated data
C. Data extraction and staging
D. Query optimization
 ✅ Answer: C. Data extraction and staging
Which layer of a data warehouse is responsible for storing the actual historical
data?
A. Top tier
B. Staging area
C. Middle tier
D. Metadata repository
 ✅ Answer: C. Middle tier
Which tier is responsible for providing user access, reporting, and analysis
tools?
A. Top tier
B. Middle tier
C. Bottom tier
D. ETL layer
 ✅ Answer: A. Top tier

What is the purpose of a staging area in a data warehouse?


A. To store final reports
B. Temporary storage before transformation and loading
C. End-user interface
D. Hosting OLAP cubes
 ✅ Answer: B. Temporary storage before transformation and loading
Which of the following best describes “ETL”?
A. Extract, Transfer, and List
B. Extract, Transform, and Load
C. Extract, Transmit, and Load
D. Extract, Translate, and Load
 ✅ Answer: B. Extract, Transform, and Load
Why is data transformation important in ETL?
A. To compress data
B. To replicate source tables
C. To clean and standardize data before loading
D. To delete irrelevant data
 ✅ Answer: C. To clean and standardize data before loading
What is metadata in a data warehouse?
A. Actual business data
B. Temporary data for staging
C. Data about data (e.g., definitions, sources, structure)
D. Analytical results
 ✅ Answer: C. Data about data (e.g., definitions, sources, structure)
Which of the following is a benefit of using a data warehouse?
A. Handles high-speed transactional updates
B. Supports analytical and business decision-making
C. Requires minimal storage
D. Offers direct data entry for users
 ✅ Answer: B. Supports analytical and business decision-making
What does “non-volatile” mean in the context of a data warehouse?
A. Data changes continuously
B. Data is temporary and frequently deleted
C. Data is stable and not changed after entry
D. Data refreshes every second
 ✅ Answer: C. Data is stable and not changed after entry
Which of the following best represents the typical flow in data warehouse
architecture?
A. BI Tools → Data Sources → ETL → Data Warehouse
B. ETL → Data Warehouse → Data Sources
C. Data Sources → ETL → Data Warehouse → BI Tools
D. Data Sources → Data Mart → ETL → BI Tools
 ✅ Answer: C. Data Sources → ETL → Data Warehouse → BI Tools
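The flow in the answer above (Data Sources → ETL → Data Warehouse → BI Tools) can be sketched in a few lines of Python. This is only an illustrative sketch: the `source_rows` records, the `sales` table, and the use of an in-memory SQLite database as the "warehouse" are all assumptions made for the example, not something the question bank prescribes.

```python
import sqlite3

# Hypothetical source rows, standing in for an operational system export.
source_rows = [
    {"order_id": 1, "region": " north ", "amount": "120.50"},
    {"order_id": 2, "region": "South", "amount": "80.00"},
    {"order_id": 3, "region": "NORTH", "amount": "19.50"},
]

def extract():
    """Extract: read raw records from the source."""
    return list(source_rows)

def transform(rows):
    """Transform: trim and standardize region names, convert amounts to numbers."""
    return [
        (r["order_id"], r["region"].strip().title(), float(r["amount"]))
        for r in rows
    ]

def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

# In-memory SQLite database used here as a stand-in for the warehouse.
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))

# A BI-style aggregate query against the loaded data.
report = warehouse.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(report)
```

The transform step is what makes " north " and "NORTH" land in the same group; without it the report would show three regions instead of two.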

What is the function of data marts?


A. Store raw source data
B. Handle transactional processing
C. Focus on specific departments or subject areas
D. Manage metadata
 ✅ Answer: C. Focus on specific departments or subject areas
What is the main goal of dimensional modeling in a data warehouse?
A. Reduce data redundancy
B. Improve transaction speed
C. Make data easy to understand and query
D. Normalize operational data
 ✅ Answer: C. Make data easy to understand and query
A fact table primarily contains:
A. Text descriptions
B. Business process measurements
C. Hierarchical relationships
D. OLTP transactions
 ✅ Answer: B. Business process measurements
Dimension tables provide:
A. Calculated values
B. Detailed, descriptive context for facts
C. Transactional data
D. Numeric-only data
 ✅ Answer: B. Detailed, descriptive context for facts
Which of the following best describes a star schema?
A. A normalized model with many joins
B. A model with hierarchical relationships
C. A central fact table with denormalized dimension tables
D. Fact tables only
 ✅ Answer: C. A central fact table with denormalized dimension tables
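A star schema like the one described in this answer might look as follows. The table and column names (`fact_sales`, `dim_product`, `dim_date`) and the sample rows are hypothetical, and SQLite via Python's `sqlite3` module is used only because it needs no setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized dimension tables: descriptive attributes live in one table each.
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    year      INTEGER
);
-- Central fact table: foreign keys to the dimensions plus numeric measures.
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product(product_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    quantity     INTEGER,
    sales_amount REAL
);
INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Lamp', 'Home');
INSERT INTO dim_date VALUES (20240101, '2024-01-01', 2024),
                            (20240102, '2024-01-02', 2024);
INSERT INTO fact_sales VALUES (1, 20240101, 10, 15.0),
                              (2, 20240101, 1, 30.0),
                              (1, 20240102, 4, 6.0);
""")

# One join from the fact table to a dimension answers "sales by category".
rows = conn.execute("""
    SELECT p.category, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(rows)
```

In a snowflake variant, `category` would move out of `dim_product` into its own table, which saves some storage but adds a join to the same query.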
What makes the snowflake schema different from the star schema?
A. Fact table is removed
B. Dimension tables are normalized
C. More foreign keys in the fact table
D. Better suited for OLTP systems
 ✅ Answer: B. Dimension tables are normalized
A surrogate key is:
A. The natural key from a source system
B. A calculated measure in a fact table
C. A system-generated unique identifier
D. A key used for internal database optimization
 ✅ Answer: C. A system-generated unique identifier
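As a rough illustration of a surrogate key, the sketch below hands out sequential integers for natural keys coming from a source system. The class name `SurrogateKeyGenerator` and the customer codes are made up for the example; real warehouses usually rely on database sequences or identity columns instead.

```python
from itertools import count

class SurrogateKeyGenerator:
    """Assigns a system-generated integer key to each distinct natural key."""

    def __init__(self):
        self._next = count(1)          # sequence of new surrogate keys: 1, 2, 3, ...
        self._by_natural_key = {}      # natural key -> surrogate key

    def key_for(self, natural_key):
        # Reuse the existing surrogate key if this natural key was seen before.
        if natural_key not in self._by_natural_key:
            self._by_natural_key[natural_key] = next(self._next)
        return self._by_natural_key[natural_key]

keys = SurrogateKeyGenerator()
# Hypothetical natural keys from a source system; "CUST-A17" appears twice.
customer_keys = [keys.key_for(k) for k in ["CUST-A17", "CUST-B02", "CUST-A17"]]
print(customer_keys)
```

The generated integers carry no business meaning, which is the point: the source system can rename or recycle its own codes without breaking joins in the warehouse.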
In dimensional modeling, what is a degenerate dimension?
A. A measure with no dimension
B. A dimension key stored in the fact table with no separate dimension table
C. A primary key that is always null
D. A column not used in queries
 ✅ Answer: B. A dimension key stored in the fact table with no separate dimension table
Which schema is ideal when multiple fact tables share common dimensions?
A. Star schema
B. Snowflake schema
C. Constellation schema (Galaxy schema)
D. Flat schema
 ✅ Answer: C. Constellation schema (Galaxy schema)
Which of the following is NOT a valid measure type in a fact table?
A. Additive
B. Semi-additive
C. Non-additive
D. Descriptive
 ✅ Answer: D. Descriptive
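The difference between additive and semi-additive measures is easier to see with numbers. In the sketch below, the account balances are invented sample data. A balance is a typical semi-additive measure: it can be summed across accounts for one date, but summing it across dates gives a figure with no business meaning.

```python
# Hypothetical month-end balances: (snapshot_date, account, balance).
balances = [
    ("2024-01-31", "acct-1", 100.0),
    ("2024-01-31", "acct-2", 50.0),
    ("2024-02-29", "acct-1", 120.0),
    ("2024-02-29", "acct-2", 40.0),
]

# Valid: sum across the account dimension for a single date.
total_feb = sum(b for d, _, b in balances if d == "2024-02-29")

# Misleading: summing across the time dimension double-counts the money.
wrong_total = sum(b for _, _, b in balances)

# Across time, a semi-additive measure is usually reported as the
# latest value (or an average) rather than a sum.
latest_date = max(d for d, _, _ in balances)
closing_total = sum(b for d, _, b in balances if d == latest_date)

print(total_feb, wrong_total, closing_total)
```

A fully additive measure such as sales amount has no such restriction, while a non-additive measure such as a percentage or ratio cannot be meaningfully summed along any dimension.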
Which of the following is an example of a fact?
A. Product category
B. Customer region
C. Total sales amount
D. Store address
 ✅ Answer: C. Total sales amount
Which of the following would most likely be found in a dimension table?
A. Revenue
B. Quantity sold
C. Order date
D. Customer name
 ✅ Answer: D. Customer name
What type of schema uses more storage but is simpler to query?
A. Snowflake schema
B. Star schema
C. Factless schema
D. Third-normal form schema
 ✅ Answer: B. Star schema
Fact tables are typically:
A. Small and slow-growing
B. Large and fast-growing
C. Hierarchical in structure
D. Flat and rarely accessed
 ✅ Answer: B. Large and fast-growing
Which dimension is likely shared across many fact tables?
A. Product
B. Date
C. Region
D. Store
 ✅ Answer: B. Date
What kind of fact table stores the result of measurements (e.g., sales, costs)?
A. Transactional fact table
B. Periodic snapshot fact table
C. Accumulating snapshot fact table
D. Aggregated dimension table
 ✅ Answer: A. Transactional fact table
What does ETL stand for?
A. Execute, Test, Load
B. Extract, Transfer, Load
C. Extract, Transform, Load
D. Execute, Transform, Load
Answer: C) Extract, Transform, Load
The process of retrieving data from various source systems is known as:
A. Transformation
B. Loading
C. Extraction
D. Data Cleansing
Answer: C) Extraction
Which phase of ETL involves cleaning, standardizing, and converting data into a
suitable format for the target system?
A. Extraction
B. Loading
C. Transformation
D. Data Validation
Answer: C) Transformation
Writing data into a target database or data warehouse is called:
A. Extracting
B. Transforming
C. Loading
D. Data Auditing
Answer: C) Loading
A temporary storage area used during the ETL process to hold extracted data
before transformation is called a:
A. Data Mart
B. Data Warehouse
C. Staging Area
D. OLAP Cube
Answer: C) Staging Area

Which of the following is NOT a common data transformation technique?


A. Data Cleansing
B. Data Type Conversion
C. Data Deletion (as a primary transformation goal)
D. Data Aggregation
Answer: C) Data Deletion (as a primary transformation goal)

Which of the following is a common challenge faced in ETL processes?


A. Data Quality
B. Data Volume
C. Data Complexity
D. All of the above
Answer: D) All of the above

The process of combining data from multiple source systems into a unified
format within a data warehouse is known as:
A. Data Cleaning
B. Data Integration
C. Data Transformation
D. Data Reduction
Answer: B) Data Integration

Which loading strategy is more efficient for larger datasets and handles updates
more effectively?
A. Full ETL
B. Incremental ETL
C. Batch ETL
D. Real-time ETL
Answer: B) Incremental ETL

Data mapping, which defines the relationships between source and target data
fields, occurs during which ETL phase?
A. Extraction
B. Transformation
C. Loading
D. All of the above
Answer: B) Transformation

In ETL, the “E” stands for:

A. Encode
B. Execute
C. Extract
D. Enrich
 ✅ Answer: C. Extract

Which of the following is a transformation task in ETL?
A. Retrieving raw data
B. Loading data into a report
C. Converting currency formats
D. Saving files to disk
 ✅ Answer: C. Converting currency formats

What is the final step in the ETL process?


A. Extraction
B. Transformation
C. Validation
D. Loading
 ✅ Answer: D. Loading

Which of the following is an example of an ETL tool?


A. Microsoft Word
B. Apache Hadoop
C. Informatica
D. Tableau
 ✅ Answer: C. Informatica

What is the goal of the transformation step in ETL?


A. Retrieve data from external sources
B. Improve query performance
C. Convert and clean data before loading
D. Archive old data
 ✅ Answer: C. Convert and clean data before loading

Which loading method is faster but may cause data loss if interrupted?
A. Incremental load
B. Full load
C. Parallel load
D. Lazy load
 ✅ Answer: B. Full load

Which type of ETL load is best when only new or changed records are
transferred?
A. Static load
B. Full load
C. Incremental load
D. Periodic load
 ✅ Answer: C. Incremental load
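One common way to implement an incremental load is a "high-water mark" on a last-updated timestamp. The sketch below assumes each source row carries an `updated_at` field; that field, the sample rows, and the dictionary used as the target are all hypothetical.

```python
from datetime import datetime

# Hypothetical source table with a last-modified timestamp per row.
source = [
    {"id": 1, "name": "Asha", "updated_at": datetime(2024, 1, 1, 9, 0)},
    {"id": 2, "name": "Ben", "updated_at": datetime(2024, 1, 2, 9, 0)},
]
target = {}  # stand-in for the warehouse table, keyed by id

def incremental_load(source_rows, target, last_watermark):
    """Copy only rows changed since last_watermark; return the new watermark."""
    changed = [r for r in source_rows if r["updated_at"] > last_watermark]
    for r in changed:
        target[r["id"]] = dict(r)  # insert new rows, overwrite updated ones
    new_watermark = max(
        (r["updated_at"] for r in changed), default=last_watermark
    )
    return new_watermark, len(changed)

# First run: nothing loaded yet, so every row counts as new.
watermark, first_count = incremental_load(source, target, datetime.min)

# Later, one row is updated and one row is added in the source.
source[0] = {"id": 1, "name": "Asha K.", "updated_at": datetime(2024, 1, 3, 9, 0)}
source.append({"id": 3, "name": "Chen", "updated_at": datetime(2024, 1, 3, 10, 0)})

# Second run: only the two changed rows are transferred; row 2 is skipped.
watermark, second_count = incremental_load(source, target, watermark)
print(first_count, second_count, len(target))
```

A full load would instead clear the target and copy all rows on every run, which is simpler but moves far more data as the source grows.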

Why is data cleansing important during the ETL transformation stage?


A. To reduce storage costs
B. To hide sensitive data
C. To improve the quality and reliability of data
D. To convert text into images

 ✅ Answer: C. To improve the quality and reliability of data

What is a staging area used for in ETL?


A. Data visualization
B. Temporary data processing and storage
C. End-user reporting
D. API integration

 ✅ Answer: B. Temporary data processing and storage

Which of the following is NOT a typical ETL transformation?


A. Filtering rows
B. Changing column names
C. Loading data into Excel
D. Merging data from multiple sources
 ✅ Answer: C. Loading data into Excel

Which of the following is a cloud-based data warehouse solution?


A. MySQL
B. Oracle 11g
C. Amazon Redshift
D. MongoDB
 ✅ Answer: C. Amazon Redshift

What is the main function of ETL tools in data warehousing?


A. Run ad-hoc queries
B. Format dashboards
C. Extract, transform, and load data
D. Encrypt sensitive data
 ✅ Answer: C. Extract, transform, and load data

Which of the following is a popular business intelligence (BI) tool?


A. SSIS
B. Erwin
C. Tableau
D. AWS Glue
 ✅ Answer: C. Tableau

Which tool is typically used for data modeling?


A. Power BI
B. Erwin
C. Redshift
D. QlikView
 ✅ Answer: B. Erwin

Microsoft’s ETL tool is called:
A. SQL Server Reporting Services
B. Azure Synapse
C. SSIS
D. SSRS
 ✅ Answer: C. SSIS

What does metadata help with in a data warehouse?


A. Running SQL queries faster
B. Describing data structure, source, and meaning
C. Reducing disk space
D. Encrypting data in transit
 ✅ Answer: B. Describing data structure, source, and meaning

Which of the following best describes Snowflake (the platform)?


A. An OLTP database
B. A cloud-native data warehouse
C. An open-source visualization tool
D. A metadata management tool
 ✅ Answer: B. A cloud-native data warehouse

What is the main advantage of using cloud data warehouse platforms?


A. Requires physical server setup
B. Fixed scalability and pricing
C. Elastic scaling and managed infrastructure
D. Only supports structured data
 ✅ Answer: C. Elastic scaling and managed infrastructure
Power BI is used primarily for:
A. Data backup
B. Business analytics and reporting
C. Real-time data extraction
D. Data encryption
 ✅ Answer: B. Business analytics and reporting
Which tool is used for managing data lineage and governance?
A. Tableau
B. SSIS
C. Collibra
D. Redshift
 ✅ Answer: C. Collibra
What is a Data Warehouse?
a) A place where files are stored
b) A system used for daily transactions
c) A subject-oriented, integrated, time-variant, and non-volatile data collection
d) A backup of operational systems
✔️ Answer: c
The main purpose of a data warehouse is to:
a) Store log files
b) Perform online transactions
c) Support decision-making
d) Store email records
✔️ Answer: c
Which of the following is NOT a characteristic of a data warehouse?
a) Subject-oriented
b) Time-variant
c) Real-time update
d) Non-volatile
✔️ Answer: c
Data Warehouse vs Database (OLTP vs OLAP)
Operational systems are also called:
a) OLAP
b) OLTP
c) DWH
d) Data mining
✔️ Answer: b
Which one is used for fast query and analysis?
a) OLTP
b) OLAP
c) ERP
d) CRM
✔️ Answer: b
In a Data Warehouse, data is mostly:
a) Updated frequently
b) Used for transactions
c) Historical
d) Deleted regularly
✔️ Answer: c
Which of the following is NOT a component of data warehouse architecture?
a) Data Source
b) ETL Tools
c) Data Mart
d) Compiler
✔️ Answer: d
What does ETL stand for?
a) Extract, Transform, Load
b) Extract, Transfer, Load
c) Encode, Transform, Load
d) Export, Translate, Load
✔️ Answer: a
The data in a Data Warehouse is stored in:
a) Operational databases
b) Spreadsheets
c) Multidimensional databases
d) Transactional tables
✔️ Answer: c
How many layers are usually present in a Data Warehouse architecture?
a) 2
b) 3
c) 4
d) 5
✔️ Answer: b
Which of the following is the first layer in Data Warehouse architecture?
a) Data access layer
b) Data source layer
c) ETL layer
d) Data presentation layer
✔️ Answer: b
In which layer of DWH architecture is data cleaned and transformed?
a) Source layer
b) ETL layer
c) Data access layer
d) Application layer
✔️ Answer: b
A Data Mart is:
a) A complete data warehouse
b) A subset of a data warehouse
c) A transactional system
d) A data modeling tool
✔️ Answer: b
Which of these is a type of data mart?
a) Local Mart
b) Dependent Data Mart
c) Logical Mart
d) Temporary Mart
✔️ Answer: b
What is metadata in data warehousing?
a) Actual user data
b) Graphical reports
c) Data about data
d) Error logs
✔️ Answer: c
Which of the following is an example of a data source for a data warehouse?
a) CRM System
b) ERP System
c) Flat files
d) All of the above
✔️ Answer: d
Which is an advantage of a data warehouse?
a) Faster daily transactions
b) Better decision support
c) Real-time processing
d) Data deletion
✔️ Answer: b
Data Warehousing helps in:
a) Sending emails
b) Quick backups
c) Business Intelligence
d) Managing passwords
✔️ Answer: c
The data stored in a warehouse is usually:
a) Current only
b) Deleted daily
c) Real-time
d) Historical and analytical
✔️ Answer: d
Which system uses current operational data?
a) Data warehouse
b) OLAP
c) OLTP
d) Data mart
✔️ Answer: c
What is dimensional modeling used for in data warehousing?
a) Transaction processing
b) Data encryption
c) Fast data retrieval for analysis
d) Data deletion
✔️ Answer: c
A dimensional model consists of:
a) Only tables
b) One fact table and many dimension tables
c) Only facts
d) Only dimensions
✔️ Answer: b
Which of the following is a key advantage of dimensional modeling?
a) Normalization
b) Slow performance
c) Complex design
d) Simplicity and fast queries
✔️ Answer: d
What does a fact table contain?
a) Descriptions only
b) Textual data
c) Measurable numeric data (facts)
d) Metadata only
✔️ Answer: c
Which of the following is usually found in a fact table?
a) Product Name
b) Sales Amount
c) Customer City
d) Country Name
✔️ Answer: b
Fact table has:
a) Only primary keys
b) Only descriptive data
c) Foreign keys and measures
d) Only metadata
✔️ Answer: c
Fact tables are usually:
a) Small
b) Medium-sized
c) Large
d) Temporary
✔️ Answer: c
Dimension tables contain:
a) Numeric data
b) Descriptive data (attributes)
c) Foreign keys only
d) Computed measures
✔️ Answer: b
Which table answers 'who', 'what', 'when', 'where' questions?
a) Fact table
b) Summary table
c) Dimension table
d) Lookup table
✔️ Answer: c
An example of a dimension table is:
a) Sales Table
b) Date Table
c) Invoice Table
d) Fact Table
✔️ Answer: b
Which of the following is NOT a type of fact table?
a) Transaction fact
b) Snapshot fact
c) Accumulating fact
d) Virtual fact
✔️ Answer: d
Snapshot fact tables record:
a) Daily backups
b) Regular measurements over time
c) Only first transaction
d) Text data
✔️ Answer: b
Which fact table type is used to store results of ongoing processes?
a) Snapshot
b) Accumulating
c) Transaction
d) Dimension
✔️ Answer: b
A star schema contains:
a) Only normalized tables
b) One fact table and directly connected dimension tables
c) Multiple fact tables
d) No relationships
✔️ Answer: b
In snowflake schema, dimension tables are:
a) Denormalized
b) Normalized
c) Removed
d) Deleted
✔️ Answer: b
Which schema is easier to understand and query?
a) Star Schema
b) Snowflake Schema
c) Galaxy Schema
d) ER Diagram
✔️ Answer: a
What connects fact and dimension tables?
a) Primary keys
b) Alternate keys
c) Foreign keys
d) Clustered keys
✔️ Answer: c
A surrogate key is:
a) A natural key
b) A meaningful key
c) An artificial key with no business meaning
d) A text-based key
✔️ Answer: c
ETL stands for:
a) Extract, Translate, Load
b) Export, Transform, Load
c) Extract, Transform, Load
d) Encode, Transfer, Load
✔️ Answer: c
ETL is mainly used in:
a) OLTP systems
b) Programming IDEs
c) Data Warehousing
d) E-commerce websites
✔️ Answer: c
The purpose of ETL is to:
a) Send emails
b) Move and prepare data for analysis
c) Run SQL queries
d) Format hard drives
✔️ Answer: b
Data extraction means:
a) Storing emails
b) Reading data from various sources
c) Deleting data
d) Encrypting data
✔️ Answer: b
Common data sources for extraction include:
a) Relational databases
b) Excel files
c) Web APIs
d) All of the above
✔️ Answer: d

Which of the following is NOT a data extraction source?


a) CRM system
b) Email inbox
c) ERP system
d) Flat files
✔️ Answer: b
In ETL, transformation means:
a) Moving data
b) Changing data format, structure, or values
c) Deleting unwanted data
d) Hiding the data
✔️ Answer: b

Which of the following is an example of data transformation?


a) Changing date format from MM-DD-YYYY to YYYY-MM-DD
b) Creating backups
c) Removing folders
d) Printing reports
✔️ Answer: a
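The date-format transformation named in this answer can be written with Python's standard `datetime` module. The function name `to_iso_date` is just a label chosen for this sketch.

```python
from datetime import datetime

def to_iso_date(value: str) -> str:
    """Convert a date string from MM-DD-YYYY to YYYY-MM-DD."""
    # strptime parses the source format and rejects invalid dates;
    # strftime writes the parsed date back out in the target format.
    return datetime.strptime(value, "%m-%d-%Y").strftime("%Y-%m-%d")

print(to_iso_date("03-15-2024"))
```

Parsing into a real date first, rather than rearranging substrings, means that a malformed value such as "13-40-2024" raises a `ValueError` instead of being loaded silently.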

Which operation is part of data transformation?


a) Data mapping
b) Data filtering
c) Data summarization
d) All of the above
✔️ Answer: d
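The three operations listed in this question (mapping, filtering, summarization) can be shown together on a tiny data set. The column names in `COLUMN_MAP` and the sample orders are invented for illustration.

```python
from collections import defaultdict

# Data mapping: hypothetical source column names -> warehouse column names.
COLUMN_MAP = {"cust_nm": "customer_name", "amt": "amount", "st": "status"}

source_rows = [
    {"cust_nm": "Asha", "amt": 100, "st": "paid"},
    {"cust_nm": "Ben", "amt": 40, "st": "cancelled"},
    {"cust_nm": "Asha", "amt": 25, "st": "paid"},
]

# 1. Mapping: rename each source field to its target field.
mapped = [{COLUMN_MAP[k]: v for k, v in row.items()} for row in source_rows]

# 2. Filtering: keep only the rows that belong in the warehouse.
filtered = [r for r in mapped if r["status"] == "paid"]

# 3. Summarization: aggregate amounts per customer.
totals = defaultdict(int)
for r in filtered:
    totals[r["customer_name"]] += r["amount"]
summary = dict(totals)

print(summary)
```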

Why is transformation needed?


a) For storing raw data
b) For better performance in source systems
c) To make data clean and usable in the warehouse
d) To delete old records
✔️ Answer: c
Data loading means:
a) Sending emails
b) Inserting data into the data warehouse
c) Extracting data from warehouse
d) Backing up data
✔️ Answer: b

Which is a type of data load?


a) Full load
b) Incremental load
c) Both a and b
d) None
✔️ Answer: c

Full load means:


a) Loading only changed records
b) Loading deleted records
c) Loading all data every time
d) Loading errors
✔️ Answer: c

Incremental load means:


a) Loading all data again
b) Loading only new or updated data
c) Deleting existing data
d) Transforming schema
✔️ Answer: b

Which of the following is an ETL tool?


a) MS Word
b) Power BI
c) Talend
d) Paint
✔️ Answer: c

Which programming languages are often used in ETL scripts?


a) HTML
b) Python and SQL
c) Photoshop
d) JavaScript only
✔️ Answer: b

ETL process is usually scheduled to run:


a) In real-time
b) Once in 10 years
c) At regular intervals (daily, hourly, etc.)
d) Only manually
✔️ Answer: c

In a production system, ETL jobs are:


a) Written on paper
b) Run by users directly
c) Automated and monitored
d) Ignored
✔️ Answer: c
One key advantage of ETL is:
a) Manual data entry
b) Faster gaming speed
c) Consistent and clean data in the warehouse
d) More storage use
✔️ Answer: c

A major challenge in ETL is:


a) Writing emails
b) Data inconsistency or loss
c) Playing games
d) Website design
✔️ Answer: b
