Software Dev

The document outlines various aspects of data collection, preprocessing, and analysis, including multiple-choice questions, true/false statements, and fill-in-the-blank exercises. Key concepts covered include data cleaning, normalization, data integration, and handling missing data. Additionally, it emphasizes the importance of these processes in preparing data for effective analysis.

Uploaded by mika

Prelims

Multiple Choice
1. What is the primary goal of data collection?
To gather raw information
2. Which of the following is a common method for data collection?
Surveys
3. What is the first step in data preprocessing?
Data cleaning
4. Which of the following is an example of structured data?
A spreadsheet
5. What does the term "missing data" refer to?
Data that is not available
6. Which of the following is a common technique used to handle missing data?
Data imputation
7. What is the purpose of data normalization?
To bring data into a common scale
8. Which of the following is a common method for data integration?
Data merging
9. What is the primary goal of data reduction?
To reduce the complexity of the dataset
10. Which of the following is an example of categorical data?
Gender
11. Which of the following techniques is used to handle outliers in data preprocessing?
Boxplot analysis
12. What is the purpose of feature scaling in data preprocessing?
To standardize the range of independent variables
13. Which of the following is a common method for data discretization?
Binning
14. What is the primary goal of data transformation?
To convert data into a suitable format for analysis
15. Which of the following is a common method for data cleaning?
Data imputation
16. What is the purpose of data integration in data preprocessing?
To combine data from multiple sources
17. Which of the following is a common method for handling categorical data in data preprocessing?
One-hot encoding
18. What is the primary goal of data reduction in data preprocessing?
To reduce the complexity of the dataset
19. Which of the following is a common method for data reduction?
Principal Component Analysis (PCA)
20. What is the primary goal of data cleaning in data preprocessing?
To remove errors and inconsistencies from the dataset
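
Several of the answers above (data imputation, normalization, one-hot encoding) name techniques that are easier to remember once seen in action. The following is a minimal sketch using only the Python standard library. The function names and the sample values are hypothetical, chosen for illustration, and mean imputation is only one of several possible imputation strategies.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Turn each categorical label into a 0/1 indicator vector, one slot per category."""
    categories = sorted(set(labels))
    vectors = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, vectors


# Hypothetical sample data, for illustration only.
ages = [20, None, 40, 30]
filled = impute_mean(ages)       # the None is replaced by the mean of 20, 40, 30
scaled = min_max_scale(filled)   # every value now lies between 0.0 and 1.0
categories, encoded = one_hot(["red", "blue", "red"])
```

Min-max scaling is used here as one common form of normalization; other schemes, such as z-score standardization, serve the same purpose of putting features on a comparable scale.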

True/False
1. Data collection is the final step in the data science process.
False
2. Structured data is always easier to analyze than unstructured data.
True
3. Data imputation is used to remove outliers from a dataset.
False
4. Normalization is a technique used to reduce the size of a dataset.
False
5. Data integration involves combining data from multiple sources into a single dataset.
True
6. Principal Component Analysis (PCA) is used to increase the number of features in a dataset.
False
7. Missing data can be ignored during the data preprocessing stage.
False
8. One-hot encoding is used to convert numerical data into categorical data.
False
9. Data cleaning is the first step in the data preprocessing stage.
True
10. Data discretization is used to convert continuous data into categorical data.
True
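
Statements 5 and 10 above describe data integration and data discretization. As a rough illustration, the sketch below shows equal-width binning and a simple inner join on a shared key in plain Python. The function names, field names, and records are hypothetical, and the join assumes each key appears at most once in the second source.

```python
def equal_width_bins(values, n_bins):
    """Assign each continuous value to one of n_bins equal-width intervals (0-based index)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:  # all values identical: everything falls in the first bin
        return [0 for _ in values]
    # The maximum value would compute to index n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def merge_on_key(left, right, key):
    """Inner-join two lists of dict records on a shared key field."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


# Hypothetical data, for illustration only.
scores = [0, 25, 50, 75, 100]
bins = equal_width_bins(scores, 4)  # continuous scores become bin indices 0..3

students = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Ben"}]
grades = [{"id": 1, "grade": 90}, {"id": 3, "grade": 70}]
merged = merge_on_key(students, grades, "id")  # only id 1 appears in both sources
```

An inner join keeps only records present in both sources; whether unmatched records should be kept (and their gaps later imputed) is a design decision that depends on the analysis.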

Fill-in-the-Blank
1. The process of converting raw data into a clean and usable format is called ....
data preprocessing
2. ... is a technique used to handle missing values by filling them with estimated values.
Data imputation
3. The goal of ... is to bring all features to a similar scale.
normalization
4. ... is a method used to combine data from multiple sources into a single dataset.
Data integration
5. ... is a technique used to reduce the dimensionality of a dataset while retaining most of the original information.
Principal Component Analysis (PCA)
6. The process of converting categorical data into numerical data is called ....
one-hot encoding
7. ... is the process of identifying and correcting errors and inconsistencies in a dataset.
Data cleaning
8. ... is a method used to convert continuous data into discrete intervals.
Data discretization
9. The process of reducing the complexity of a dataset is called ....
data reduction
10. ... is a technique used to identify and handle outliers in a dataset.
Boxplot analysis
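
Boxplot analysis (item 10) conventionally flags points that fall more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule with the standard library's `statistics.quantiles`. The function name, the sample data, and the choice of the "inclusive" quartile method are assumptions made for illustration; other quartile conventions can give slightly different fences.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside the boxplot whiskers (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical measurements with one obviously extreme value.
data = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(data)
```

Flagging a point is only the identification step; whether to drop, cap, or keep it is a separate decision that depends on why the value is extreme.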

Real-world scenario
Give one (1) real-world scenario where you would need to perform data collection, data preprocessing, and data analysis.
