

Data Science QA

The document provides an overview of key concepts in data analysis, including data preprocessing, visualization, and various statistical methods. It explains the differences between graphs, the importance of correlation in AI, and outlines machine learning methods and their applications. Additionally, it covers regression analysis, clustering techniques, and the steps involved in specific algorithms.

Uploaded by

hansika Makkar
Copyright: © All Rights Reserved

1. What is data preprocessing?

Data preprocessing is the process of cleaning and transforming raw data into a format suitable for
analysis or machine learning models. It includes handling missing values, removing duplicates,
normalization, feature extraction, and encoding categorical variables.
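
The steps named above can be sketched in plain Python. This is only an illustrative sketch: the records, the field names (age, city), and the choice of mean imputation and min-max scaling are assumptions made for the example, not part of the original answer.

```python
# Hypothetical raw records; None marks a missing value.
raw = [
    {"age": 25, "city": "Delhi"},
    {"age": None, "city": "Mumbai"},
    {"age": 35, "city": "Delhi"},
    {"age": 25, "city": "Delhi"},  # exact duplicate of the first record
]

# 1. Remove duplicates while keeping the original order.
seen = set()
unique = []
for row in raw:
    key = (row["age"], row["city"])
    if key not in seen:
        seen.add(key)
        unique.append(dict(row))

# 2. Handle missing values: fill missing ages with the mean of the known ages.
known = [r["age"] for r in unique if r["age"] is not None]
mean_age = sum(known) / len(known)
for r in unique:
    if r["age"] is None:
        r["age"] = mean_age

# 3. Normalization: min-max scale age into the range [0, 1].
lo = min(r["age"] for r in unique)
hi = max(r["age"] for r in unique)
for r in unique:
    r["age_scaled"] = (r["age"] - lo) / (hi - lo)

# 4. Encode the categorical variable as one-hot columns.
cities = sorted({r["city"] for r in unique})
for r in unique:
    for c in cities:
        r["city_" + c] = 1 if r["city"] == c else 0

print(unique)
```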

2. What is data visualization and why is it important?


Data visualization is the graphical representation of information and data using charts, graphs, and
plots.

Importance:
- Makes complex data easier to understand.
- Helps identify patterns, trends, and outliers.
- Improves decision-making by presenting insights clearly.

3. How does a line graph differ from a bar graph?


- Line Graph: Uses points connected by lines; best for showing trends over time.
- Bar Graph: Uses rectangular bars; best for comparing quantities across categories.

4. When would you use a scatter plot?


Scatter plots are used to show the relationship between two continuous variables. For example,
height vs. weight or study time vs. exam score.

5. If a matrix has 6 elements, what are the possible orders it can have?

An m × n matrix has m × n elements, so the order must be a pair of factors whose product is 6.
Possible orders (rows × columns): 1×6, 2×3, 3×2, 6×1.
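
The same factor-pair reasoning can be checked with a short Python sketch. The function name possible_orders is made up for this illustration.

```python
def possible_orders(n):
    """Return every (rows, columns) pair whose product is n."""
    return [(r, n // r) for r in range(1, n + 1) if n % r == 0]

print(possible_orders(6))  # [(1, 6), (2, 3), (3, 2), (6, 1)]
```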

6. Construct a 3×2 matrix where each element is given by aij = i × j.


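With i = 1, 2, 3 (rows) and j = 1, 2 (columns), each entry is the product of its row and column
numbers:

A = [[1,2],[2,4],[3,6]]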
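
As a quick check, the matrix can be built in Python from the rule aij = i × j. The variable names are chosen for this sketch; indices start at 1 to match the question.

```python
rows, cols = 3, 2

# Entry in row i, column j is i * j, with i and j counted from 1.
A = [[i * j for j in range(1, cols + 1)] for i in range(1, rows + 1)]

print(A)  # [[1, 2], [2, 4], [3, 6]]
```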

7. Find the transpose of the matrix B = [5 -1 4; 2 3 6]


B is a 2×3 matrix, so its transpose is 3×2; each row of B becomes a column of B^T:

B^T = [[5,2],[-1,3],[4,6]]
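
One way to verify the transpose in Python is with the built-in zip, which regroups the rows of B into columns. This is a minimal sketch, not the only way to do it.

```python
B = [[5, -1, 4],
     [2, 3, 6]]

# zip(*B) pairs up the first entries of every row, then the second, and so on.
B_T = [list(column) for column in zip(*B)]

print(B_T)  # [[5, 2], [-1, 3], [4, 6]]
```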

8. Advantages and limitations of pie charts:


Advantages:
- Easy to understand.
- Good for showing proportions of a whole.

Limitations:
- Not effective when there are many categories.
- Difficult to compare small slices.

Example: Useful for showing the percentage of students in Science, Commerce, and Arts; not useful
for showing the sales distribution across 20+ products.

9. Explain mean, median and mode.


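- Mean: The sum of the values divided by the number of values.
- Median: The middle value when the data is ordered (the average of the two middle values if the
count is even).
- Mode: The most frequent value.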
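
A small sketch using Python's standard statistics module shows all three measures on one data set. The scores are invented for the example.

```python
import statistics

scores = [70, 85, 85, 90, 100]  # hypothetical exam scores

mean = statistics.mean(scores)      # (70 + 85 + 85 + 90 + 100) / 5 = 86
median = statistics.median(scores)  # middle of the sorted list = 85
mode = statistics.mode(scores)      # 85 appears twice, more than any other value

print(mean, median, mode)
```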

10. Four levels of measurement:


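1. Nominal: Categories without order (e.g., blood group).
2. Ordinal: Ordered categories (e.g., satisfaction ratings).
3. Interval: Equal intervals but no true zero (e.g., temperature in °C).
4. Ratio: Numeric scale with a true zero (e.g., height or weight).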

11. Given matrices A and B, calculate A + B and B – A.


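The matrices are not provided, so the calculation cannot be completed. In general, A + B and B – A
are computed element by element and are defined only when A and B have the same order.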
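
To illustrate the method only, here is a sketch with two made-up 2×2 matrices. These are not the matrices from the original question, and the helper name combine is hypothetical.

```python
# Hypothetical matrices, used only to demonstrate the procedure.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

def combine(X, Y, op):
    """Apply op element by element; both matrices must have the same order."""
    if len(X) != len(Y) or any(len(rx) != len(ry) for rx, ry in zip(X, Y)):
        raise ValueError("matrices must have the same order")
    return [[op(x, y) for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

A_plus_B = combine(A, B, lambda x, y: x + y)
B_minus_A = combine(B, A, lambda x, y: x - y)

print(A_plus_B)   # [[6, 8], [10, 12]]
print(B_minus_A)  # [[4, 4], [4, 4]]
```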

12. What is Machine Learning? Name the three methods.


Machine learning is a subset of AI in which systems learn from data to make predictions or decisions
without being explicitly programmed for each task.

Methods:
1. Supervised Learning
2. Unsupervised Learning
3. Reinforcement Learning

13. How are correlation measures used in AI applications?


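Correlation measures show the strength and direction of the relationship between variables. A
correlation coefficient ranges from -1 (perfect negative) through 0 (no linear relationship) to +1
(perfect positive).
- Helps in feature selection.
- Identifies redundant variables.
- Useful in recommendation systems.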
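
As a sketch of how this is used, the Pearson correlation coefficient (one common correlation measure, chosen here as an assumption) can be computed by hand. The data and variable names are invented; a coefficient very close to ±1 between two features suggests one of them is redundant.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical features and target.
hours = [1, 2, 3, 4, 5]
minutes = [60, 120, 180, 240, 300]  # same information as hours: redundant
score = [52, 58, 61, 70, 74]

print(pearson(hours, minutes))  # about 1.0, so one of the two can be dropped
print(pearson(hours, score))    # strongly positive, so hours is a useful feature
```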

14. Examples of regression algorithms:


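Linear Regression, Polynomial Regression, Ridge Regression, and Lasso Regression. Logistic
Regression is often listed with them, but despite its name it is used for classification rather than
for predicting continuous values.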

15. What are regression algorithms used for?


Regression algorithms are used to predict continuous values such as house prices, stock prices, or
temperature.

16. What is Linear Regression? Applications:


Linear regression models the relationship between a dependent variable and one or more independent
variables using a straight line.

Applications:
- Predicting housing prices.
- Predicting sales based on advertising spend.
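
A minimal sketch of simple linear regression with one independent variable, fitted by ordinary least squares. The advertising and sales numbers are invented so that the true line is y = 2x + 1, and fit_line is a name chosen for this example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: sales follow 2 * ad_spend + 1 exactly.
ad_spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

slope, intercept = fit_line(ad_spend, sales)
predicted = slope * 6 + intercept  # predicted sales for an ad spend of 6

print(slope, intercept, predicted)
```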

17. How can outliers impact regression analysis?


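Outliers can distort the regression line. Ordinary least squares, the usual fitting method, minimizes
squared errors, so a single extreme point can pull the line toward itself, leading to biased
predictions and misleading results.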
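
The effect can be seen in a small sketch that fits a least-squares line to the same invented data twice, once clean and once with a single extreme value. The helper name slope_of and the numbers are assumptions for illustration.

```python
def slope_of(xs, ys):
    """Least-squares slope of the line fitted through (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
            / sum((x - mean_x) ** 2 for x in xs))

xs = [1, 2, 3, 4, 5]
clean = [2, 4, 6, 8, 10]         # follows y = 2x exactly
with_outlier = [2, 4, 6, 8, 30]  # last value replaced by an outlier

clean_slope = slope_of(xs, clean)
outlier_slope = slope_of(xs, with_outlier)

print(clean_slope)    # 2.0
print(outlier_slope)  # 6.0: one point triples the estimated slope
```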

18. Primary difference between classification and regression:


- Classification: Predicts categories.
- Regression: Predicts continuous values.

19. Common applications of clustering techniques:


- Customer segmentation
- Document clustering (grouping similar documents by topic)
- Image segmentation
- Anomaly detection

20. Types of clustering methods:


- Partitioning methods (k-means)
- Hierarchical clustering
- Density-based clustering (DBSCAN)
- Grid-based clustering

21. How does a classification model work?


It learns from labeled training data, finds patterns to separate classes, and predicts the class of new
data.

22. Two advantages and disadvantages of linear regression:


Advantages:
- Simple and easy to interpret.
- Works well when the relationship is linear.

Disadvantages:
- Sensitive to outliers.
- Cannot capture nonlinear relationships unless the features are transformed.

23. Steps involved in k-NN algorithm:


1. Choose a value of k.
2. Calculate the distances between the test point and all training points.
3. Select the k nearest neighbors.
4. Assign the class by majority vote (classification) or predict the average of the neighbors'
values (regression).
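
The steps above can be sketched for the classification case as follows. This is a minimal sketch assuming Euclidean distance and 2-D points; the training data and the name knn_predict are invented for the example.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """train is a list of (point, label) pairs; returns the majority label."""
    # Step 2: sort training points by distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Step 3: keep the k nearest neighbors.
    nearest = by_distance[:k]
    # Step 4: majority vote among their labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled training data: two well-separated groups.
train = [
    ((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
    ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B"),
]

print(knn_predict(train, (2, 2), k=3))  # A
print(knn_predict(train, (6, 5), k=3))  # B
```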

24. Steps involved in k-means clustering:


1. Choose the number of clusters k.
2. Initialize the centroids.
3. Assign each point to its nearest centroid.
4. Recalculate the centroids.
5. Repeat steps 3 and 4 until the centroids are stable.
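
A minimal sketch of these steps in plain Python. It assumes Euclidean distance and, to keep the result reproducible, uses the first k points as the initial centroids; real implementations usually initialize randomly or with a smarter scheme. The data and the function name kmeans are invented for the example.

```python
import math

def kmeans(points, k, max_iter=100):
    # Step 2: initialize centroids (assumption: simply take the first k points).
    centroids = list(points[:k])
    clusters = []
    for _ in range(max_iter):
        # Step 3: assign each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 4: recalculate each centroid as the mean of its cluster.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        # Step 5: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2-D points forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, k=2)

print(centroids)
print(clusters)
```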

25. What do you mean by a capstone project?


A capstone project is a final, practical project to apply knowledge gained in a course. Example:
predicting house prices using regression models.
