FDS Module1 Simple Notes

Module 1 covers the foundations of Data Science, defining it as the extraction of insights from data using statistics, computer science, and domain knowledge. It explains key concepts such as AI, ML, DL, types of machine learning, and the importance of feature selection, model evaluation, and the bias-variance tradeoff. Additionally, it addresses challenges like overfitting, underfitting, and the curse of dimensionality.

Uploaded by

gopikasanil78

Module 1 - Foundations of Data Science

(Simple Explanation for Exam)


1. What is Data Science?
Data Science is the process of extracting useful insights from data using a combination of
statistics, computer science, and domain knowledge. It helps answer questions like:
- What happened?
- Why did it happen?
- What will happen?
- What can be done next?

2. AI, ML, and DL


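• Artificial Intelligence (AI): Systems that perform tasks which normally require human intelligence, such as reasoning and decision-making.
• Machine Learning (ML): A subset of AI where computers learn patterns from data instead of following hand-written rules.
• Deep Learning (DL): A subset of ML that uses neural networks with many layers.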

3. Types of Machine Learning


• Supervised Learning: Learns from labelled data (e.g., spam or not spam).
• Unsupervised Learning: Finds patterns in unlabelled data (e.g., grouping customers).

4. Classification vs Regression
• Classification: Predicts categories (e.g., cat or dog).
• Regression: Predicts continuous values (e.g., house price).
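
As a rough illustration of the two task types, the sketch below fits a straight line to predict a price (regression) and uses a nearest-centroid rule to predict a label (classification). The data, the function names `fit_line` and `nearest_centroid`, and the choice of these two simple methods are assumptions made for the example; they are not part of the notes.

```python
def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept (regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def nearest_centroid(groups, value):
    """Return the label whose mean feature value is closest (classification)."""
    centroids = {label: sum(vals) / len(vals) for label, vals in groups.items()}
    return min(centroids, key=lambda label: abs(centroids[label] - value))


# Regression: made-up house sizes (m^2) and prices; the output is a number.
sizes = [50, 80, 100, 120]
prices = [100, 160, 200, 240]
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90 + intercept

# Classification: made-up animal weights (kg); the output is a category.
weights = {"cat": [3.5, 4.0, 4.5], "dog": [20.0, 25.0, 30.0]}
predicted_label = nearest_centroid(weights, 5.0)

print(predicted_price, predicted_label)
```

The regression output can take any value on a continuous scale, while the classification output is always one of the known labels.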

5. Feature Vector and Feature Selection


• Feature: An individual measurable property of an object (e.g., height, weight).
• Feature Vector: An ordered list of feature values that describes one object.
• Feature Selection: Choosing the most relevant features to improve model accuracy and reduce
complexity.
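
A minimal sketch of these ideas, assuming a made-up dataset: each row is one feature vector, and a variance filter drops any feature that has the same value for every object. The variance filter is only one simple selection method, chosen here for illustration; the names `samples`, `feature_names`, and `select_by_variance` are hypothetical.

```python
from statistics import pvariance

# Each row is a feature vector describing one person (made-up values).
feature_names = ["height_cm", "weight_kg", "num_eyes"]
samples = [
    [170.0, 65.0, 2.0],
    [182.0, 80.0, 2.0],
    [158.0, 52.0, 2.0],
    [175.0, 72.0, 2.0],
]


def select_by_variance(rows, names, threshold=0.0):
    """Keep only features whose variance across rows exceeds the threshold."""
    columns = list(zip(*rows))  # one tuple of values per feature
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    selected_names = [names[i] for i in keep]
    selected_rows = [[row[i] for i in keep] for row in rows]
    return selected_names, selected_rows


selected_names, selected_rows = select_by_variance(samples, feature_names)
print(selected_names)
```

Here `num_eyes` is constant, so it carries no information for telling the objects apart and is removed, leaving shorter feature vectors.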

6. Overfitting, Underfitting & Generalization


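• Overfitting: Model fits the training data too closely, including its noise, so it performs poorly on new data (high variance).
• Underfitting: Model is too simple to capture the pattern in the data, so it performs poorly even on the training data (high bias).
• Generalization: Model performs well on new, unseen data.

7. Curse of Dimensionality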
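• Adding too many features can reduce model performance: as the number of dimensions grows, the data become sparse and more examples are needed to learn reliably.
• Solution: Dimensionality Reduction using Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), etc.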
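
One effect behind the curse of dimensionality can be sketched with random points: in high dimensions the nearest and farthest points end up at almost the same distance, which hurts distance-based methods. The sketch below assumes points drawn uniformly from the unit cube; the function name `relative_contrast` and the sample sizes are choices made for this example only.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from one query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


# Compare low and high dimensional spaces.
contrasts = {dim: relative_contrast(dim) for dim in (2, 10, 100, 1000)}
for dim, value in contrasts.items():
    print(f"dim={dim:5d}  relative contrast={value:.3f}")
```

The printed contrast should be much smaller for 1000 dimensions than for 2, meaning "near" and "far" neighbours become hard to tell apart.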

8. Evaluation and Model Selection


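• Confusion Matrix: A table counting True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN).
• Accuracy = (TP + TN) / Total, where Total = TP + TN + FP + FN
• Precision = TP / (TP + FP)
• Recall = TP / (TP + FN)
• ROC Curve: Graph of the true positive rate against the false positive rate as the classification threshold changes.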
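
The formulas above can be worked through on a small made-up example. The labels, scores, threshold values, and function names below are assumptions for illustration only.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def predict(scores, threshold):
    """Label a sample positive when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Made-up true labels and model scores for 10 samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1]

tp, tn, fp, fn = confusion_counts(y_true, predict(scores, 0.5))
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, tn, fp, fn, accuracy, precision, recall)

# ROC points: (false positive rate, true positive rate) at each threshold.
roc_points = []
for threshold in (0.0, 0.25, 0.5, 0.75, 1.01):
    tp_t, tn_t, fp_t, fn_t = confusion_counts(y_true, predict(scores, threshold))
    roc_points.append((fp_t / (fp_t + tn_t), tp_t / (tp_t + fn_t)))
print(roc_points)
```

At threshold 0.5 this gives TP = 3, TN = 5, FP = 1, FN = 1, so accuracy is 0.8 and both precision and recall are 0.75. Lowering the threshold moves along the ROC curve towards the point (1, 1), where everything is predicted positive.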

9. Bias-Variance Tradeoff
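• Bias: Error from wrong or overly simple assumptions in the model.
• Variance: Error from sensitivity to small changes in training data.
• Goal: Keep both low. Reducing one usually increases the other, so choose the model complexity that gives the lowest total error on new data.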
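
The tradeoff can be sketched with a k-nearest-neighbour regressor, where k controls model complexity. With k = 1 the model memorizes the training set (high variance); with k equal to the training set size it always predicts the overall mean (high bias). The noisy sine data, the k values, and all function names here are assumptions for this example.

```python
import math
import random


def make_data(n, rng):
    """Noisy samples of sin(2*pi*x) on [0, 1]."""
    xs = [rng.uniform(0, 1) for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return sum(train_y[i] for i in order[:k]) / k


def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of the k-NN model on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


rng = random.Random(0)
train_x, train_y = make_data(40, rng)
test_x, test_y = make_data(200, rng)

results = {}
for k in (1, 5, 40):
    train_err = mse(train_x, train_y, train_x, train_y, k)
    test_err = mse(train_x, train_y, test_x, test_y, k)
    results[k] = (train_err, test_err)
    print(f"k={k:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

With k = 1 the training error is zero while the test error is not, which is the signature of overfitting. With k = 40 the training error itself is large, which is underfitting. A value in between typically does better on the test data, although the exact numbers depend on the random sample.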

10. Training, Validation, Test Sets


• Training Set: Used to train the model (fit its parameters).
• Validation Set: Used to tune hyperparameters and compare candidate models.
• Test Set: Used only once, at the end, to evaluate final model performance on unseen data.
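
A minimal sketch of such a split, assuming a 60/20/20 ratio and a simple random shuffle. The ratio, the seed, and the function name `train_val_test_split` are choices made for this example; the notes do not prescribe them.

```python
import random


def train_val_test_split(items, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle a copy of the data and cut it into three disjoint parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


data = list(range(100))  # stand-in for 100 labelled examples
train, val, test = train_val_test_split(data)
print(len(train), len(val), len(test))
```

Every example lands in exactly one of the three sets, so the test set stays unseen during training and tuning.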
