VTU Machine Learning – Module 1 (Chapter 1) Answers
Unit: Introduction to Machine Learning
1. Why is machine learning needed for business organizations?
Machine Learning helps business organizations by:
- Automating decision-making processes
- Discovering insights from large datasets
- Personalizing customer experience (e.g., recommendation systems)
- Predicting market trends and customer behavior
- Detecting fraud and anomalies in real-time
- Enhancing operational efficiency and reducing manual intervention
2. List out the factors that drive the popularity of machine learning.
- Availability of big data
- Advancements in computational power
- Open-source libraries and frameworks (e.g., TensorFlow, Scikit-learn)
- Success in real-world applications (e.g., Siri, Netflix recommendations)
- Need for data-driven decision making
- Advances in algorithms and models
3. What is a model?
A model in machine learning is a mathematical representation of a real-world process,
learned from data.
It maps inputs to outputs and is used to make predictions or decisions.
Example:
In a spam email classifier, the model learns from labeled emails and predicts whether a new
email is spam or not.
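The spam example can be made concrete with a minimal sketch. It is not a production classifier: the training emails, the `"ham"` label for non-spam, and the function names `train` and `predict` are all assumptions made for illustration. The "model" here is simply a table of word counts learned from labeled emails.

```python
from collections import Counter

def train(labeled_emails):
    """Learn the model: count how often each word appears per class."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in labeled_emails:
        counts[label].update(text.lower().split())
    return counts

def predict(model, text):
    """Map an input email to an output label using the learned counts."""
    words = text.lower().split()
    spam_score = sum(model["spam"][w] for w in words)
    ham_score = sum(model["ham"][w] for w in words)
    return "spam" if spam_score > ham_score else "ham"

# Toy labeled data, invented for illustration.
training_data = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

model = train(training_data)
print(predict(model, "claim your free prize"))
print(predict(model, "agenda for the team meeting"))
```

Training produces the model once; prediction then reuses it on emails it has never seen, which is the input-to-output mapping described above.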
4. Distinguish between the terms: Data, Information, Knowledge, and Intelligence.
| Term | Description |
|--------------|-------------|
| Data | Raw facts and figures (e.g., numbers, text) |
| Information | Processed data that is meaningful (e.g., sales of ₹10,000 this week) |
| Knowledge | Understanding derived from information (e.g., higher sales on weekends) |
| Intelligence | Ability to apply knowledge for decision-making and actions (e.g., run ads on weekends) |
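
The four levels in the table can be traced through one small worked example. The daily figures below are invented for illustration and chosen so that the weekly total matches the ₹10,000 mentioned in the table; the variable names are likewise assumptions.

```python
# Data: raw daily sales figures in rupees (invented for illustration).
daily_sales = {
    "Mon": 1000, "Tue": 1200, "Wed": 1100, "Thu": 1300,
    "Fri": 1400, "Sat": 2000, "Sun": 2000,
}

# Information: processed data with meaning, here the weekly total.
weekly_total = sum(daily_sales.values())

# Knowledge: a pattern derived from the information.
weekend = [daily_sales[d] for d in ("Sat", "Sun")]
weekdays = [v for d, v in daily_sales.items() if d not in ("Sat", "Sun")]
weekend_avg = sum(weekend) / len(weekend)
weekday_avg = sum(weekdays) / len(weekdays)
weekends_sell_more = weekend_avg > weekday_avg

# Intelligence: applying the knowledge to choose an action.
action = "run ads on weekends" if weekends_sell_more else "run ads on weekdays"

print(weekly_total, weekend_avg, weekday_avg, action)
```

Each step consumes the output of the previous one, which is the sense in which the four terms form a progression rather than four unrelated ideas.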
5. How is machine learning linked to AI, Data Science, and Statistics?
- AI: ML is a subset of AI that enables systems to learn from data.
- Data Science: ML provides predictive modeling techniques in data analysis workflows.
- Statistics: ML borrows heavily from statistical theory for data modeling and inference.
6. List out the types of machine learning.
1. Supervised Learning
2. Unsupervised Learning
3. Semi-supervised Learning
4. Reinforcement Learning
7. List out the differences between a model and a pattern.
| Aspect | Model | Pattern |
|--------|-------|---------|
| Scope | Global: summarizes the whole dataset | Local: describes only part of the data |
| Definition | Mathematical abstraction that generalizes | Repeated data structures or trends |
| Example | Decision tree trained on full dataset | A specific cluster of similar data points |
8. Are classification and clustering the same or different? Justify.
Different.
| Aspect | Classification | Clustering |
|---------|----------------|------------|
| Type | Supervised Learning | Unsupervised Learning |
| Output | Predicts pre-defined labels | Groups data based on similarity |
| Example | Email spam detection | Customer segmentation |
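
A minimal sketch of the contrast, using one-dimensional toy data invented for illustration. The classifier is a 1-nearest-neighbour rule and the clustering routine is a two-cluster variant of K-Means with a fixed starting point; both function names are assumptions, and neither is meant as a reference implementation.

```python
def nearest_neighbor_classify(labeled_points, x):
    """Classification: return the pre-defined label of the closest example."""
    _, label = min(labeled_points, key=lambda p: abs(p[0] - x))
    return label

def two_means(values, iterations=10):
    """Clustering: split unlabeled 1-D values into two groups by similarity."""
    centers = [min(values), max(values)]  # simple deterministic start
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Recompute each center; keep the old one if its group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups

# Classification needs labels supplied in advance.
labeled = [(1, "low"), (2, "low"), (9, "high"), (10, "high")]
label = nearest_neighbor_classify(labeled, 8)

# Clustering receives no labels and only returns groups.
unlabeled = [1, 2, 3, 10, 11, 12]
groups = two_means(unlabeled)

print(label, groups)
```

The classifier answers with a name that already existed in the training data, while the clustering routine returns anonymous groups whose meaning has to be interpreted afterwards.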
9. List out the differences between labeled and unlabeled data.
| Type | Labeled Data | Unlabeled Data |
|-----------|--------------|----------------|
| Definition| Data with input-output pairs | Data with only inputs |
| Use in ML | Supervised Learning | Unsupervised Learning |
| Example | Emails tagged as spam/not spam | Customer reviews with no tags attached |
10. Point out the differences between supervised and unsupervised learning.
| Aspect | Supervised Learning | Unsupervised Learning |
|-----------|---------------------|------------------------|
| Data Type | Labeled | Unlabeled |
| Goal | Predict output | Discover structure |
| Typical tasks | Classification, Regression | Clustering, Association rule mining |
11. What are the differences between classification and regression?
| Aspect | Classification | Regression |
|--------------|----------------|------------|
| Output Type | Categorical | Continuous |
| Example | Spam detection | House price prediction |
| Algorithms | SVM, Decision Tree | Linear Regression, SVR |
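
The difference in output type can be seen in a short sketch. The regression part is an ordinary least-squares line fit; the classification part is a deliberately simple threshold rule, not SVM or a decision tree. All data values and function names are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def fit_threshold(xs, labels, positive):
    """Classification: place a cut-off midway between the two class means.

    Assumes the positive class tends to have the larger feature values.
    """
    pos = [x for x, label in zip(xs, labels) if label == positive]
    neg = [x for x, label in zip(xs, labels) if label != positive]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

# Regression: house size (square metres) to price (toy units).
slope, intercept = fit_line([50, 100, 150], [100, 200, 300])
predicted_price = slope * 120 + intercept  # a continuous value

# Classification: number of suspicious words to spam / not spam.
threshold = fit_threshold([0, 1, 5, 6], ["ham", "ham", "spam", "spam"], "spam")
predicted_label = "spam" if 4 > threshold else "ham"  # a category

print(predicted_price, predicted_label)
```

The regression result can be any real number, whereas the classification result is always one of a fixed set of labels.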
12. What is semi-supervised learning?
Semi-supervised learning is a machine learning technique that uses a small amount of
labeled data and a large amount of unlabeled data for training.
It bridges the gap between supervised and unsupervised learning.
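One common way to use the unlabeled data is self-training, sketched below under stated assumptions: the points are one-dimensional, the base learner is a nearest-class-mean rule, and the data and function names are invented for illustration. Other semi-supervised methods exist; this is only one instance.

```python
def class_means(labeled):
    """Mean feature value per label."""
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(means, x):
    """Assign the label whose class mean is closest to x."""
    return min(means, key=lambda label: abs(means[label] - x))

def self_train(labeled, unlabeled):
    """Pseudo-label unlabeled points one at a time, most confident first."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        means = class_means(labeled)
        # Treat the point closest to any class mean as the most confident.
        x = min(remaining, key=lambda v: min(abs(m - v) for m in means.values()))
        labeled.append((x, predict(means, x)))
        remaining.remove(x)
    return class_means(labeled)

# A small labeled set and a larger unlabeled set.
labeled = [(1.0, "A"), (10.0, "B")]
unlabeled = [2.0, 3.0, 8.0, 9.0]

means = self_train(labeled, unlabeled)
print(means, predict(means, 4.0))
```

Only two points carry labels at the start, yet the final class means are estimated from all six points, which is how the unlabeled data contributes to training.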
13. List out the differences between reinforcement learning and supervised learning.
| Aspect | Reinforcement Learning | Supervised Learning |
|-----------------|------------------------|---------------------|
| Feedback Type | Delayed (reward/penalty) | Immediate (correct label) |
| Goal | Learn optimal action sequence | Learn mapping from inputs to outputs |
| Environment | Dynamic (agent interacts) | Static dataset |
| Example | Game playing bots | Email classification |
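
The reinforcement learning column can be illustrated with a small tabular Q-learning sketch. The corridor environment, its reward of 1 at the goal, and the values chosen for `alpha`, `gamma`, and the number of episodes are all assumptions made for illustration; the agent explores purely at random, which is a simplification.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal
ACTIONS = (-1, +1)    # step left or step right

def step(state, action):
    """Environment: move along the corridor; reward 1 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from interaction, not from labeled examples."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # explore at random
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward reward plus discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q = q_learning()
# Greedy policy: the best-valued action in each non-goal state.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

No step is ever marked as correct or incorrect. The only feedback is the reward at the goal, and the update rule propagates it back to earlier states, which is the delayed feedback listed in the table.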
14. List out important classification and clustering algorithms.
Classification Algorithms:
- Decision Tree
- Support Vector Machine (SVM)
- Naive Bayes
- K-Nearest Neighbors (KNN)
- Logistic Regression
Clustering Algorithms:
- K-Means
- Hierarchical Clustering
- DBSCAN
- Gaussian Mixture Models
15. List out at least five major applications of machine learning.
1. Email spam filtering
2. Fraud detection in banking
3. Image recognition and classification
4. Recommendation systems (Netflix, Amazon)
5. Speech recognition (Google Assistant, Siri)