CHAPTER 2
1Q. What is KNN? How do you determine the core aspects of a classification problem in order
to understand when KNN is an appropriate technique?
1. What is KNN?
K-Nearest Neighbors (KNN) is a non-parametric, supervised learning algorithm used for both
classification and regression tasks.
● Core Concept: KNN operates on the principle of similarity. It classifies a new data
point based on the majority class of its 'k' nearest neighbors in the training dataset.
● How it Works:
1. Calculate Distances: For a given new data point, calculate the distance to all
existing data points in the training set. Common distance metrics include
Euclidean distance, Manhattan distance, and Minkowski distance.
2. Find Nearest Neighbors: Identify the 'k' data points with the shortest
distances to the new data point.
3. Classify:
■ Classification: Assign the new data point to the class that is most
frequent among its 'k' nearest neighbors (majority voting).
■ Regression: Predict the value of the new data point by averaging the
values of its 'k' nearest neighbors.
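To make these steps concrete, here is a minimal classification sketch using scikit-learn's
KNeighborsClassifier; the iris dataset and the choice of k = 5 are illustrative assumptions,
not part of the original question:

```python
# Minimal KNN classification sketch (dataset and k are illustrative).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Steps 1-2: distances to all training points and the k nearest neighbors are
# computed internally; the default metric is Euclidean (Minkowski with p=2).
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# Step 3: majority voting among the 5 nearest neighbors classifies each test point.
print("Test accuracy:", knn.score(X_test, y_test))
```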
2. Determining Appropriateness of KNN
KNN's suitability depends on several key aspects of your classification problem:
● Data Characteristics:
○ Data Type: KNN excels with numerical data. Categorical data might require
encoding (e.g., one-hot encoding).
○ Data Distribution: KNN doesn't make strong assumptions about data
distribution. It can work well with non-linear relationships.
○ Data Size: KNN can be computationally expensive for large datasets due to
the need to calculate distances to all training points.
○ Data Dimensionality: High-dimensional data can lead to the "curse of
dimensionality," where distances between points become less meaningful.
Techniques like dimensionality reduction (PCA) might be necessary.
● Problem Characteristics:
○ Class Boundaries: KNN is well-suited for problems where class boundaries
are complex or non-linear.
○ Interpretability: If understanding the decision-making process is crucial,
KNN can be less interpretable than other models.
○ Real-time Predictions: KNN can be slow for real-time predictions with large
datasets, as it needs to compare the new data point to all training points.
● Computational Resources:
○ Computational Power: KNN can be computationally expensive, especially
with large datasets. Efficient data structures (e.g., k-d trees) can help speed
up distance calculations.
○ Memory: KNN requires storing the entire training dataset in memory.
Key Considerations:
● Choosing the Value of 'k':
○ A small 'k' can be sensitive to noise, leading to overfitting.
○ A large 'k' can smooth out the decision boundary too much, leading to
underfitting.
○ Cross-validation techniques (e.g., k-fold cross-validation) can help determine
the optimal 'k' value (see the sketch after this list).
● Distance Metric: The choice of distance metric can significantly impact performance.
Experiment with different metrics (Euclidean, Manhattan, etc.) to find the best one for
your data.
● Data Preprocessing:
○ Scaling: Scaling features (e.g., using standardization or normalization) is
crucial to prevent features with larger scales from dominating distance
calculations.
○ Handling Missing Values: Imputation methods (e.g., mean imputation, k-
nearest neighbor imputation) can be used to handle missing values.
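The sketch below ties these considerations together: it scales the features and uses k-fold
cross-validation to pick 'k'. It assumes a feature matrix X and label vector y are already
loaded; the candidate values of k are illustrative.

```python
# Sketch: feature scaling + cross-validated choice of k (assumes X and y exist).
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV

pipe = Pipeline([
    ("scale", StandardScaler()),      # keeps large-scale features from dominating distances
    ("knn", KNeighborsClassifier()),
])
search = GridSearchCV(pipe, {"knn__n_neighbors": [1, 3, 5, 7, 9, 11]}, cv=5)
search.fit(X, y)                      # 5-fold cross-validation over the candidate k values
print("Best k:", search.best_params_["knn__n_neighbors"])
```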
In Summary:
KNN is a versatile algorithm with strengths in handling non-linear relationships and adapting
to new data. However, it's crucial to carefully consider the characteristics of your data and
the computational resources available before choosing KNN. By understanding these core
aspects and addressing potential challenges, you can effectively apply KNN to a wide range
of classification problems.
2Q. What is Naive Bayes classification? How do you identify Naive Bayes classification
and when is it applicable?
Naive Bayes is a probabilistic machine learning algorithm based on Bayes' Theorem with the
"naive" assumption of independence between features. It's a simple yet surprisingly effective
method for classification tasks.
Key Concepts:
● Bayes' Theorem:
○ Provides a way to calculate the probability of a class (e.g., "spam" or "not
spam") given the observed features (e.g., the presence of certain words in an
email).
○ Formally: P(Class | Features) = (P(Features | Class) * P(Class)) / P(Features)
● Naive Assumption:
○ The "naive" part of the name comes from the simplifying assumption that all
features are independent of each other given the class. This means that the
algorithm assumes that the presence or absence of one feature in a class
does not influence the presence or absence of any other feature.
● Classification:
○ Naive Bayes calculates the probability of each class given the observed
features.
○ The class with the highest probability is assigned to the new data point.
Types of Naive Bayes:
● Gaussian Naive Bayes: Assumes that features are continuous and normally
distributed within each class.
● Multinomial Naive Bayes: Suitable for discrete features, often used for text
classification (e.g., document categorization, spam filtering).
● Bernoulli Naive Bayes: Designed for binary features (e.g., presence or absence of
a word in a document).
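As a rough illustration of Multinomial Naive Bayes for text classification, here is a sketch
using scikit-learn; the toy messages and labels are invented purely for the example:

```python
# Sketch: Multinomial Naive Bayes spam filter on invented toy data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["win a free prize now", "meeting at noon tomorrow",
         "free money click here", "lunch with the team"]
labels = ["spam", "ham", "spam", "ham"]

vec = CountVectorizer()
X = vec.fit_transform(texts)          # sparse word-count features
clf = MultinomialNB(alpha=1.0)        # alpha=1.0 applies Laplace smoothing
clf.fit(X, labels)

print(clf.predict(vec.transform(["free prize money"])))  # expected: ['spam']
```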
Identifying Naive Bayes Applicability
Naive Bayes is generally a good choice when:
● Data is high-dimensional: It can handle many features efficiently due to its
simplicity.
● Data is sparse: It can work well with sparse datasets, such as text data where many
features have zero values.
● Speed is crucial: Naive Bayes is computationally fast for both training and
prediction.
● Feature independence assumption holds (approximately): While the
independence assumption is often violated in real-world scenarios, Naive Bayes can
still perform surprisingly well even with moderate feature dependencies.
When Naive Bayes Might Not Be the Best Choice:
● Strong feature dependencies: If features are highly correlated, the independence
assumption can significantly degrade performance.
● Zero-frequency problem: If a combination of feature values and a class never
occurs in the training data, the probability estimate for that combination will be zero,
leading to inaccurate predictions. Techniques like Laplace smoothing can help
mitigate this issue.
In Summary
Naive Bayes is a simple, fast, and surprisingly effective classification algorithm that can be a
valuable tool in many machine learning applications, especially when dealing with high-
dimensional data and text-based problems.
However, it's important to be aware of its limitations, particularly the assumption of feature
independence, and choose it wisely based on the characteristics of your data.
3Q. What is a classification Support Vector Machine? How do you identify the basics of the
SVM classification algorithm?
Support Vector Machine (SVM) for Classification
SVM is a powerful supervised machine learning algorithm primarily used for classification
tasks. It aims to find the optimal hyperplane that best separates data points of different
classes.
Core Concepts:
● Hyperplane: In a two-dimensional space, the hyperplane is a line. In higher
dimensions, it's a plane or a more complex surface. This hyperplane serves as the
decision boundary to classify new data points.
● Margin: The distance between the hyperplane and the nearest data points of each
class.
● Support Vectors: The data points that lie closest to the hyperplane. These points
are crucial in determining the position and orientation of the hyperplane.
Key Principles:
● Maximize Margin: SVM seeks to find the hyperplane that maximizes the margin
between the two classes. This leads to better generalization and improved
performance on unseen data.
● Kernel Trick: SVMs can handle non-linearly separable data by using kernel
functions. These functions implicitly map the data into a higher-dimensional space
where linear separation becomes possible. Common kernels include:
○ Linear Kernel
○ Polynomial Kernel
○ RBF (Radial Basis Function) Kernel
● Regularization: SVM incorporates a regularization parameter (often denoted as 'C')
that controls the trade-off between maximizing the margin and penalizing
misclassified training points.
Identifying the Basics of SVM Classification:
1. Data: SVM can be applied to various data types, but it's particularly effective with
numerical data.
2. Linear Separability: Determine if the data is linearly separable or requires non-linear
transformations (kernel trick).
3. Support Vectors: Identify the data points closest to the decision boundary. These
points play a crucial role in defining the hyperplane.
4. Margin: Visualize or calculate the margin between the hyperplane and the support
vectors. A wider margin generally indicates better generalization.
5. Kernel Selection: Choose an appropriate kernel function based on the data
characteristics.
6. Regularization Parameter (C): Tune the 'C' parameter to find the optimal balance
between margin maximization and misclassification penalties.
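A minimal sketch of these basics with scikit-learn's SVC follows; the synthetic two-moons
data, the RBF kernel, and C = 1.0 are illustrative choices:

```python
# Sketch: SVM classification on non-linearly separable data (choices are illustrative).
from sklearn.datasets import make_moons
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)  # not linearly separable

# The RBF kernel handles the non-linearity; C trades margin width
# against misclassification penalties on the training data.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X, y)

svc = model.named_steps["svc"]
print("Support vectors per class:", svc.n_support_)
```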
When SVM Might Be a Good Choice:
● High-dimensional data: SVM can effectively handle data with many features.
● Non-linearly separable data: Kernel functions enable SVM to address complex
relationships.
● Small datasets: SVM can perform well with limited data due to its focus on support
vectors.
● Classification problems: SVM is primarily used for classification tasks, but it can
also be adapted for regression.
In Summary
SVM is a powerful and versatile classification algorithm that excels in finding optimal
decision boundaries. By understanding the core concepts of hyperplanes, margins, support
vectors, and kernel functions, you can effectively apply SVM to a wide range of classification
problems.
4Q. What are uses of classification Support vector algorithm?
Uses of Classification Support Vector Machine (SVM) Algorithm
SVM, with its ability to handle high-dimensional data, find optimal decision boundaries, and
effectively address non-linearity, finds widespread application across diverse domains:
1. Text Classification:
● Sentiment Analysis: Classifying text as positive, negative, or neutral. This is crucial
for social media monitoring, customer feedback analysis, and market research.
● Spam Detection: Identifying spam emails, messages, or comments, improving email
security and online experience.
● Document Categorization: Organizing documents into relevant categories (e.g.,
news articles, scientific papers) for easier search and retrieval.
2. Image Recognition:
● Object Detection: Identifying and locating objects within images (e.g., faces, cars,
pedestrians) in applications like self-driving cars and surveillance systems.
● Image Classification: Categorizing images based on their content (e.g., animals,
landscapes, objects).
● Medical Imaging: Analyzing medical images (X-rays, MRI scans) for disease
detection and diagnosis.
3. Bioinformatics:
● Protein Classification: Classifying proteins based on their structure or function.
● Gene Expression Analysis: Predicting gene function and identifying disease-related
genes.
● Drug Discovery: Identifying potential drug targets and predicting drug-protein
interactions.
4. Face Recognition:
● Authentication: Identifying individuals based on their facial features for security and
access control.
● Emotion Recognition: Detecting and classifying human emotions from facial
expressions.
5. Anomaly Detection:
● Fraud Detection: Identifying fraudulent transactions in finance and e-commerce.
● Network Intrusion Detection: Detecting malicious activity in computer networks.
6. Geographic Information Systems (GIS):
● Land Cover Classification: Classifying land cover types (e.g., forest, water, urban
areas) from satellite imagery.
● Environmental Monitoring: Analyzing environmental data to monitor changes in
climate, pollution levels, and natural resources.
7. Financial Applications:
● Credit Scoring: Assessing credit risk for loan applications.
● Stock Market Prediction: Predicting stock prices and market trends (though with
caution due to the inherent complexity of financial markets).
5Q. What are classification Decision Trees? How do you identify the steps to build a
decision tree classifier? Apply these steps to create a basic decision tree.
What are Classification Decision Trees?
A decision tree is a supervised machine learning algorithm used for both classification and
regression tasks. In the context of classification, it creates a model that predicts the class of
a data point based on a series of if-then-else questions.
Key Concepts:
● Tree Structure: The model resembles a tree-like structure with nodes and branches.
● Nodes: Internal nodes represent tests on features or attributes of the data.
● Branches: Represent possible values or ranges of values for the features.
● Leaves: Terminal nodes that represent the predicted class labels.
How Decision Trees Work:
1. Start at the Root Node: The tree begins with the root node, which contains the
entire dataset.
2. Feature Selection: The algorithm selects the best feature to split the data at each
node. The "best" feature is typically determined by a metric like:
○ Information Gain: Measures how much information a feature provides about
the class labels.
○ Gini Impurity: Measures the impurity of a node; a node is pure if all data
points in it belong to the same class (see the sketch after this list).
3. Splitting: The data is split into subsets based on the selected feature's values.
4. Recursion: The process is repeated recursively on each subset until a stopping
criterion is met (e.g., all data points in a subset belong to the same class, or a
maximum depth is reached).
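For a concrete sense of the splitting metric, here is a tiny sketch of Gini impurity
computed by hand; the label lists are illustrative:

```python
# Sketch: Gini impurity of a node, given the class labels of the samples it holds.
from collections import Counter

def gini_impurity(labels):
    """Gini = 1 - sum of squared class proportions; 0.0 means the node is pure."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini_impurity(["A", "A", "A", "A"]))  # 0.0  -> pure node
print(gini_impurity(["A", "A", "B", "B"]))  # 0.5  -> maximally impure for two classes
```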
Steps to Build a Decision Tree Classifier:
1. Data Preparation:
○ Data Collection: Gather a labeled dataset with features and corresponding
class labels.
○ Data Cleaning: Handle missing values, outliers, and inconsistencies.
○ Data Preprocessing: Transform data (e.g., scaling, encoding categorical
variables).
2. Feature Selection: Select the best feature to split the data at each node using a
metric like Information Gain or Gini Impurity.
3. Tree Construction: Recursively split the data based on the selected features until a
stopping criterion is met.
4. Pruning (Optional): Reduce the size of the tree to prevent overfitting. This can be
done by removing branches that do not significantly improve performance.
5. Evaluation: Evaluate the performance of the decision tree using metrics like
accuracy, precision, recall, and F1-score.
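Applying the steps above, here is a basic decision tree sketch on the iris dataset; the
dataset, the Gini criterion, and max_depth = 3 are illustrative assumptions:

```python
# Sketch: applying the five steps to build a basic decision tree classifier.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

# Step 1: data preparation (iris is already clean and numeric)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Steps 2-4: feature selection and recursive splitting are handled internally using
# Gini impurity; max_depth=3 acts as a simple pre-pruning rule against overfitting.
tree = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=42)
tree.fit(X_train, y_train)

# Step 5: evaluation with accuracy, precision, recall, and F1-score
print(classification_report(y_test, tree.predict(X_test)))
```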
6Q. How do you use a decision tree algorithm and appropriate metrics to solve a
business problem and assess the quality of the solution?
Uses of Decision Tree Algorithms in Business
Decision trees find numerous applications in various business domains:
● Customer Churn Prediction:
○ Identify customers likely to discontinue their service or subscription.
○ Allows proactive measures to retain valuable customers.
● Marketing Campaign Targeting:
○ Segment customers into groups with similar characteristics.
○ Tailor marketing campaigns to specific customer segments for better ROI.
● Fraud Detection:
○ Detect fraudulent transactions in credit card usage, insurance claims, or
online activities.
○ Minimize financial losses and improve security.
● Risk Assessment:
○ Assess credit risk for loan applications.
○ Evaluate investment risks in financial markets.
● Product Recommendation:
○ Recommend products or services to customers based on their purchase
history and preferences.
● Customer Segmentation:
○ Divide customers into distinct groups based on demographics, behavior, and
other relevant factors.
● Supply Chain Optimization:
○ Optimize inventory management and logistics by predicting demand and
identifying potential disruptions.
● Decision Support Systems:
○ Assist in making informed decisions in various business areas, such as
operations, finance, and human resources.
Appropriate Metrics to Assess the Quality of the Solution
● Accuracy:
○ The proportion of correctly classified instances.
○ A general measure of model performance, but can be misleading in
imbalanced datasets.
● Precision:
○ The proportion of true positive predictions among all positive predictions.
○ Measures the model's ability to avoid false positives.
● Recall (Sensitivity):
○ The proportion of true positive predictions among all actual positive instances.
○ Measures the model's ability to identify all positive cases.
● F1-score:
○ The harmonic mean of precision and recall.
○ Provides a balanced measure of both precision and recall.
● AUC (Area Under the ROC Curve):
○ Measures the model's ability to distinguish between classes across different
thresholds.
○ A higher AUC indicates better performance.
● Confusion Matrix:
○ Provides a detailed breakdown of true positives, true negatives, false
positives, and false negatives.
○ Helps visualize and understand the model's performance in more detail.
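These metrics can be computed directly with scikit-learn; the sketch below uses invented
labels, predictions, and scores purely for illustration:

```python
# Sketch: computing the metrics above from true labels and model outputs (toy arrays).
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)

y_true  = [1, 0, 1, 1, 0, 0, 1, 0]                    # actual classes
y_pred  = [1, 0, 0, 1, 0, 1, 1, 0]                    # predicted classes
y_score = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]    # predicted probabilities for class 1

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))
print("AUC      :", roc_auc_score(y_true, y_score))   # uses scores, not hard labels
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
```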
Choosing the Right Metrics
The choice of metrics depends on the specific business problem and the desired outcomes.
For example:
● In fraud detection, high precision is crucial to minimize false alarms and avoid
unnecessary investigations.
● In medical diagnosis, high recall is important to ensure that all cases of the disease
are identified.
CHAPTER 3
1Q. What is clustering? How do you determine the core aspects and types of clustering in
order to properly apply the algorithms to business problems?
What is Clustering?
Clustering is an unsupervised machine learning technique that groups similar data points
together based on their inherent characteristics. Unlike supervised learning (like
classification), where data is labeled, clustering aims to discover underlying patterns and
structures within unlabeled data.
Core Aspects of Clustering:
● Similarity Measure:
○ How do you define "similarity" between data points?
○ Common measures include Euclidean distance, Manhattan distance, cosine
similarity, and correlation.
● Number of Clusters:
○ How many clusters should the data be divided into?
○ Determining the optimal number of clusters can be challenging and often
involves techniques like the elbow method or silhouette analysis.
● Cluster Shapes:
○ Some algorithms assume clusters have specific shapes (e.g., spherical in K-
means).
○ Choosing the right algorithm depends on the expected shape of clusters in
your data.
● Noise and Outliers:
○ How do you handle data points that don't clearly belong to any cluster (noise)
or are far from any cluster center (outliers)?
Types of Clustering Algorithms
1. Partitioning:
○ K-means: Partitions data into K clusters by minimizing the within-cluster sum
of squares.
○ K-medoids: Similar to K-means, but uses data points as cluster centers
instead of the mean.
2. Hierarchical:
○ Agglomerative: Starts with each data point as a separate cluster and
iteratively merges the closest pairs of clusters.
○ Divisive: Starts with all data points in one cluster and iteratively splits the
cluster into smaller ones.
3. Density-Based:
○ DBSCAN (Density-Based Spatial Clustering of Applications with Noise):
Identifies clusters based on the density of data points in a region.
○ OPTICS (Ordering Points To Identify the Clustering Structure): Similar to
DBSCAN, but creates an ordered representation of the data that can be used
to find clusters at different density thresholds.
4. Distribution-Based:
○ Gaussian Mixture Models (GMM): Assumes that data points are generated
from a mixture of Gaussian distributions.
Applying Clustering to Business Problems
1. Customer Segmentation: Group customers based on demographics, purchase
history, and behavior to tailor marketing campaigns.
2. Image Segmentation: Group pixels in an image based on color, texture, or other
visual features.
3. Anomaly Detection: Identify unusual patterns or outliers in data, such as fraudulent
transactions or network intrusions.
4. Recommendation Systems: Group users with similar preferences to provide
personalized recommendations.
5. Document Clustering: Group similar documents (e.g., news articles, research
papers) together for better organization and information retrieval.
Key Considerations:
● Data Preprocessing:
○ Clean the data (handle missing values, outliers).
○ Scale or normalize features to ensure all features contribute equally to the
distance calculations.
● Choosing the Right Algorithm:
○ Consider the shape of the clusters, the number of clusters, and the size of the
dataset.
● Evaluating Clustering Results:
○ Use metrics like silhouette score, Davies-Bouldin index, and visual inspection
to assess the quality of the clustering.
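A short sketch putting these considerations together with K-means and the silhouette
score; the synthetic blob data and k = 4 are illustrative:

```python
# Sketch: K-means with scaling and a silhouette-score check (data and k are illustrative).
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
X = StandardScaler().fit_transform(X)   # equalize feature contributions to distances

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
print("Silhouette score:", silhouette_score(X, labels))  # closer to 1 = better separation
```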
2Q. Explain how to apply various clustering algorithms to data sets in order to solve
common business problems.
Different clustering algorithms can be applied to solve common business problems as follows:
1. Customer Segmentation
● Problem: Divide customers into distinct groups with similar characteristics to tailor
marketing campaigns, offer personalized experiences, and improve customer
satisfaction.
● Algorithms:
○ K-means: Effective for grouping customers based on demographics (age,
income, location), purchase history (frequency, recency, monetary value), and
browsing behavior.
○ Hierarchical Clustering: Can reveal hierarchical relationships between
customer segments, such as identifying broad customer groups and then
further subdividing them into more specific segments.
○ DBSCAN: Useful for identifying clusters of customers with similar purchasing
patterns, even if those clusters have irregular shapes.
2. Product Recommendation
● Problem: Recommend relevant products or services to customers based on their
preferences and past behavior.
● Algorithms:
○ K-means: Group customers with similar purchase histories into clusters.
Recommend products frequently purchased by other customers in the same
cluster.
○ Collaborative Filtering: While not strictly clustering, it leverages similarities
between users and items to make recommendations.
3. Anomaly Detection
● Problem: Identify unusual or suspicious activities, such as fraudulent transactions,
network intrusions, or equipment malfunctions.
● Algorithms:
○ DBSCAN: Can effectively identify outliers (anomalies) as data points that lie
in low-density regions.
○ Isolation Forest: An anomaly detection algorithm that isolates anomalies by
randomly selecting features and partitioning the data.
4. Image Segmentation
● Problem: Divide an image into distinct regions or objects.
● Algorithms:
○ K-means: Can be used to segment images based on pixel color or other
visual features.
○ Mean Shift: A non-parametric clustering algorithm that can effectively
segment images with complex shapes and varying densities.
5. Text Document Clustering
● Problem: Group similar documents (e.g., news articles, research papers) together
for better organization, information retrieval, and topic discovery.
● Algorithms:
○ K-means: Can be used to cluster documents based on their word
frequencies or other textual features.
○ Hierarchical Clustering: Can reveal hierarchical relationships between
documents, such as identifying broad topics and then subtopics.
Key Considerations When Applying Clustering Algorithms:
● Data Preprocessing:
○ Clean and prepare the data (handle missing values, outliers, etc.)
○ Scale or normalize features to ensure all features contribute equally to the
distance calculations.
● Choosing the Right Algorithm:
○ Consider the shape of the clusters, the number of clusters, and the size of the
dataset.
● Determining the Number of Clusters:
○ Use techniques like the elbow method (see the sketch after this list),
silhouette analysis, or domain knowledge to determine the optimal number of clusters.
● Evaluating Clustering Results:
○ Use appropriate metrics (e.g., silhouette score, Davies-Bouldin index) to
assess the quality of the clustering.
● Interpretation and Visualization:
○ Visualize the clusters using techniques like scatter plots, dendrograms, or t-
SNE to gain insights and communicate the results effectively.
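The elbow-method sketch referenced above, assuming a scaled feature matrix X is already
loaded; the range of candidate k values is illustrative:

```python
# Sketch: the elbow method for choosing the number of clusters (assumes X exists).
from sklearn.cluster import KMeans

inertias = []
for k in range(1, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)      # within-cluster sum of squares for this k

# Inspect k vs. inertia and look for the 'elbow' where the decrease levels off.
for k, inertia in zip(range(1, 11), inertias):
    print(k, round(inertia, 1))
```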
CHAPTER 4
OPTIMIZATION
1Q. What is optimization? Explain the goals and constraints of linear optimization.
Optimization is a mathematical process of finding the best possible solution to a problem
given certain constraints. It involves identifying the values of decision variables that either
maximize or minimize an objective function while adhering to a set of limitations.
Goals of Linear Optimization:
● Maximization:
○ Increase profit: Determine production levels to maximize profit given resource
constraints (labor, materials, etc.).
○ Maximize revenue: Find the pricing strategy that generates the highest
revenue for a given product or service.
○ Maximize market share: Develop marketing strategies to reach the largest
possible customer base.
● Minimization:
○ Minimize costs: Reduce production costs by optimizing resource allocation
and minimizing waste.
○ Minimize risk: Minimize investment risk in financial portfolios.
○ Minimize travel time: Find the shortest or most efficient routes for
transportation and logistics.
Constraints of Linear Optimization:
● Resource Constraints: Limitations on available resources such as raw materials,
labor, machinery, and budget.
● Demand Constraints: Limitations on the demand for products or services.
● Capacity Constraints: Limitations on production capacity or storage space.
● Time Constraints: Limitations on the time available for production, delivery, or other
activities.
● Regulatory Constraints: Legal or regulatory requirements that must be met.
● Non-negativity Constraints: Restrictions that ensure decision variables cannot take
negative values.
Key Characteristics of Linear Optimization:
● Linearity: The objective function and all constraints must be linear functions of the
decision variables. This means that the relationship between variables is
proportional.
● Deterministic: Assumes that all parameters and coefficients are known with
certainty.
● Static: Assumes that the problem conditions remain constant over the decision-
making period.
Applications of Linear Optimization:
● Business: Production planning, portfolio optimization, transportation logistics, supply
chain management.
● Engineering: Structural design, network optimization, resource allocation.
● Finance: Portfolio optimization, risk management, investment planning.
● Operations Research: Scheduling, inventory control, project management.
Linear optimization provides a powerful framework for making optimal decisions in a wide
range of applications where the objective and constraints can be expressed as linear
functions. By carefully defining the objective, identifying relevant constraints, and applying
appropriate optimization techniques, businesses and organizations can make informed
decisions that lead to improved efficiency, profitability, and overall performance.
2Q. How do you formulate and calculate a linear optimization in order to solve a business problem?
1. Define the Problem
● Identify Decision Variables: Determine the key factors that can be controlled or
adjusted to achieve the desired outcome. These become decision variables.
● Formulate the Objective Function:
○ Express the goal of the optimization problem as a mathematical equation.
■ Maximization: For example, "Maximize profit = (price per unit of
product A * number of units of product A) + (price per unit of product B
* number of units of product B)"
■ Minimization: For example, "Minimize cost = (cost per unit of
resource 1 * amount of resource 1) + (cost per unit of resource 2 *
amount of resource 2)"
● Identify Constraints: Determine the limitations or restrictions that must be
considered. These are expressed as inequalities or equations.
○ Resource Constraints: "Amount of each resource used ≤ amount of that
resource available"
○ Demand Constraints: "Number of units of product A produced ≥
minimum demand for product A"
○ Capacity Constraints: "Production capacity of machine X ≤ maximum
production capacity of machine X"
2. Choose a Solution Method
● Graphical Method: Suitable for problems with two decision variables. Visualize the
constraints as lines on a graph and identify the feasible region (the area that satisfies
all constraints). The optimal solution lies at a corner point of this region.
● Simplex Method: An iterative algorithm for solving linear programming problems
with more than two decision variables. It systematically explores the feasible region
to find the optimal solution.
3. Solve the Problem
● Apply the chosen method: Follow the steps of the chosen method to determine the
values of the decision variables that optimize the objective function while satisfying
all constraints.
● Interpret the Solution: Analyze the results and determine the optimal course of
action.
Example: Production Planning
A company produces two products, A and B.
● Decision Variables:
○ x: Number of units of product A to produce
○ y: Number of units of product B to produce
● Objective Function:
○ Maximize Profit: P = 10x + 15y (assuming profit per unit of A is $10 and B is
$15)
● Constraints:
○ Resource 1: 2x + y ≤ 100 (resource 1 constraint)
○ Resource 2: x + 3y ≤ 120 (resource 2 constraint)
○ Non-negativity: x ≥ 0, y ≥ 0
Solution:
1. Graph the constraints: Plot the lines representing the constraints on a graph.
2. Identify the feasible region: The region that satisfies all constraints.
3. Find the corner points: Determine the coordinates of the vertices of the feasible
region.
4. Evaluate the objective function: Calculate the profit at each corner point.
5. Select the optimal solution: The corner point with the highest profit is the optimal
solution.
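The same example can be solved programmatically; here is a sketch with
scipy.optimize.linprog (linprog minimizes, so the profit coefficients are negated):

```python
# Sketch: solving the production-planning example with scipy.optimize.linprog.
from scipy.optimize import linprog

c = [-10, -15]                       # negated objective: maximize P = 10x + 15y
A_ub = [[2, 1], [1, 3]]              # 2x + y <= 100,  x + 3y <= 120
b_ub = [100, 120]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
x, y = res.x
print(f"Produce x = {x:.0f} units of A and y = {y:.0f} units of B")
print(f"Maximum profit P = {-res.fun:.0f}")
```

For this example the optimal corner point is x = 36, y = 28 with profit P = 780
(check: 2(36) + 28 = 100 and 36 + 3(28) = 120), which the graphical method confirms.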