Python Notes


Features of Python:

1. Python is interpreted: the code is executed line by line and processed at run time by the interpreter.
2. Python is interactive: we can run the Python prompt and interact with the interpreter directly.
3. Python is object oriented: in object-oriented programming we design our functions keeping objects in mind, which reduces code complexity and makes the code easier to write.
4. Python is a beginner's language: Python is known for its simplicity and ease of use.
Variables in Python: Variables are used to store values. A variable is dynamic in nature, which means it can store a number and then a string, one after another.
String in Python
A string is a sequence of characters that can be used to express any meaning in any specific language.
e.g.
str1 = "we are learning python"
print(str1)
Slicing
str1 = "Python Programming"

Split and Join


Split: split() is used to split a sentence into words and returns the list of words.
syntax: str1 = "we are living, in india"
l = str1.split(' ')

Join: join() joins the words in a list according to the joining string.
syntax: " ".join(l)

List:
A list is a data structure in Python that can store homogeneous and heterogeneous data. Lists provide a general mechanism to store objects, each indexed by a number. The elements of a list are arbitrary: they can be numbers, strings, functions, or user-defined objects.

Repetition:
The asterisk * is overloaded for lists to serve as the repetition operator. The result of applying the repetition operator to a list is a single list with the elements of the original list repeated as many times as we specify.
>>> l = [1,2,3]
>>> l*2
[1, 2, 3, 1, 2, 3]

Dictionaries in Python
A dictionary in Python is an unordered data structure; more generally it is also called an associative array. A dictionary consists of a collection of key-value pairs. Each key-value pair maps the key to its associated value.
syntax of dictionary:
we can define a dictionary by enclosing comma-separated key-value pairs: d = {k1: v1, k2: v2, k3: v3, ...}
dict_roll_name = {'CS101':'Yogesh', 'CS102':'Raghav', 'CS141':'Sohan'}

>>> type(dict_roll_name)
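A short sketch of common dictionary operations on this example:

dict_roll_name = {'CS101': 'Yogesh', 'CS102': 'Raghav', 'CS141': 'Sohan'}
print(dict_roll_name['CS101'])        # 'Yogesh': look a value up by its key
dict_roll_name['CS150'] = 'Meena'     # add a new key-value pair
for roll, name in dict_roll_name.items():
    print(roll, name)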

Loops:
A while loop is used to execute a block of code repeatedly until a given condition goes false. When the condition goes false, the line after the loop gets executed.
syntax:
while expression:
    statements

for i in range(n):
    statements
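For example, a minimal sketch printing 1 to 5 both ways:

# while loop: the counter must be updated or the condition never goes false
i = 1
while i <= 5:
    print(i)
    i += 1

# the same with a for loop over range
for i in range(1, 6):
    print(i)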

# Nested Looping
In nested looping we have more than one loop; say we have two loops: the first one is called the outer loop and the second one is called the inner loop.

for i in range(3):
    for j in range(2):
        print(i, j)
Break, Continue and Pass
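break exits the loop, continue skips the rest of the current iteration, and pass is a do-nothing placeholder. A minimal sketch:

for i in range(10):
    if i == 3:
        continue    # skip 3 and move on to the next iteration
    if i == 6:
        break       # leave the loop entirely when i reaches 6
    if i == 0:
        pass        # placeholder: does nothing at all
    print(i)        # prints 0, 1, 2, 4, 5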

Quiz:
You have to build a guess-and-win game, where the owner chooses a random number less than or equal to 12 and then gives the player 3 turns to guess it. Also tell him how many of his turns are remaining, and prompt him whether his guess is higher or lower than the jackpot number, as this will help him in guessing.
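One possible solution sketch (prompts and messages are illustrative):

import random

jackpot = random.randint(1, 12)   # the owner chooses a number <= 12
turns = 3
while turns > 0:
    guess = int(input("Guess the number (1-12): "))
    if guess == jackpot:
        print("You win!")
        break
    turns -= 1
    hint = "higher" if guess > jackpot else "lower"
    print("Your guess is", hint, "than the jackpot;", turns, "turn(s) remaining")
else:
    print("Out of turns! The jackpot was", jackpot)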

Functions in Python:
A function is a reusable component which we can use again and again just by changing the values, as in the sketch below.
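A minimal sketch:

def area_of_rectangle(l, b):
    """Return the area of a rectangle of length l and breadth b."""
    return l * b

print(area_of_rectangle(14, 6))   # 84
print(area_of_rectangle(10, 3))   # reused with different values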
Anonymous Functions
The syntax of an anonymous function is: lambda [args]: expression

A lambda function can have only one expression, but it can have any number of arguments.
# lambda function for the calculation of area of a rectangle
rectangle = lambda l,b : l*b
area = rectangle(14,6)

reduce function
The reduce function also works with a lambda function, but it operates in a different way: it takes two arguments, performs the operation given by the lambda function, and stores the result. On the next turn reduce again takes two arguments, but this time one of them is the previous result and the other is the next element.

[1,2,3,4]==>[3,3,4]==>[6,4]==>[10]

[1,2,3,5]==>[2,3,5]==>[6,5]==>[30]
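A sketch using functools.reduce that reproduces the two folds above (the first adds, the second multiplies):

from functools import reduce

print(reduce(lambda a, b: a + b, [1, 2, 3, 4]))   # 10: ((1+2)+3)+4
print(reduce(lambda a, b: a * b, [1, 2, 3, 5]))   # 30: ((1*2)*3)*5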

Nested Functions
A nested function is basically a function defined inside another function.
syntax:
def outer_fun():
    def inner_fun():
        return "i am from inner function"
    inner_fun()   # called here, though its return value is discarded
    return "i am from outer function"
outer_fun()

Exception Handling in Python

Errors in Python are of two types: 1. syntax errors, 2. exceptions.
Common exception types:
1. SyntaxError
2. TypeError
3. NameError
4. IndexError
5. KeyError
6. ValueError ==> a function is called with an invalid argument
7. AttributeError ==> this exception occurs when an attribute or method is not found
8. IOError ==> this exception occurs when we don't find the file we are looking for
9. ZeroDivisionError ==> occurs when you divide a number by zero
10. ImportError ==> this exception occurs when an imported library is not found and we have to install it

Try, except and finally blocks

syntax:
try:
    # statement
except IndexError:
    # statement
except ValueError:
    # statement
finally:
    # final_statement
Note: the finally block's statements will always execute whether the exception is handled or not.

# syntax:
try:
    # some code
except:
    # optional code
    # handling of exception
else:
    # the else block will execute if there is no exception
finally:
    # the code of this block will always get executed

Q. Write a python program that prompts the user to input an integer and raises a
ValueError exception if the input is not a valid integer.
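One possible solution sketch: int() itself raises ValueError on invalid input, so we catch it and re-raise with a clearer message:

try:
    value = int(input("Enter an integer: "))
    print("You entered:", value)
except ValueError:
    raise ValueError("The input is not a valid integer")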

Q. Write a program which inputs two values and raises a TypeError when we try to add a string to an integer.
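One possible solution sketch (variable names are illustrative):

a = input("Enter a string: ")          # input() always returns a string
b = int(input("Enter an integer: "))
result = a + b                         # adding str and int raises TypeError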

Object Oriented Programming using Python

Class: Python ==> attributes (which define the properties of the object)
Objects: students
A class uses the self keyword to refer to its instance variables.
# The constructor (__init__) is the function which always gets executed when the class is instantiated
class Python:
    def __init__(self, id, name, dob):
        self.id = id
        self.name = name
        self.dob = dob

# instance creation / object creation
s1 = Python("ID0125", "Kiran", '1995-05-12')
print(s1.name)
print(s1.id)
print(s1.dob)

Inheritance in Python
The process of inheriting the properties of a parent class into a child class is called inheritance.
The existing class is called the Base Class or Parent Class, and the new class is called the Derived Class or Child Class.

class BaseClass:
    body of the base class
class DerivedClass(BaseClass):
    body of the derived class

Types of Inheritance:
1. Single Inheritance: in single inheritance a child class inherits from a single parent class. There is one child class and one parent class.

2. Multiple Inheritance: in multiple inheritance one child class can inherit from multiple parent classes, so we have one child class and multiple parent classes. The child class can access the member functions of its parent classes, apart from its regular child-class functions (see the sketch after this list).

3. Multilevel Inheritance: in multilevel inheritance, a class inherits from a child class or derived class. Suppose three classes A, B, C: A is the superclass, B is the child of A, and C is the child of B. In other words, a chain of classes is called multilevel inheritance.

4. Hierarchical Inheritance: in hierarchical inheritance, more than one child class is derived from a single parent class. In other words, one parent class and multiple child classes.

5. Hybrid Inheritance: when the inheritance consists of different combinations of inheritance it is called hybrid inheritance; normally hybrid inheritance is the combination of multiple inheritance and hierarchical inheritance.
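A minimal sketch of multiple inheritance (the class names are illustrative):

class Father:                      # parent class 1
    def skill(self):
        return "gardening"

class Mother:                      # parent class 2
    def hobby(self):
        return "painting"

class Child(Father, Mother):       # multiple inheritance: two parent classes
    def own(self):
        return "coding"

c = Child()
print(c.skill(), c.hobby(), c.own())   # the child can access members of both parents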

Polymorphism
Polymorphism means a function with the same name but with different forms, and it is a very important concept in OOP. For example, len() works on strings, lists and dictionaries:
len("programming")
len(['this', 'is', 'an', 'era', 'of', 'python', 'programming'])
len({'Name':'John', 'Address':'Atlanta'})

File Handling
syntax to open a file:
f = open(filename, mode)
r : read mode
w : write mode
rt : read text mode
wt : write text mode
a : append mode
w+ : to read and write data
a+ : to append and read data from the file

Whenever you have to read a file you have to open it, and after reading the file always close it.

When we try to open a file using 'w' or 'w+' mode, if the file does not exist then it gets created, but if the file already exists then its contents are deleted and a new file of the same name is created.

Using the with statement to open files:

with open(file_name, mode) as f:
    data = f.readlines()
    for line in data:
        words = line.split()
        print(words)

NumPy
NumPy stands for Numerical Python; it is the fundamental package for high-performance computing and data analysis. NumPy provides the ndarray, a fast and space-efficient multidimensional array providing vectorized arithmetic operations and sophisticated broadcasting capabilities.

The N-dimensional array object, or ndarray, is a fast, flexible container for large data sets in Python. Arrays enable you to perform mathematical operations on whole blocks of data using a syntax similar to scalar operations.
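A short sketch of an ndarray with vectorized arithmetic and broadcasting:

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # a 2-D ndarray
print(a.shape)                          # (2, 3)
print(a * 2)                            # vectorized arithmetic on the whole block
print(a + np.array([10, 20, 30]))       # broadcasting a row across both rows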

Regular Expressions
[] ==> a set of characters, e.g. [a-m] matches any single character from a to m (such as c, d, f or m)
. ==> any single character except the newline character '\n', e.g. it would match any of [a-zA-Z0-9?@#&]
^ ==> starts with; "^hello" matches "hello good morning" but not "good morning"
* ==> zero or more occurrences
+ ==> one or more occurrences
[^abc] ==> matches any character EXCEPT a, b or c
[0-9] ==> matches a single digit from 0 to 9
[0-6][0-8] ==> matches a two-digit number such as 00, 68 or 49 (but not 79, since 7 is outside 0-6)

Pattern exercises (each pattern with a test string):
a[a-zA-Z0-9]t$ ==> wearegoingoworkt
^a[a-zA-Z0-9]t$ ==> allareworkingfinet
^a[a-zA-Z0-9]*[0-9]t ==> aSq3456tgHJK8tjhgjhgj898
a[a-zA-Z0-9]*[0-9]t ==> fgjeogjoee456836536feoifjoa12233thhff67
^a[a-zA-Z0-9]*[0-9]t$ ==> aSq3456tgHJK8t
a[a-zA-Z0-9]*[0-9]t$ ==> fgjeogjoee456836536feoifjoa12233t, ggf5785hg4arfre45t
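These patterns can be checked with Python's re module; a short sketch against a few of the strings above:

import re

print(bool(re.search(r'^a[a-zA-Z0-9]*[0-9]t$', 'aSq3456tgHJK8t')))                         # True
print(bool(re.search(r'a[a-zA-Z0-9]*[0-9]t', 'fgjeogjoee456836536feoifjoa12233thhff67')))  # True
print(re.findall(r'[0-6][0-8]', 'a03 b68 c79'))                                            # ['03', '68']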
Web Scraping:
Web scraping is an automatic method to obtain large amounts of data from a particular website. The data in HTML is normally unstructured; we convert it into structured form and feed it into a table. There are different ways to perform web scraping, but here we have used the Beautiful Soup package.
Parsers: parsers are used to parse unstructured data; we use two types of parsers: 1. html5lib, 2. the lxml parser.
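A minimal Beautiful Soup sketch (the URL is a placeholder; the requests, bs4 and lxml packages are assumed to be installed):

import requests
from bs4 import BeautifulSoup

url = "https://example.com"           # placeholder URL
html = requests.get(url).text         # raw, unstructured HTML
soup = BeautifulSoup(html, "lxml")    # parse with the lxml parser
for link in soup.find_all('a'):       # pull every anchor tag into structured form
    print(link.get('href'), link.text)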

Series and Pandas

A Series is a one-dimensional array-like object containing an array of data and an associated array of data labels, called its index. The simplest Series is formed from only an array of data.

DataFrame
A DataFrame represents a tabular, spreadsheet-like data structure containing an ordered collection of columns, each of which can have a different value type. The DataFrame has both a row and a column index.
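A short sketch of both structures (the values are illustrative):

import pandas as pd

s = pd.Series([4, 7, -5, 3], index=['a', 'b', 'c', 'd'])   # data plus index labels
print(s['b'])                                               # 7

df = pd.DataFrame({'name': ['Yogesh', 'Raghav'],
                   'marks': [82, 91]})                      # columns of different types
print(df)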

loc and iloc

loc and iloc are two very important indexers for searching and slicing data in pandas DataFrames. loc does label-based (and condition-based) selection, whereas iloc does integer-position-based selection.
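A short sketch (the DataFrame and its labels are illustrative):

import pandas as pd

df = pd.DataFrame({'age': [25, 32, 28], 'salary': [10000, 4000, 2000]},
                  index=['s1', 's2', 's3'])
print(df.loc['s2'])          # label-based selection of a row
print(df.iloc[0:2])          # position-based slice of the first two rows
print(df.loc[df.age > 26])   # boolean (condition-based) selection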

Data Visualization
Data visualization is the discipline of trying to understand data by placing it in a visual context to find the patterns, trends and correlations that might not be detected without visual representations.
Python offers multiple great graphing libraries packed with features, depending on how customized a plot you want to make.
To get an overview, here are some popular plotting libraries:
1. Matplotlib: low level and lightweight, provides a lot of freedom
2. Pandas Visualization: easy-to-use interface built on Matplotlib
3. Seaborn: high-level interface used to build exotic graphs
4. Plotly: used to create interactive plots

Types of graphs
1. Scatter plot: this plot is used to create a scatter representation of the data, and it uses the scatter method.
2. Line plot: in Matplotlib we can create a line chart by calling the plot method. We can also plot multiple columns in one graph by looping through the columns we want and plotting each column.
3. Histogram: histograms are basically used to visualize a frequency distribution. In Matplotlib we can create a histogram using the hist method. If we pass categorical data, the histogram is made automatically.
4. Bar plot: this plot needs the frequency list separately. A sketch of all four types follows.
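A sketch of the four plot types on toy data (each shown on its own figure):

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 5, 3]

plt.scatter(x, y)                      # scatter plot
plt.show()
plt.plot(x, y)                         # line plot
plt.show()
plt.hist([1, 1, 2, 3, 3, 3, 4])        # histogram of a frequency distribution
plt.show()
plt.bar(['a', 'b', 'c'], [5, 2, 7])    # bar plot takes the frequencies separately
plt.show()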

Pandas Visualization

Seaborn
HeatMap: a heatmap is a graphical representation of data where the values contained in a matrix are represented as colors. Heat maps are perfect graphs for exploring the correlation of features in a dataset.
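A minimal heatmap sketch on an illustrative DataFrame:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame({'age': [25, 32, 28, 36],
                   'salary': [10, 4, 2, 5],
                   'nodes': [1, 4, 2, 8]})
sns.heatmap(df.corr(), annot=True, cmap='coolwarm')  # feature correlations as colors
plt.show()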

Box plots:
Box plots are very efficient plots for detecting outliers.

Machine Learning
Data: the data is normally in the form of .csv files, but we can also have data in Excel file format.
Machine Learning ==> machine learning is composed of two words, machine and learning, where machine indicates an automated process which is trained on a lot of data to make it learn to predict results.
Machine learning is an application of artificial intelligence that uses statistical techniques to enable computers to learn and make decisions without being explicitly programmed. In ML, computers can learn from data, spot patterns and make judgements with little assistance from humans.

Terminologies:
1. Model: the model is also known as the hypothesis; an ML model is basically a mathematical representation of a real-life problem.
2. Features: a feature is a measurable property, also known as a parameter, of the model; sometimes the features are also called components.
3. Vector/Feature Vector: a vector of multiple numeric features. We use these feature vectors as input to the machine learning models, e.g.:

x1 = [160, 80, 27, 3]
x2 = [165, 79, 38, 1]
x3 = [170, 55, 36, 3]
x4 = [162, 72, 42, 4]
x5 = [177, 70, 41, 3]
x6 = [168, 88, 35, 1]

4. Training: an algorithm takes a set of data known as training data as input. The learning algorithm finds patterns in the input data and trains the model for the expected results, known as the target. "The output of the training process is the machine learning model."

5. Prediction: once the machine learning model is ready after training, it can be fed with test data to predict the output.

6. Target (Label): the value that the machine learning model has to predict is called the target or label.

7. Overfitting: in this case the model performs well with the training data but fails when tested on the test data.
8. Underfitting: when we have too little data to train the model.

Error:

Let us say we have an equation y = 3x + 2

x = {1, 2, 3, 4}

y = {5, 8, 11, 14}

Suppose the trained model predicts y_pred = 2.6*x + 1.5

y_pred = {4.1, 6.7, 9.3, 11.9}

E = sum((y - y_pred)**2)/4 = ((5-4.1)**2 + (8-6.7)**2 + (11-9.3)**2 + (14-11.9)**2)/4 = 2.45

"The process of reducing the error is called learning."


Supervised Learning:

A supervised learning algorithm is a class of methods that uses a model to learn the mapping between the input variables and the target variable. Applications in which the training data describes the various input variables and the target variable are called supervised learning tasks.

Let the set of input variables be X and the target variable be y. A supervised learning algorithm tries to learn a model which can predict the target values correctly.

Unsupervised Learning:

In an unsupervised learning algorithm the model tries to learn by itself, recognizing patterns and extracting the relationships among the data.
In supervised learning we have a teacher, i.e. the target value, which is not there in unsupervised learning.
In unsupervised learning we use clustering. In clustering we group the data elements which are closer to each other, e.g. students with scores in physics, chemistry and maths:

S = [p, c, m]

s1 = [7, 6, 8]
s2 = [4, 7, 6]
s3 = [8, 9, 7]
s4 = [6, 5, 6]

D(s1,s2) = 2
D(s1,s3) = 1.66
D(s1,s4) = 1.33
D(s2,s3) = 2.33
D(s3,s4) = 2.33
D(s2,s4) = 1.33
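These distances are consistent with the mean absolute difference between the score vectors; a sketch under that assumption:

import numpy as np

s1, s2, s3, s4 = map(np.array, ([7, 6, 8], [4, 7, 6], [8, 9, 7], [6, 5, 6]))

def D(a, b):
    return np.mean(np.abs(a - b))   # mean absolute difference across subjects

print(D(s1, s2))   # 2.0
print(D(s2, s4))   # 1.33...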

Exploratory Data Analysis (EDA)

EDA is a process of discovering patterns, or sometimes describing the data, by means of statistical and visualization techniques, in an attempt to bring important information or aspects of the data into focus for further analysis. The raw data can be skewed, which is not a good sign because skewed data is unbalanced data. A model built on such data is not balanced and its performance is suboptimal.
Normally newcomers, in their eagerness to build ML models, often miss this important step.

General Statistical Analysis Results

1. The average age of the persons in whom cancer was detected is 52 years.
2. The average age of operation is 62 years; it means the person got operated on when he attained the age of 62 years.
3. As indicated by the 50th percentile, the median number of positive axillary nodes is 1.
4. We have verified from the normal curve that at the first std the axillary node count can be 11, at the second std it can be 18, and the third std says the axillary node count at maximum can be 25. As these stds cover 68%, 95% and 99.7% of the data, any value above 25 clearly indicates outliers.
5. Due to outliers we have a significant difference between the mean and the median; this clearly shows that the data is skewed.

# After class-wise analysis we have found that:

1. The average age at which a person is operated on is the same for both classes.
2. If the patient has approximately 3 axillary nodes then he has good chances of survival.
3. More than 4 axillary nodes indicate that the person is unlikely to survive.

Visualization Analysis
1. Univariate Analysis: univariate analysis, as the name suggests, is an analysis carried out by considering one variable at a time.
Let us do the univariate analysis of the features of the data one by one to determine the survival status correctly.
We try to find out which of the three features is most useful in determining survival status. The plot we are going to use is the distribution plot, with each feature on the x axis and normalized density on the y axis.

Findings of the first distribution plot, on the basis of age:

1. Most patients are in the age group of 40 to 60.
2. There is a high overlap between the class labels. This simply implies that the survival status of the patient cannot be distinguished on the basis of age.

Findings of the second distribution plot, on the basis of operation_year:

Just like the first plot on age, there is a huge overlap, so again we cannot draw a distinctive conclusion regarding survival on the basis of operation year.

Findings of the third distribution plot, on the basis of axillary_node:

This plot shows a significant difference between the curves of yes and no (yes: blue, no: orange).
Although this plot also has a good amount of overlap, we can still make some distinctive observations:
1. Among patients having 4 or fewer axillary nodes, a good majority survived.
2. Patients having 5 or more have much lower chances of survival.

The effectiveness of the axillary nodes feature is verified by two further plots:

1. CDF plot
2. Box plot

Bi-Variate Analysis

Normalization:
sal = [10000, 4000, 2000, 5000, 8000, 20000]
Age = [25, 32, 28, 36, 32, 35]

y = w1*Age + w2*sal + bias

Normalization is a process where we try to bring the data onto a common scale: into the range 0 to 1 in min-max normalization, or roughly -2 to +2 with the standard scaler.

std_scaler: z = (x - mu) / sigma   (sigma is the standard deviation)
min-max: x' = (x - min) / (max - min)
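A sketch of both scalings applied to the sal data with plain NumPy:

import numpy as np

sal = np.array([10000, 4000, 2000, 5000, 8000, 20000], dtype=float)

min_max = (sal - sal.min()) / (sal.max() - sal.min())   # every value lands in [0, 1]
std_scaled = (sal - sal.mean()) / sal.std()             # zero mean, unit variance

print(min_max)
print(std_scaled)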

Important Questions:
1. What is the dimension of the data X?
2. How many components are there in one vector of X? Keep in mind that each sample is a vector in ML.
3. Why do we do normalization?
4. What are the different types of normalization?
5. What is the effect of normalization?

Overfitting and Underfitting

Overfitting: overfitting is when the model, after training, does not perform well in testing. We can say that the model gives very good accuracy in training, but when it goes for validation testing it fails. It happens because, in an attempt to improve the accuracy of the model, we take too many parameters, which causes over-training on the features, resulting in a high-variance, low-bias condition.
Underfitting, in contrast, is the condition where we have too few parameters to train; with too few parameters the bias increases. So we have high variance and low bias in overfitting, and low variance and high bias in underfitting; collectively this concept is known as the bias-variance trade-off.

Cross Validation
If the data is not big then we use cross validation, where the data gets split into blocks and, cyclically, one block is given for testing while the remaining blocks are given for training; the process continues until all the blocks are exhausted for testing.
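A sketch with scikit-learn's KFold (the iris data set and logistic regression are only illustrative choices):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)
kf = KFold(n_splits=5, shuffle=True, random_state=1)   # 5 blocks, each tested once
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=kf)
print(scores, scores.mean())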

Stratified K-fold cross validation

Stratified k-fold cross validation is used when we have imbalanced data; data is imbalanced when we do not have equal or nearly equal amounts of data for the different classes, e.g.:
1. 500 ==> 'Yes'
2. 120 ==> 'No'
This data is imbalanced, so when we perform cross validation with a given number of splits, stratified k-fold makes sure that every block in every split has the same share of elements from both classes.
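A sketch with StratifiedKFold on the 500/120 example (X is a dummy feature that exists only so the data can be split):

import numpy as np
from sklearn.model_selection import StratifiedKFold

y = np.array(['Yes'] * 500 + ['No'] * 120)     # imbalanced labels
X = np.arange(len(y)).reshape(-1, 1)

skf = StratifiedKFold(n_splits=5)
for train_idx, test_idx in skf.split(X, y):
    # every test block keeps roughly the same Yes/No share as the full data
    print(np.unique(y[test_idx], return_counts=True))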

Regularization
Regularization is used to avoid overfitting. When the model predicts the wrong class or regresses the wrong values, we say that the model is giving us loss. The loss is the error, and the error grows further when the model gets overfitted.
Regularization is a mechanism where we add a penalty to the loss function which grows as the weights grow, so the model is discouraged from over-training.

Ridge Regularization (L2 penalty):
loss = (1/N) * sum((pred_y[i] - test_y[i])**2 for i in range(N)) + (lambda/2) * sum(w**2 for w in weights)

Lasso Regularization (L1 penalty):
loss = (1/N) * sum((pred_y[i] - test_y[i])**2 for i in range(N)) + lambda * sum(abs(w) for w in weights)
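A sketch of both penalties with scikit-learn (the synthetic data and the alpha value are illustrative):

from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

X, y = make_regression(n_samples=100, n_features=5, noise=10, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty shrinks the weights toward zero
lasso = Lasso(alpha=1.0).fit(X, y)   # L1 penalty can drive weights exactly to zero
print(ridge.coef_)
print(lasso.coef_)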

Classification
Classification is the technique of classifying the data into two different classes, if we are talking about binary classification. In binary classification we have the sigmoid function as the activation unit to classify the data into the two classes.

Q: TN: 50, FP: 10, FN: 5, TP: 100; calculate the precision, recall and accuracy of the model.
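A worked answer using the standard definitions:
precision = TP / (TP + FP) = 100/110 ≈ 0.909
recall = TP / (TP + FN) = 100/105 ≈ 0.952
accuracy = (TP + TN) / (TP + TN + FP + FN) = 150/165 ≈ 0.909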

Important interview questions

1. What is feature engineering?
2. What is the difference between data preprocessing and exploratory data analysis?
3. Explain the difference between the performance evaluation metrics for classification and regression models.
4. What is a sigmoid activation function?
5. What are non-linear data structures?
6. Give the difference between linear and non-linear data structures.
7. What is the difference between parametric and non-parametric models?

Y = w1x1 + w2x2 + b

Decision Tree: a decision tree is a non-parametric supervised learning algorithm for classification and regression tasks. The decision tree follows a hierarchical structure consisting of a root node, branches, internal nodes and leaf nodes.
root node: the topmost node of the tree
branches: used to connect the nodes
internal nodes: the internal nodes are those which have at least one child
leaf node: a node which has no child

Metrics for Decision Trees

Entropy is the measure of uncertainty: a measure of the disorder or impurity of the data.
Example: the root node has 5 No and 5 Yes.
1. Entropy(root) = -P_yes*log(P_yes) - P_no*log(P_no)
                 = -(5/10)*log(5/10) - (5/10)*log(5/10)
                 = -(0.5*log(0.5)) - (0.5*log(0.5))
                 = 0.69 (using the natural log); taking a decision on this data carries about 69% uncertainty
2. Entropy(weather), where the splits are sunny (2N, 1Y), cloudy (3Y), rainy (3N, 1Y):
entropy(weather, sunny) = -(1/3)*log(1/3) - (2/3)*log(2/3) = 0.636
entropy(weather, cloudy) = -(3/3)*log(3/3) = 0.0
entropy(weather, rainy) = -(3/4)*log(3/4) - (1/4)*log(1/4) = 0.5623
Entropy(weather) = 0.3*0.636 + 0.3*0.0 + 0.4*0.5623 = 0.4157 ==> the uncertainty has now been reduced to about 42%
Information Gain = 0.69 - 0.42 = 0.28
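A sketch that reproduces these entropy numbers with NumPy:

import numpy as np

def entropy(counts):
    p = np.array(counts) / sum(counts)
    p = p[p > 0]                       # log(0) is undefined, so drop empty classes
    return -np.sum(p * np.log(p))      # natural log, as in the notes

e_root = entropy([5, 5])                                                  # 0.693
e_weather = 0.3*entropy([2, 1]) + 0.3*entropy([3]) + 0.4*entropy([3, 1])  # 0.416
print(e_root - e_weather)                                                 # information gain ~ 0.28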

Q. As we know, the root node, which uses feature1, has 8 Yes and 4 No.
==> Entropy(root) = -(8/12)*np.log(8/12) - (4/12)*np.log(4/12) = 0.6365
The impurity for feature2, which has 5 Yes and 2 No ==> N1
==> Entropy(N1) = -(5/7)*np.log(5/7) - (2/7)*np.log(2/7) = 0.598
The impurity for feature3, which has 3 Yes and 2 No ==> N2
==> Entropy(N2) = -(3/5)*np.log(3/5) - (2/5)*np.log(2/5) = 0.673
I.G. = Entropy(root) - [7/12*Entropy(N1) + 5/12*Entropy(N2)]
     = 0.6365 - (0.583*0.598 + 0.417*0.673)
     = 0.007

Gini Impurity

Gini impurity is a measure used in decision tree algorithms to quantify a data set's impurity level or disorder.
It ranges from 0 to 0.5, where 0 indicates a perfectly pure node, meaning all instances belong to the same class, i.e. either Yes or No, and 0.5 indicates a perfectly impure node with maximum impurity.
The formula is: Gini Impurity = 1 - Gini.

Gini, also called the Gini index, is used to calculate the probability of correct classification; when we subtract it from 1, the expression gives us the impurity.
For example, we choose a feature which gives 3 Yes for node1 and 2 No for node2 separately:
G(node1) = 1 - (3/3)**2 = 1 - 1 = 0
G(node2) = 1 - (2/2)**2 = 1 - 1 = 0

G(N1) = 1 - [(5/7)**2 + (2/7)**2] = 0.41

G(N2) = 1 - [(3/5)**2 + (2/5)**2] = 0.48

Total Gini impurity = 7/12*0.41 + 5/12*0.48 = 0.58*0.41 + 0.42*0.48 = 0.44

This data is downloaded from Kaggle. It describes patient medical record data for Pima Indians and whether they had an onset of diabetes within five years.
onset: 1, no onset: 0
1. Number of times pregnant
2. Plasma glucose concentration
3. Blood pressure
4. Triceps skinfold thickness
5. Serum insulin
6. BMI
7. Diabetes pedigree function
8. Age

Improve the performance of models using ensemble methods

1. How to use bagging ensemble methods such as bagged decision trees, random forests etc.
2. How to use boosting ensemble methods such as AdaBoost and gradient boosting
3. How to use a voting classifier

1. Bagging: building multiple models, typically of the same type, from different subsamples of the training data set.
2. Boosting: building multiple models, typically of the same type, each of which learns to fix the prediction errors of a prior model in the sequence of models.
3. Voting: building multiple models, typically of different types, and using simple statistics like the mean of all outputs to predict the output, or majority voting for prediction in the case of classification.

Bagging is made of two important words: 1. Bootstrap, 2. Aggregation.

High bias and high variance models:
1. A high-bias model results from not learning well enough from the data. The model has only superficial knowledge from training, and hence future predictions will be unrelated and incorrect.
2. A high-variance model results from learning the data too well. It varies with each data point; hence it is impossible to predict the next point accurately.

Both of these conditions are bad for the health of the model, and we cannot generalize the model properly.

Bootstrapping: bootstrapping involves resampling the data with replacement from the original data set. These subsets of data are taken from the initial data set and are also called bootstrapped data sets.

Random Forest Classifier

The Random Forest classifier is an extension of bagging: samples of the training data are taken with replacement, but the trees are constructed in a way that reduces the correlation between the individual classifiers.
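A sketch with scikit-learn's RandomForestClassifier (the breast cancer data set is used only for illustration; max_features='sqrt' is what decorrelates the individual trees):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, max_features='sqrt', random_state=0)
print(cross_val_score(model, X, y, cv=5).mean())   # bagged, decorrelated trees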

In boosting ensemble algorithms, we create models which are essentially a sequence of models, each attempting to correct the mistakes of the models before it in the sequence. Once created, the models may be weighted by their accuracy, and the results are combined to create the output prediction.
The two main boosting algorithms are:
1. AdaBoost
2. Gradient Boosting

Q. What is hinge loss?

Hinge loss is used to measure the quality of classification done using support vector machines. For a label y in {-1, +1} and a raw model output f(x), the hinge loss is max(0, 1 - y*f(x)).

Voting Ensemble Method

Voting is the simplest but a very effective way of combining the predictions from multiple machine learning algorithms. It works by creating two or more standalone models from the training data; a voting classifier can then wrap all of your models and average the predictions of all the sub-models to give the final result.
So we are going to create a voting ensemble model for classification using the voting classifier.
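A sketch of a hard-voting ensemble with scikit-learn (the three sub-models and the data set are illustrative choices):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
voting = VotingClassifier(estimators=[
    ('lr', LogisticRegression(max_iter=2000)),
    ('dt', DecisionTreeClassifier()),
    ('svm', SVC())])                 # hard (majority) voting over the sub-models
print(cross_val_score(voting, X, y, cv=5).mean())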

1. Prepare the problem


a. Load Libraries
b. Load datasets

2. Summarize the data


a. Descriptive Statistics
b. Data Visualization

3. Prepare the data


a. Data Cleaning
b. Feature Selection
c. Data Transformation

4. Evaluate the algorithm


a. split out the test data and training data
b. Train the model
c. check the evaluation metrics
d. Compare it with other algorithms

5. Improve accuracy

a. Algorithm tuning
b. Ensembles

6. Finalize the model


a. Prediction on validation data set
b. Save model for later use

kNN: the kNN algorithm was the most accurate model that was tested. Now we want to get an idea of the accuracy of the model on our validation data set: the closest distance of a test sample to the samples of either class gives us an idea of which class the test sample belongs to.

In support vector machines we have two variants:

1. One vs One classifier
2. One vs Rest classifier

Ideally the SVM is designed for binary classification, but SVMs are used for multiclass classification via the above two variants.
One vs One classifier:
In OvO the number of classifiers equals n(n-1)/2, where n is the number of classes. For the classes [Red, Blue, Green, Yellow], n = 4, so we get 6 binary classifiers:
1. Binary classification: (Red, Blue) ==> r: 0.60, b: 0.40
2. Binary classification: (Red, Green) ==> r: 0.78, g: 0.22
3. Binary classification: (Red, Yellow) ==> r: 0.65, y: 0.35
4. Binary classification: (Blue, Yellow) ==> b: 0.22, y: 0.78
5. Binary classification: (Blue, Green) ==> b: 0.21, g: 0.79
6. Binary classification: (Green, Yellow) ==> g: 0.12, y: 0.88

OvR (One vs Rest)

1. Binary classification: red vs [blue, green, yellow]
2. Binary classification: blue vs [red, green, yellow]
3. Binary classification: green vs [red, blue, yellow]
4. Binary classification: yellow vs [red, blue, green]
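A sketch of both variants with scikit-learn's wrappers (the iris data set, with 3 classes, is used only for illustration):

from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)            # 3 classes ==> OvO builds 3*(3-1)/2 = 3 SVMs
ovo = OneVsOneClassifier(SVC()).fit(X, y)
ovr = OneVsRestClassifier(SVC()).fit(X, y)   # one SVM per class against the rest
print(ovo.predict(X[:3]), ovr.predict(X[:3]))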
