A09Ass06 - Jupyter Notebook
localhost:8888/notebooks/A09Ass06.ipynb 1/29
4/10/24, 10:57 PM A09Ass06 - Jupyter Notebook
In [3]: iris
Out[3]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
In [4]: iris.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Id 150 non-null int64
1 SepalLengthCm 150 non-null float64
2 SepalWidthCm 150 non-null float64
3 PetalLengthCm 150 non-null float64
4 PetalWidthCm 150 non-null float64
5 Species 150 non-null object
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB
In [5]: iris.isnull().sum()
Out[5]: Id 0
SepalLengthCm 0
SepalWidthCm 0
PetalLengthCm 0
PetalWidthCm 0
Species 0
dtype: int64
In [6]: iris.Species.value_counts()
Out[6]: Species
Iris-setosa 50
Iris-versicolor 50
Iris-virginica 50
Name: count, dtype: int64
In [7]: iris.tail(60)
Out[7]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
In [8]: iris.describe()
Out[8]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm
In [9]: iris.duplicated().sum()
Out[9]: 0
In [11]: plt.figure(figsize=(20,50))
sns.boxplot(iris)
In [12]: sns.boxplot(iris.SepalWidthCm)
In [13]: sns.distplot(iris.SepalWidthCm)
C:\Users\alisu\AppData\Local\Temp\ipykernel_9460\3103411925.py:1: UserWarning:
For a guide to updating your code to use the new functions, please see
https://gist.github.com/mwaskom/de44147ed2974457ad6372750bbe5751
  sns.distplot(iris.SepalWidthCm)
In [14]: # To deal with outliers we can use the z-score method, since the
# SepalWidthCm data is approximately normally distributed
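The cell that builds the `sepalwidth_zscore` column is not shown in this printout. One plausible way to compute it is sketched below, assuming the sample standard deviation that `Series.std()` uses by default. The `widths` values are hypothetical stand-ins, not the notebook's data.

```python
import pandas as pd

# Hypothetical stand-in for iris.SepalWidthCm (illustrative values only).
widths = pd.Series([3.5, 3.0, 3.2, 3.1, 3.6, 4.4, 2.9], name="SepalWidthCm")

# z = (value - mean) / standard deviation.
# Series.std() defaults to the sample standard deviation (ddof=1).
zscore = (widths - widths.mean()) / widths.std()

# Values more than 3 standard deviations from the mean count as outliers.
outliers = widths[zscore.abs() > 3]
```

With only seven values no point can reach |z| > 3, so `outliers` is empty here; on the full column the same expression selects the extreme rows.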
In [18]: iris["sepalwidth_zscore"]
Out[18]: 0 1.028611
1 -0.124540
2 0.336720
3 0.106090
4 1.259242
...
145 -0.124540
146 -1.277692
147 -0.124540
148 0.797981
149 -0.124540
Name: sepalwidth_zscore, Length: 150, dtype: float64
In [19]: iris
Out[19]:
      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm         Species  sepalwidt
0      1            5.1           3.5            1.4           0.2     Iris-setosa
1      2            4.9           3.0            1.4           0.2     Iris-setosa
2      3            4.7           3.2            1.3           0.2     Iris-setosa
3      4            4.6           3.1            1.5           0.2     Iris-setosa
4      5            5.0           3.6            1.4           0.2     Iris-setosa
...
145  146            6.7           3.0            5.2           2.3  Iris-virginica
146  147            6.3           2.5            5.0           1.9  Iris-virginica
147  148            6.5           3.0            5.2           2.0  Iris-virginica
148  149            6.2           3.4            5.4           2.3  Iris-virginica
149  150            5.9           3.0            5.1           1.8  Iris-virginica
In [20]: iris[iris.sepalwidth_zscore>3]
Out[20]:
    Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species  sepalwidth_
15  16            5.7           4.4            1.5           0.4  Iris-setosa           3.
In [22]: iris[iris.sepalwidth_zscore<-3]
Out[22]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species sepalwidth_zsc
In [23]: iris.sepalwidth_zscore.describe()
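The later `iris.info()` output reports 149 rows and `iris.head(20)` skips index 15, so the flagged row was removed in a cell that is not shown. A minimal sketch of such a filter follows, assuming a threshold of 3 on the absolute z-score. The frame `df` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical miniature frame; the z-scores are illustrative, not computed.
df = pd.DataFrame({
    "SepalWidthCm": [3.5, 3.0, 4.4, 3.1],
    "sepalwidth_zscore": [1.03, -0.12, 3.10, 0.11],
})

# Keep only rows whose z-score lies within [-3, 3].
# Boolean indexing preserves the original index labels, so the index has a gap.
kept = df[df["sepalwidth_zscore"].abs() <= 3]
```

Because the index labels are preserved, the filtered frame can end at label 149 while holding only 149 rows, which is why `info()` shows `Index` rather than `RangeIndex`.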
In [27]: iris.head(20)
Out[27]:
    Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species  sepalwidth_
0    1            5.1           3.5            1.4           0.2  Iris-setosa           1.
1    2            4.9           3.0            1.4           0.2  Iris-setosa          -0.
2    3            4.7           3.2            1.3           0.2  Iris-setosa           0.
3    4            4.6           3.1            1.5           0.2  Iris-setosa           0.
4    5            5.0           3.6            1.4           0.2  Iris-setosa           1.
5    6            5.4           3.9            1.7           0.4  Iris-setosa           1.
6    7            4.6           3.4            1.4           0.3  Iris-setosa           0.
7    8            5.0           3.4            1.5           0.2  Iris-setosa           0.
8    9            4.4           2.9            1.4           0.2  Iris-setosa          -0.
9   10            4.9           3.1            1.5           0.1  Iris-setosa           0.
10  11            5.4           3.7            1.5           0.2  Iris-setosa           1.
11  12            4.8           3.4            1.6           0.2  Iris-setosa           0.
12  13            4.8           3.0            1.4           0.1  Iris-setosa          -0.
13  14            4.3           3.0            1.1           0.1  Iris-setosa          -0.
14  15            5.8           4.0            1.2           0.2  Iris-setosa           2.
16  17            5.4           3.9            1.3           0.4  Iris-setosa           1.
17  18            5.1           3.5            1.4           0.3  Iris-setosa           1.
18  19            5.7           3.8            1.7           0.3  Iris-setosa           1.
19  20            5.1           3.8            1.5           0.3  Iris-setosa           1.
20  21            5.4           3.4            1.7           0.2  Iris-setosa           0.
In [28]: iris.info()
<class 'pandas.core.frame.DataFrame'>
Index: 149 entries, 0 to 149
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Id 149 non-null int64
1 SepalLengthCm 149 non-null float64
2 SepalWidthCm 149 non-null float64
3 PetalLengthCm 149 non-null float64
4 PetalWidthCm 149 non-null float64
5 Species 149 non-null object
6 sepalwidth_zscore 149 non-null float64
dtypes: float64(5), int64(1), object(1)
memory usage: 9.3+ KB
In [30]: iris.drop(["sepalwidth_zscore"],axis='columns')
Out[30]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
In [31]: iris
Out[31]:
      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm         Species  sepalwidt
0      1            5.1           3.5            1.4           0.2     Iris-setosa
1      2            4.9           3.0            1.4           0.2     Iris-setosa
2      3            4.7           3.2            1.3           0.2     Iris-setosa
3      4            4.6           3.1            1.5           0.2     Iris-setosa
4      5            5.0           3.6            1.4           0.2     Iris-setosa
...
145  146            6.7           3.0            5.2           2.3  Iris-virginica
146  147            6.3           2.5            5.0           1.9  Iris-virginica
147  148            6.5           3.0            5.2           2.0  Iris-virginica
148  149            6.2           3.4            5.4           2.3  Iris-virginica
149  150            5.9           3.0            5.1           1.8  Iris-virginica
In [32]: iris.drop(["sepalwidth_zscore"],axis='columns',inplace=True)
In [33]: iris
Out[33]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
In [36]: correlation_matrix
Out[36]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm
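The cells that build `correlation_matrix` and later remove the `Id` column are not shown. A minimal sketch of both steps follows, assuming the matrix comes from the numeric columns only. The frame `df` is a hypothetical four-row stand-in.

```python
import pandas as pd

# Hypothetical miniature frame with the same kinds of columns as iris.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "SepalLengthCm": [5.1, 4.9, 4.7, 4.6],
    "PetalLengthCm": [1.4, 1.4, 1.3, 1.5],
    "Species": ["Iris-setosa"] * 4,
})

# Pearson correlation over the numeric columns; the string-valued
# Species column is excluded first so corr() does not fail on it.
correlation_matrix = df.select_dtypes("number").corr()

# Id is a row label, not a measurement, so drop it before modelling.
features = df.drop(columns=["Id"])
```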
In [45]: iris
Out[45]:
SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
Step 4 :
In [47]: x
Out[47]:
SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm
In [49]: y
Out[49]:
Species
0 Iris-setosa
1 Iris-setosa
2 Iris-setosa
3 Iris-setosa
4 Iris-setosa
... ...
145 Iris-virginica
146 Iris-virginica
147 Iris-virginica
148 Iris-virginica
149 Iris-virginica
In [51]: xtrain
Out[51]:
SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm
In [52]: xtest
Out[52]:
SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm
In [53]: len(xtest)
Out[53]: 30
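The split cell itself is not shown. The sketch below uses `train_test_split` with `test_size=0.2`, which is an assumption: it is one setting that yields 30 test rows from 149, but other settings would too. The frame `x`, the labels `y`, and `random_state=0` are hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data with 149 rows, matching the filtered row count.
n = 149
x = pd.DataFrame({"a": range(n), "b": range(n)})
y = pd.Series(["c0", "c1", "c2"] * 49 + ["c0", "c1"], name="Species")

# A float test_size is rounded up: ceil(0.2 * 149) = 30 test rows,
# leaving 119 rows for training.
xtrain, xtest, ytrain, ytest = train_test_split(
    x, y, test_size=0.2, random_state=0
)
```

Passing `y` as a one-dimensional Series, rather than a one-column DataFrame, also keeps the targets in the shape scikit-learn estimators expect.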
In [54]: ytrain
Out[54]:
Species
92 Iris-versicolor
115 Iris-virginica
14 Iris-setosa
45 Iris-setosa
90 Iris-versicolor
... ...
76 Iris-versicolor
44 Iris-setosa
23 Iris-setosa
73 Iris-versicolor
16 Iris-setosa
In [55]: ytest
Out[55]:
Species
116 Iris-virginica
49 Iris-setosa
3 Iris-setosa
43 Iris-setosa
127 Iris-virginica
25 Iris-setosa
109 Iris-virginica
12 Iris-setosa
128 Iris-virginica
141 Iris-virginica
5 Iris-setosa
55 Iris-versicolor
129 Iris-virginica
36 Iris-setosa
131 Iris-virginica
83 Iris-versicolor
26 Iris-setosa
88 Iris-versicolor
126 Iris-virginica
144 Iris-virginica
79 Iris-versicolor
95 Iris-versicolor
60 Iris-versicolor
54 Iris-versicolor
2 Iris-setosa
42 Iris-setosa
66 Iris-versicolor
93 Iris-versicolor
24 Iris-setosa
46 Iris-setosa
In [56]: len(ytest)
Out[56]: 30
In [58]: model.fit(xtrain,ytrain)
C:\Users\alisu\anaconda3\Lib\site-packages\sklearn\utils\validation.py:1184: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
Out[58]: GaussianNB()
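The `DataConversionWarning` above is raised because `ytrain` is a one-column DataFrame. A minimal sketch of the fix the warning suggests follows. It uses scikit-learn's bundled `load_iris` data (150 rows, different column names) as a stand-in for the notebook's CSV, and `random_state=0` is an assumption.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Stand-in data: scikit-learn's bundled iris set, not the notebook's file.
data = load_iris(as_frame=True)
x = data.data
y = data.target.to_frame()  # one-column DataFrame, as in the notebook

xtrain, xtest, ytrain, ytest = train_test_split(
    x, y, test_size=0.2, random_state=0
)

model = GaussianNB()
# ravel() flattens the (n_samples, 1) column into shape (n_samples,),
# which is the shape fit() expects for the target.
model.fit(xtrain, ytrain.values.ravel())
ypredict = model.predict(xtest)
```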
In [59]: ypredict=model.predict(xtest)
In [60]: ypredict
In [61]: type(ypredict)
Out[61]: numpy.ndarray
In [62]: type(ytest)
Out[62]: pandas.core.frame.DataFrame
In [63]: ytest.values
Out[63]: array([['Iris-virginica'],
['Iris-setosa'],
['Iris-setosa'],
['Iris-setosa'],
['Iris-virginica'],
['Iris-setosa'],
['Iris-virginica'],
['Iris-setosa'],
['Iris-virginica'],
['Iris-virginica'],
['Iris-setosa'],
['Iris-versicolor'],
['Iris-virginica'],
['Iris-setosa'],
['Iris-virginica'],
['Iris-versicolor'],
['Iris-setosa'],
['Iris-versicolor'],
['Iris-virginica'],
['Iris-virginica'],
['Iris-versicolor'],
['Iris-versicolor'],
['Iris-versicolor'],
['Iris-versicolor'],
['Iris-setosa'],
['Iris-setosa'],
['Iris-versicolor'],
['Iris-versicolor'],
['Iris-setosa'],
['Iris-setosa']], dtype=object)
In [69]: matrix
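The cell that assigns `matrix` and its output are not shown; the later call `confusion_matrix(ytest,ypredictmulti)` suggests it holds a confusion matrix. A minimal sketch of how such a matrix is built and read follows, using hypothetical labels rather than the notebook's predictions.

```python
from sklearn.metrics import confusion_matrix

labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# Hypothetical true and predicted labels (illustrative only).
y_true = ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-virginica"]
y_pred = ["Iris-setosa", "Iris-versicolor", "Iris-versicolor", "Iris-virginica"]

# Rows are true classes and columns are predicted classes, both in the
# order given by `labels`; entry [i, j] counts samples of class i
# that were predicted as class j.
matrix = confusion_matrix(y_true, y_pred, labels=labels)
```

Here the single off-diagonal count sits in row `Iris-virginica`, column `Iris-versicolor`, marking one virginica sample predicted as versicolor.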
In [70]: precision=precision_score(ytest,ypredict,average="micro")
In [71]: precision
Out[71]: 1.0
In [72]: recall=recall_score(ytest,ypredict,average="micro")
In [73]: recall
Out[73]: 1.0
In [74]: f1_score(ytest,ypredict,average="micro")
Out[74]: 1.0
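Precision, recall, and F1 come out identical above because of `average="micro"`: in a single-label multiclass problem, all three micro-averaged scores equal plain accuracy. A small sketch with hypothetical labels shows this, with macro precision added for contrast.

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical labels with two mistakes out of ten (illustrative only).
y_true = [0, 0, 1, 1, 2, 2, 2, 1, 0, 2]
y_pred = [0, 0, 1, 2, 2, 2, 1, 1, 0, 2]

accuracy = accuracy_score(y_true, y_pred)

# Micro averaging pools true positives, false positives, and false
# negatives over all classes before computing each score.
micro_precision = precision_score(y_true, y_pred, average="micro")
micro_recall = recall_score(y_true, y_pred, average="micro")
micro_f1 = f1_score(y_true, y_pred, average="micro")

# Macro averaging computes the score per class and then takes the mean,
# so it can differ from accuracy.
macro_precision = precision_score(y_true, y_pred, average="macro")
```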
In [76]: print(classification_report(ytest,ypredict))
    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30
In [77]: q=[[4.6,3.1,1.5,0.2]]
In [78]: model.predict(q)
C:\Users\alisu\anaconda3\Lib\site-packages\sklearn\base.py:464: UserWarning: X does not have valid feature names, but GaussianNB was fitted with feature names
  warnings.warn(
In [79]: p=[[4.6,3.1,1.5,1.2]]
In [80]: model.predict(p)
C:\Users\alisu\anaconda3\Lib\site-packages\sklearn\base.py:464: UserWarning: X does not have valid feature names, but GaussianNB was fitted with feature names
  warnings.warn(
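The `UserWarning` above appears because the model was fitted on a DataFrame with column names while `q` and `p` are plain nested lists. A minimal sketch of one way to avoid it is to wrap the query in a DataFrame with the same columns. The six training rows below are a hypothetical toy set, not the notebook's split.

```python
import pandas as pd
from sklearn.naive_bayes import GaussianNB

cols = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]

# Hypothetical toy training set (illustrative values only).
train = pd.DataFrame(
    [
        [5.1, 3.5, 1.4, 0.2],
        [4.9, 3.0, 1.3, 0.3],
        [4.7, 3.2, 1.6, 0.2],
        [6.3, 2.5, 5.0, 1.9],
        [6.5, 3.0, 5.2, 2.0],
        [5.9, 3.0, 5.1, 1.8],
    ],
    columns=cols,
)
labels = ["Iris-setosa"] * 3 + ["Iris-virginica"] * 3

model = GaussianNB().fit(train, labels)

# Same column names as the training frame, so no feature-name warning.
q = pd.DataFrame([[4.6, 3.1, 1.5, 0.2]], columns=cols)
prediction = model.predict(q)
```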
In [84]: ytest
Out[84]: array([['Iris-virginica'],
['Iris-setosa'],
['Iris-setosa'],
['Iris-setosa'],
['Iris-virginica'],
['Iris-setosa'],
['Iris-virginica'],
['Iris-setosa'],
['Iris-virginica'],
['Iris-virginica'],
['Iris-setosa'],
['Iris-versicolor'],
['Iris-virginica'],
['Iris-setosa'],
['Iris-virginica'],
['Iris-versicolor'],
['Iris-setosa'],
['Iris-versicolor'],
['Iris-virginica'],
['Iris-virginica'],
['Iris-versicolor'],
['Iris-versicolor'],
['Iris-versicolor'],
['Iris-versicolor'],
['Iris-setosa'],
['Iris-setosa'],
['Iris-versicolor'],
['Iris-versicolor'],
['Iris-setosa'],
['Iris-setosa']], dtype=object)
C:\Users\alisu\anaconda3\Lib\site-packages\sklearn\utils\validation.py:1184: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
Out[85]: MultinomialNB()
In [86]: ypredictmulti=modelmulti.predict(xtest)
In [87]: ypredictmulti
In [88]: confusion_matrix(ytest,ypredictmulti)
In [89]: print(classification_report(ytest,ypredictmulti))
    accuracy                           0.90        30
   macro avg       0.89      0.89      0.89        30
weighted avg       0.90      0.90      0.90        30
In [90]: precision_score(ytest,ypredictmulti,average="micro")
Out[90]: 0.9
In [91]: recall_score(ytest,ypredictmulti,average="micro")
Out[91]: 0.9
In [92]: f1_score(ytest,ypredictmulti,average="micro")
Out[92]: 0.9
In [ ]: