Data Preprocessing Report

The report details a project on World Population Statistics focusing on the Iris flower dataset, which includes physical parameters of three flower species: Versicolor, Setosa, and Virginica. The dataset consists of 150 observations with features such as sepal length, sepal width, petal length, and petal width, aimed at classifying the flower species. Various data preprocessing techniques and libraries, including Pandas, Numpy, and Matplotlib, are utilized for analysis and visualization.

Uploaded by

mikeyfirasath201

Data Preprocessing Report

Title of the Project: World Population Statistic

Submitted by: F G Firasath

Name of the students:

1: N. Vamsi Nadh - 230

2: F G Firasath - 233

3: B Dhanush - 291

Under the Guidance of

Dr. Tina Babu

Bangalore
Problem statement
This dataset consists of the physical parameters of three species of iris
flower: Versicolor, Setosa and Virginica. The numeric parameters in the
dataset are sepal width, sepal length, petal width and petal length. We will
predict the class (species) of each flower from these parameters. The data
consists of continuous numeric values that describe the dimensions of the
respective features, and we will train the model on these features.
Dataset Experimented:
1. Name of the Dataset:
The dataset used is the Iris flower dataset, which is used here to identify the
different iris species. For each flower it records the sepal length, sepal
width, petal length and petal width.
2. Features:
The dataset describes 150 iris flowers.
The features are:
• Id
• SepalLengthCm
• SepalWidthCm
• PetalLengthCm
• PetalWidthCm
• Species
3. Observations:
The dataset contains 150 observations, one for each of the 150 flowers.
4. Type of Dataset:
This is a classification dataset: the sepal length, sepal width, petal length
and petal width of each flower are used to classify it into one of the iris
species.
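As a minimal sketch of how the columns listed above divide into inputs and a class label, the snippet below builds a small in-memory table and separates the four measurements from Species. The six rows are typed in by hand for illustration only (the real file has 150), and the species label strings are an assumption about how the CSV spells them.

```python
import pandas as pd

# Six illustrative rows typed in by hand (two per species); the real
# dataset has 150 rows. The label spellings are assumed, not confirmed.
sample = pd.DataFrame(
    {
        "Id": [1, 2, 51, 52, 101, 102],
        "SepalLengthCm": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
        "SepalWidthCm": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
        "PetalLengthCm": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
        "PetalWidthCm": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
        "Species": ["Iris-setosa"] * 2
        + ["Iris-versicolor"] * 2
        + ["Iris-virginica"] * 2,
    }
)

# The four measurements are the model inputs; Species is the target class.
feature_cols = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]
X = sample[feature_cols]
y = sample["Species"]

# Count how many observations belong to each class.
class_counts = y.value_counts()
print(X.shape)
print(class_counts)
```

Id is left out of the inputs because it is only a row identifier and carries no information about the flower.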
Data Preprocessing Techniques:
Libraries used:
1. Pandas - This library provides data structures such as DataFrame and Series,
which are efficient for handling structured data.
2. NumPy - It provides support for large, multi-dimensional arrays and matrices,
along with a collection of mathematical functions to operate on these arrays
efficiently.
3. Matplotlib - It provides a wide range of plotting functions to generate various
types of plots, including line plots, scatter plots, bar plots, histograms, heatmaps,
and more.
4. Seaborn - Seaborn simplifies the process of creating complex visualizations by
offering functions that automatically handle tasks such as data aggregation and
summarization, as well as styling and colour palettes.
5. Plotly Express - A high-level plotting interface that creates interactive
figures, such as scatter plots and histograms, from a DataFrame in a single
function call.
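Data preprocessing:
1. Importing Libraries:
import os
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Use seaborn's plain white plot style with short colour codes enabled
sns.set(style="white", color_codes=True)
# List the files available in the input directory
print(os.listdir("../input/"))
2. Importing the Dataset:
# Load the CSV into a DataFrame and preview the first five rows
iris = pd.read_csv("../input/Iris.csv")
iris.head()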

iris.shape
This shows that the dataset has 150 rows and, for each row, the 6 columns
listed above: the Id column, the four measurements and the Species label.
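The report does not show any further checks after loading, so the following is only a hedged sketch of inspection steps that commonly come next: looking at the shape and column types, counting missing values, and dropping the identifier column. The `toy` table is hypothetical, with one missing value and one repeated measurement planted on purpose, and it keeps only two measurement columns for brevity.

```python
import numpy as np
import pandas as pd

# Hypothetical four-row table with a planted missing value (row 3)
# and a repeated measurement (rows 2 and 4), for illustration only.
toy = pd.DataFrame(
    {
        "Id": [1, 2, 3, 4],
        "SepalLengthCm": [5.1, 4.9, np.nan, 4.9],
        "SepalWidthCm": [3.5, 3.0, 3.2, 3.0],
        "Species": ["Iris-setosa"] * 4,
    }
)

# Basic structure: (rows, columns) and the type of each column.
print(toy.shape)
print(toy.dtypes)

# Number of missing values in each column.
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# Drop the row identifier, then rows with missing values, then exact duplicates.
cleaned = toy.drop(columns="Id").dropna().drop_duplicates()
print(cleaned.shape)
```

Dropping Id before removing duplicates matters here: with Id present, every row is unique and no repeated measurements would be detected.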
3.Data Analysis:
Scatter plot of Iris features:
iris.plot(kind="scatter", x="SepalLengthCm", y="SepalWidthCm")
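The call above draws every sample in the same colour, so the species cannot be told apart. One possible variant, sketched here with plain Matplotlib on a few hand-typed illustrative rows (not the full dataset), draws one scatter series per species so that each class gets its own colour and legend entry. The species label strings are an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Six illustrative rows typed in by hand, two per species.
sample = pd.DataFrame(
    {
        "SepalLengthCm": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
        "SepalWidthCm": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
        "Species": ["Iris-setosa"] * 2
        + ["Iris-versicolor"] * 2
        + ["Iris-virginica"] * 2,
    }
)

fig, ax = plt.subplots()

# One scatter call per species: Matplotlib assigns each call the next colour.
for species, group in sample.groupby("Species"):
    ax.scatter(group["SepalLengthCm"], group["SepalWidthCm"], label=species)

ax.set_xlabel("SepalLengthCm")
ax.set_ylabel("SepalWidthCm")
ax.legend()
```

In a notebook the figure would be displayed with `plt.show()`; the off-screen backend is used here only so the sketch runs anywhere.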

Andrews curves: each sample's attributes are used as the coefficients of a
Fourier series, and the resulting curves are drawn for the iris data, one curve
per sample, coloured by species.
from pandas.plotting import andrews_curves
andrews_curves(iris.drop("Id", axis=1), "Species")
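To make the Fourier-series idea concrete, the sketch below evaluates the curve for a single four-feature sample by hand with NumPy. It follows the usual Andrews definition, f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t), over t from -pi to pi; the function name `andrews_curve` and the sample values are illustrative choices, not taken from the report.

```python
import numpy as np


def andrews_curve(x, t):
    """Evaluate the Andrews curve of one four-feature sample x at angle(s) t."""
    x1, x2, x3, x4 = x
    return (
        x1 / np.sqrt(2)
        + x2 * np.sin(t)
        + x3 * np.cos(t)
        + x4 * np.sin(2 * t)
    )


# Angles at which the curve is evaluated.
t = np.linspace(-np.pi, np.pi, 200)

# One illustrative sample: sepal length, sepal width, petal length, petal width.
sample = np.array([5.1, 3.5, 1.4, 0.2])

curve = andrews_curve(sample, t)
print(curve.shape)
```

Samples with similar measurements produce similar curves, which is why flowers of the same species tend to form a visible band in the plot.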
