Task 3

Uploaded by

yasaswinisrit

DATA MINING TASK-3

3a. TO DEMONSTRATE DATA PREPROCESSING ON PREDEFINED WEKA DATASET DIABETES.ARFF

AIM: TO PERFORM PREPROCESSING OPERATIONS ON DIABETES.ARFF WITH THE HELP OF THE DATA MINING TOOL WEKA.

PROCEDURE: Click on Open file -> go to the C drive -> go to Program Files -> go to the Weka 3.8.6 folder -> click on the data folder.

Choose the required dataset, i.e. diabetes.arff.

Click on Edit to see the data.


 OPERATIONS PERFORMED :

1.ADD: An instance filter that adds a new attribute to the dataset.

To apply the Add operation, click on Choose, then on filters -> unsupervised -> attribute -> Add.

Now right-click on the filter and click on Show properties, then enter the new attribute's name in attributeName.
AFTER

Click on Edit to see the result.


2.REMOVE: A filter that removes a range of attributes from the dataset.

To apply the Remove operation, click on Choose, then on filters -> unsupervised -> attribute -> Remove.

Now right-click on Remove and click on Show properties, then give the indices of the attributes to remove in attributeIndices.
AFTER

Click on Edit to see the result.


3.COPY: An instance filter that copies a range of attributes in the dataset.

To apply the Copy operation, click on Choose, then on filters -> unsupervised -> attribute -> Copy.

Now right-click on Copy and click on Show properties, then give the range of attributes to copy in attributeIndices.
Then click on Edit to see the result.
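
To make the effect of the three filters above concrete, here is a minimal Python sketch that mimics Add, Remove and Copy on a small in-memory table. It is not Weka code: the function names, the sample rows, the new attribute name bmi_group and the "Copy of" naming are assumptions made for illustration. The indices are 1-based, as in the attributeIndices field.

```python
# Hypothetical in-memory table: a header list plus rows of values.
header = ["preg", "plas", "class"]
rows = [[6, 148, "tested_positive"], [1, 85, "tested_negative"]]

def add_attribute(header, rows, name, fill=None):
    """Append a new attribute; existing rows get a missing value (None)."""
    return header + [name], [row + [fill] for row in rows]

def remove_attributes(header, rows, indices):
    """Drop the attributes at the given 1-based indices."""
    drop = {i - 1 for i in indices}
    keep = [i for i in range(len(header)) if i not in drop]
    return [header[i] for i in keep], [[row[i] for i in keep] for row in rows]

def copy_attributes(header, rows, indices):
    """Append copies of the attributes at the given 1-based indices."""
    cols = [i - 1 for i in indices]
    new_header = header + ["Copy of " + header[i] for i in cols]
    return new_header, [row + [row[i] for i in cols] for row in rows]

if __name__ == "__main__":
    print(add_attribute(header, rows, "bmi_group")[0])
    print(remove_attributes(header, rows, [2])[0])
    print(copy_attributes(header, rows, [1])[0])
```

Each function returns a new header and new rows, so the original table stays unchanged and the before/after views can be compared, much like the Edit window before and after applying a filter.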
4.REPLACE MISSING VALUES: Replaces all missing values for nominal and numeric attributes in a dataset with the modes and means from the training data.

To apply the ReplaceMissingValues operation, click on Choose, then on filters -> unsupervised -> attribute -> ReplaceMissingValues.

Now right-click on ReplaceMissingValues and click on Show properties, then click on OK to keep the default values.
Then click on Edit to see the result.
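
The mean/mode replacement described above can be sketched in a few lines of Python. This is an illustration of the idea, not Weka's implementation: the function name is hypothetical, missing values are represented by None (ARFF files write them as ?), and the column is assumed to contain at least one non-missing value.

```python
from statistics import mean, mode

def replace_missing(column):
    """Fill None entries with the mean (numeric) or the mode (nominal)."""
    present = [v for v in column if v is not None]
    if all(isinstance(v, (int, float)) for v in present):
        fill = mean(present)   # numeric attribute: use the mean
    else:
        fill = mode(present)   # nominal attribute: use the most frequent value
    return [fill if v is None else v for v in column]

if __name__ == "__main__":
    print(replace_missing([1.0, None, 3.0]))        # [1.0, 2.0, 3.0]
    print(replace_missing(["a", None, "b", "a"]))   # ['a', 'a', 'b', 'a']
```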
5.REPLACE MISSING WITH USER CONSTANT: Replaces all missing values for nominal and numeric attributes in a dataset with user-supplied constant values.

Right-click on ReplaceMissingWithUserConstant and click on Show properties, then enter values for nominalStringReplacementValue and numericReplacementValue as follows.
Then click on Edit to see the result.
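
For comparison with the mean/mode approach, a minimal Python sketch of constant replacement follows. The function name, the None marker for missing values and the sample constants 0.0 and "unknown" are assumptions for illustration; the two constants play the role of numericReplacementValue and nominalStringReplacementValue.

```python
def replace_with_constants(rows, numeric_cols, numeric_value, nominal_value):
    """Fill None with one constant for numeric columns, another for the rest."""
    out = []
    for row in rows:
        new_row = []
        for i, v in enumerate(row):
            if v is None:
                v = numeric_value if i in numeric_cols else nominal_value
            new_row.append(v)
        out.append(new_row)
    return out

if __name__ == "__main__":
    data = [[None, "a"], [2.0, None]]
    # Column 0 is numeric, column 1 is nominal in this toy table.
    print(replace_with_constants(data, {0}, 0.0, "unknown"))
```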

6.STANDARDIZE: Standardizes all numeric attributes in the given dataset to have zero mean and unit variance.
Click on OK and apply the filter to see the changes.
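
Zero mean and unit variance means each value x is replaced by (x - mean) / standard deviation. A minimal Python sketch, assuming the sample standard deviation (divisor n - 1) and a column with at least two values; the function name is hypothetical:

```python
from statistics import mean, stdev

def standardize(values):
    """Rescale values to zero mean and unit (sample) variance."""
    mu = mean(values)
    sigma = stdev(values)          # sample standard deviation (n - 1)
    if sigma == 0:
        # A constant column has no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

if __name__ == "__main__":
    print(standardize([2.0, 4.0, 6.0]))   # [-1.0, 0.0, 1.0]
```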
7.NORMALIZATION: Normalizes all numeric values in the given dataset (apart from the class attribute, if set). By default the resulting values lie in the range [0,1].
AFTER

Then click on Edit to see the result.
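
Normalization to [0,1] is min-max scaling: each value x becomes (x - min) / (max - min). A minimal Python sketch under that assumption, with a hypothetical function name:

```python
def normalize(values):
    """Min-max scale values into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column cannot be scaled; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

if __name__ == "__main__":
    print(normalize([10, 15, 20]))   # [0.0, 0.5, 1.0]
```

Unlike standardization, the result depends only on the smallest and largest value, so a single outlier can compress all other values into a narrow band.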


STATISTICS AND ITS VALUES:

OBSERVATION

 Relation: diabetes
 No. of attributes = 9
 No. of missing values = 0
 Attribute names:
 1.preg 2.plas 3.pres 4.skin 5.insu 6.mass 7.pedi 8.age 9.class
 Yes, it is a balanced dataset.
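
The figures in an observation like the one above can also be read directly from the ARFF text. The following Python sketch is a simplified reader written for illustration: it assumes dense (non-sparse) data, attribute names without spaces, and the class as the last attribute. The sample text is a made-up excerpt, not the real diabetes.arff.

```python
from collections import Counter

def summarize_arff(text):
    """Return (relation, attribute names, missing count, class counts)."""
    relation, attributes, missing = None, [], 0
    class_counts = Counter()
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):      # skip blanks and comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            missing += values.count("?")          # ARFF marks missing as ?
            class_counts[values[-1]] += 1         # assumes class is last
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            attributes.append(line.split()[1])
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, missing, class_counts

# Made-up three-attribute excerpt, for illustration only.
sample = """@relation demo
@attribute preg numeric
@attribute plas numeric
@attribute class {tested_negative,tested_positive}
@data
6,148,tested_positive
1,?,tested_negative
"""

if __name__ == "__main__":
    print(summarize_arff(sample))
```

The class counts give a quick check on whether a dataset is balanced: the closer the counts are to each other, the more balanced it is.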
3b.Create a student.arff dataset & demonstrate data preprocessing on it.
AIM: TO CREATE A STUDENT TABLE WITH THE HELP OF THE DATA MINING TOOL WEKA.

DESCRIPTION: We need to create a student table as a training dataset that includes the attributes name, age, id, branch and gender.

PROCEDURE:

Open Notepad, type the following code and save the file as student.arff.
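
As an illustration only, here is a minimal Python sketch that generates a file of this kind. The five attribute names come from the description above; the attribute types, the nominal values and the two data rows are assumptions, not the data used in this task.

```python
def build_arff(relation, attributes, rows):
    """Build ARFF text from (name, type) pairs and rows; None becomes ?."""
    lines = ["@relation " + relation, ""]
    for name, type_spec in attributes:
        lines.append("@attribute {} {}".format(name, type_spec))
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join("?" if v is None else str(v) for v in row))
    return "\n".join(lines) + "\n"

# Assumed types and values, for illustration only.
attributes = [
    ("name", "string"),
    ("age", "numeric"),
    ("id", "numeric"),
    ("branch", "{CSE,ECE,EEE}"),
    ("gender", "{male,female}"),
]
rows = [
    ["asha", 20, 101, "CSE", "female"],
    ["ravi", None, 102, "ECE", "male"],   # None is written as a missing value
]

text = build_arff("student", attributes, rows)

if __name__ == "__main__":
    with open("student.arff", "w") as f:
        f.write(text)
    print(text)
```

The generated file can then be loaded through Open file in the same way as diabetes.arff.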


 OPERATIONS PERFORMED :

1.ADD: An instance filter that adds a new attribute to the dataset.

To apply the Add operation, click on Choose, then on filters -> unsupervised -> attribute -> Add.
AFTER

2.REMOVE: A filter that removes a range of attributes from the dataset.

To apply the Remove operation, click on Choose, then on filters -> unsupervised -> attribute -> Remove.
AFTER

3.COPY: An instance filter that copies a range of attributes in the dataset.

To apply the Copy operation, click on Choose, then on filters -> unsupervised -> attribute -> Copy.
AFTER
4.REPLACE MISSING VALUES: Replaces all missing values for nominal and numeric attributes in a dataset with the modes and means from the training data.

To apply the ReplaceMissingValues operation, click on Choose, then on filters -> unsupervised -> attribute -> ReplaceMissingValues.

AFTER

5.REPLACE MISSING WITH USER CONSTANT: Replaces all missing values for nominal and numeric attributes in a dataset with user-supplied constant values.

Right-click on ReplaceMissingWithUserConstant and click on Show properties, then enter values for nominalStringReplacementValue and numericReplacementValue as follows.

AFTER
6.NORMALIZATION: Normalizes all numeric values in the given dataset (apart from the class attribute, if set).
AFTER

7.STANDARDIZE: Standardizes all numeric attributes in the given dataset to have zero mean and unit variance.
AFTER

OBSERVATION

 Relation: student
 No. of attributes = 5
 No. of missing values = 0
 Attribute names:
 1.name 2.age 3.id 4.branch 5.gender
 It is a balanced dataset.
3c.Create a weather.arff dataset & demonstrate data preprocessing on it.
AIM: TO CREATE A WEATHER TABLE WITH THE HELP OF THE DATA MINING TOOL WEKA.

DESCRIPTION: We need to create a weather table as a training dataset that includes the attributes outlook, temperature, humidity, windy and play.

PROCEDURE: Open Notepad, type the following code and save the file as weather.arff.


 OPERATIONS PERFORMED :

1.ADD: An instance filter that adds a new attribute to the dataset.

To apply the Add operation, click on Choose, then on filters -> unsupervised -> attribute -> Add.

AFTER
2.REMOVE: A filter that removes a range of attributes from the dataset.

To apply the Remove operation, click on Choose, then on filters -> unsupervised -> attribute -> Remove.

AFTER
3.COPY: An instance filter that copies a range of attributes in the dataset.

To apply the Copy operation, click on Choose, then on filters -> unsupervised -> attribute -> Copy.

AFTER
4.REPLACE MISSING VALUES: Replaces all missing values for nominal and numeric attributes in a dataset with the modes and means from the training data.

To apply the ReplaceMissingValues operation, click on Choose, then on filters -> unsupervised -> attribute -> ReplaceMissingValues.

AFTER
5.REPLACE MISSING WITH USER CONSTANT: Replaces all missing values for nominal and numeric attributes in a dataset with user-supplied constant values.

AFTER
6.STANDARDIZE: Standardizes all numeric attributes in the given dataset to have zero mean and unit variance.

AFTER
7.NORMALIZATION: Normalizes all numeric values in the given dataset (apart from the class attribute, if set).

AFTER
Click on Edit to see the result.

STATISTICS AND ITS VALUES:

OBSERVATION

 Relation: weather
 No. of attributes = 5
 No. of missing values = 0
 Attribute names:
 1.outlook 2.temperature 3.humidity 4.windy 5.play
 It is a balanced dataset.

SUBMITTED BY:22A81A0654
