DWDM Complete Record

Uploaded by

Hemanth Kumar1

Department of CSE DWDM Lab Record Roll Number: 21311A05XX

I. Starting Informatica PowerCenter Express


Step 1: After booting the system up, go to Start Menu and run Start Informatica Services with
Administrator privileges.

Fig. 1.1: Step 1.1

Step 2: Once the batch file has finished running, navigate to localhost:7009 in your browser. This is
the login page for the Administrator Server. Log in with the credentials that were supplied during the
install process.

Fig 1.2: Step 1.2

III CSE-H Page No: 1



Step 3: Once the server page starts up, we are ready to Launch Informatica Developer from the Search
Bar.

Fig 1.3: Step 1.3


Step 4: Once Informatica Developer loads up, connect to Model Repository using the credentials as
specified in the server and you are good to go.

Fig 1.4: Step 1.4


Aim: To apply Filter Transformation to a Data Set and filter out tuples matching a certain criterion.


Theory: The filter transformation takes one table as an input and writes into another table only the rows
that match a certain criterion. The tuples which do not meet the criterion are dropped and do not reach the target.
Procedure:
Step 1: Create the source and target flat files (text files), with the first row being column name and
subsequent rows being records, separated by ‘,’ as a delimiter.

Step 1

Step 2: Start up Informatica Services and Launch Informatica Developer.

Step 3: Import your flat files (both source and target files) into mirmo1 project. Click on Next and then
Next again. In the next dialog box that opens, check the “Import Column Names from the first line”
checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.

Step 3


Step 3

Step 4: Once the files have been imported, create a new mapping and name it Filter_Mapping.
Step 5: Import the source file with read mode and target file as write mode and drag and drop the Filter
Transformation from the Palette into the workspace. Give the necessary connections.

Step 5

Step 6: Make sure that Filter is selected. Then navigate to Properties → Filter and specify the condition
in the Filter Condition box. Click on Validate Expression at the right corner of the Properties Tab, to
check the validity of the Filter Condition.

Step 6

Step 7: Click on the Data Viewer Tab of the Properties Explorer and select Run. If the Transformation is
successful, then all entries matching the condition are displayed. Then click on Run in the menu bar and
select Run Mapping.

Step 7

Step 8: Make sure that the connections are intact. Now run the mapping. Then navigate to
“F:\Informatica\PCExpress\tomcat\bin\target” (Informatica is installed on the F: drive of this system),
where you can find your output target file. Open it up to see the output.
Output:

Filter Transformation for the given condition (SAL>50000) was successful.
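
The result can be cross-checked outside Informatica. The following is a minimal Python sketch of the same filter, not part of the lab procedure. Only the condition SAL > 50000 comes from this record; the sample rows and the ENAME column are assumptions, because the actual flat file is visible only in the screenshots.

```python
import csv
import io

# Hypothetical source data; the real flat file appears only in the screenshots.
SOURCE_CSV = """ENAME,SAL
Asha,65000
Ravi,42000
Kiran,50000
Meena,72000
"""


def filter_rows(csv_text, min_sal=50000):
    """Return only the rows whose SAL is strictly greater than min_sal."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Rows that fail the condition are simply not passed on.
    return [row for row in reader if int(row["SAL"]) > min_sal]


filtered = filter_rows(SOURCE_CSV)
for row in filtered:
    print(row["ENAME"], row["SAL"])
```

A row with SAL exactly equal to 50000 is not passed on, because the condition uses a strict comparison.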

Aim: To apply Router Transformation to a Flat File and filter out tuples matching a certain criterion into
different tables.

Theory: Filter Transformation tests a single condition and passes the matching tuples to only one table;
the other tuples are dropped. Router Transformation tests several conditions at once and routes the data
to multiple tables, with a default group for the tuples that match none of the conditions.
Procedure:
Step 1: Create the source and target flat files (text files), with the first row being column name and
subsequent rows being records, separated by ‘,’ as a delimiter.

Step 1

Step 2: Start up Informatica Services and Launch Informatica Developer.

Step 3: Import your flat files (both source and target files) into mirmo2 project.

Step 3


Click on Next and then Next again. In the next dialog box that opens, check the “Import Column Names
from the first line” checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.

Step 3
Step 4: Once the files have been imported, create a new mapping and name it Mapping_Router.

Step 5: Drag the SOURCE table from Object Explorer into the Mapping_Router tab. Set Physical Data Object
Access to Read, as we use this table only to take inputs. Then drag the TGT1, TGT2 and TGT3 tables and set
Physical Data Object Access to Write, as we use these tables to store the output of the router
transformation. Select Router transformation from the Palette and drag it onto the Mapping Tab. Select
all entries from Read_SOURCE and drop them onto the Router.

Step 5

Step 6: Make sure Router is selected in the Mapping_Router Tab. Navigate to Properties → Groups in
the Properties Bar and click on New button to add a new group. Then in the GROUP FILTER CONDITION
for “Group” click on the small arrow in the corner and give the filter condition for the group. Create as
many groups as required.


Step 6
Step 7: Select all the ports in “Group” (Ctrl + A) and map them to “Write_TGT1”. Similarly map “Group1” to
“Write_TGT2” and “Default” to “Write_TGT3”.

Step 7

Step 8: Make sure Router is selected. Then click on Run in the menu bar and select Run Mapping.

Step 8

Step 9: Then navigate to “F:\Informatica\PCExpress\tomcat\bin\target” (that’s where Informatica is
installed on my system, F: Drive), and you can find your output target files. Open them up to see the
output.

Output:

Router Transformation was successful.
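
The routing logic can be sketched in Python as follows. This is an illustration under stated assumptions, not the configuration used in the lab: the group conditions and sample rows are hypothetical, since the actual conditions are shown only in the screenshots. The sketch sends a row to every group whose condition it meets, and to the default group when it meets none.

```python
def route_rows(rows, groups):
    """Send each row to every group whose condition it meets.

    Rows that meet no condition go to the DEFAULT group.
    """
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                routed[name].append(row)
                matched = True
        if not matched:
            routed["DEFAULT"].append(row)
    return routed


# Hypothetical rows and group conditions, for illustration only.
rows = [
    {"ENAME": "Asha", "SAL": 65000},
    {"ENAME": "Ravi", "SAL": 42000},
    {"ENAME": "Kiran", "SAL": 18000},
]
groups = {
    "GROUP": lambda r: r["SAL"] > 50000,
    "GROUP1": lambda r: 30000 <= r["SAL"] <= 50000,
}

routed = route_rows(rows, groups)
for name, members in routed.items():
    print(name, [r["ENAME"] for r in members])
```

Each key of the result corresponds to one target table in the mapping (Write_TGT1, Write_TGT2 and Write_TGT3 above).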

Aim: To apply an Expression Transformation on a flat file to evaluate two columns into an output
column in the target table.

Theory: Expression transformation is a connected, passive transformation used to calculate values on a
single row. Examples of calculations are concatenating the first and last name, adjusting the employee
salaries, converting strings to date etc. Expression transformation can also be used to test conditional
statements before passing the data to other transformations.
Procedure:
Step 1: Create the source and target flat files (text files), with the first row being column name and
subsequent rows being records, separated by ‘,’ as a delimiter.

Step 1

Step 2: Start up Informatica Services and Launch Informatica Developer.


Step 3: Import your flat files (both source and target files) into mirmo2 project.

Step 3

Click on Next and then Next again. In the next dialog box that opens, check the “Import Column Names
from the first line” checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.


Step 3
Step 4: Once the files have been imported, create a new mapping and name it Expression_Mapping.
Step 5: Drag the source (EMPSRC) table into the workspace (in read mode) and the target (EMPTGT1)
table (in write mode). Also drag the Expression Transformation from the palette.

Step 5

Step 6: Drag and drop the values of the target table into the Expression Transformation. Also link up the
input table and the output table via the Expression Transformation appropriately. Make sure that the
Transformation is selected. Then navigate to Properties → Ports in the Properties Bar. Clear the
Input check box for TOTALSAL. This ensures that it acts only as an output port, whose value comes from
evaluating the expression specified.

Step 6
Step 7: In the Expression Section of the entry, click on the small arrow head to open up the Expression
Dialog Box and enter the following expression: “IIF(ISNULL(COMM), SAL, SAL+COMM)”. This expression
checks if the Commission field is NULL; if so, it returns only the salary as the result, else it returns
Salary + Commission as the result to the TOTALSAL field. Don’t forget to validate the expression for
errors.

Step 7
Step 8: Now run the mapping by navigating to Run → Run Mapping in the Menu Bar.

Step 8

Step 9: Then navigate to “F:\Informatica\PCExpress\tomcat\bin\target” (that’s where Informatica is
installed on my system, F: Drive), and you can find your output target file. Open it up to see the
output.
Output:

Expression Transformation was successful.
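
The rule described in Step 7 (salary alone when the commission is NULL, otherwise salary plus commission) can be written as a small Python function for checking the target file by hand. This is a sketch only; the function name and the sample values are assumptions, and None stands in for a NULL commission.

```python
def total_sal(sal, comm):
    """Return salary alone when the commission is missing, else salary plus commission."""
    return sal if comm is None else sal + comm


# Hypothetical (SAL, COMM) pairs; None represents a NULL commission.
samples = [(50000, 5000), (42000, None), (30000, 0)]
for sal, comm in samples:
    print(sal, comm, total_sal(sal, comm))
```

The check for a missing commission matters because adding NULL to a number yields NULL, which would leave TOTALSAL empty for those rows.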


Aim: To apply an Aggregator Transformation on a Data Set to calculate Minimum and Maximum Salary
in the table.
Theory: Aggregator transformation is an active transformation used to perform calculations such as
sums, averages and counts on groups of data. The integration service stores the group and row data in
the aggregate cache. The Aggregator Transformation offers more flexibility than SQL aggregate functions,
because its aggregate expressions can include conditional clauses to filter rows.
Procedure:
Step 1: Create the source and target flat files (text files), with the first row being column name and
subsequent rows being records, separated by ‘,’ as a delimiter.

Step 1

Step 2: Start up Informatica Services and Launch Informatica Developer.


Step 3: Import your flat files (both source and target files) into mirmo2 project.

Step 3

Click on Next and then Next again. In the next dialog box that opens, check the “Import Column Names
from the first line” checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.


Step 3
Step 4: Once the files have been imported, create a new mapping and name it Aggregator_Mapping.

Step 5: Drag the source (EMPSRC) table into the workspace (in read mode) and the target
(AGGREGATOR_TARGET) table (in write mode). Delete all the non-required entries from the tables. Also
drag the Aggregator Transformation from the palette. Drag and drop the values of the target table into
the Aggregator Transformation. Link up the input table and the output table via the Aggregator
Transformation appropriately. Add MIN_SAL, MAX_SAL and COUNT entries to the Transformation as
they are required for the evaluation.

Step 5

Step 6: Make sure that the Transformation is selected. Then navigate to Properties → Ports in the
Properties Bar. Clear the Input check box for MIN_SAL and MAX_SAL. This ensures that they act only as
output ports, whose values come from evaluating the expressions specified.

Step 6

Step 7: In the Expression Section of the MIN_SAL entry, click on the small arrow head to open up the
Expression Dialog Box, navigate to Functions → Aggregate and select MIN function. This function
checks for the least value among all the values and returns it. Then navigate to Ports and select SAL, so
that the expression reads “MIN(SAL)”. Set the MAX_SAL entry to “MAX(SAL)” in the same way. Don’t forget
to validate the expressions for errors.

Step 7

Step 8: Now Run the Mapping by navigating to Run → Run Mapping in the Menu Bar.

Step 9: Then navigate to “F:\Informatica\PCExpress\tomcat\bin\target” (that’s where Informatica is
installed on my system, F: Drive), and you can find your output target file. Open it up to see the output.

Output:

Aggregator Transformation was successful. Max and Min salaries for the table were obtained.
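
As a cross-check, the same aggregates can be computed in a few lines of Python. This is a minimal sketch with hypothetical rows; only the names SAL, MIN_SAL, MAX_SAL and COUNT are taken from the procedure above.

```python
def aggregate_salaries(rows):
    """Return the minimum salary, maximum salary and row count for the whole table."""
    salaries = [row["SAL"] for row in rows]
    return {
        "MIN_SAL": min(salaries),
        "MAX_SAL": max(salaries),
        "COUNT": len(salaries),
    }


# Hypothetical rows, for illustration only.
rows = [
    {"ENAME": "Asha", "SAL": 65000},
    {"ENAME": "Ravi", "SAL": 42000},
    {"ENAME": "Kiran", "SAL": 18000},
]

summary = aggregate_salaries(rows)
print(summary)
```

Because no grouping column is used here, the whole table is treated as one group and a single summary row is produced.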


Aim: To apply Sorter Transformation on a Flat File DataBase to display Employee Salaries in Ascending
Order.
Theory: The sorter transformation is used to sort the data from relational or flat file sources. The sorter
transformation can also be used for case-sensitive sorting and can be used to specify whether the output
rows should be distinct or not.
Procedure:
Step 1: Create the source and target flat files (text files), with the first row being column name and
subsequent rows being records, separated by ‘,’ as a delimiter.

Step 1

Step 2: Start up Informatica Services and Launch Informatica Developer.

Step 3: Create a new project (mirmo5) and import your flatfiles (both source and target files).

Step 3

Step 3

Click on Next and then Next again. In the next dialog box that opens, check the “Import Column Names
from the first line” checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.

Step 3

Step 4: Once the files have been imported, create a new mapping and name it Sorter_Mapping.

Step 5: Import the source file with read mode and target file as write mode and drag and drop the Sorter
Transformation from the Palette into the workspace. Give the necessary connections.

Step 5

Step 6: Select the Sorter Transformation box. Navigate to Properties → Ports in the Properties Tab and
place a check mark on the column that you want to be sorted as the key. I chose to sort the table in
ascending order based on the salary.

Step 6

Step 7: Now Run the Mapping. Then navigate to “F:\Informatica\PCExpress\tomcat\bin\target” (that’s
where Informatica is installed on my system, F: Drive), and you can find your output target file. Open it
up to see the output.

Output:

Output

Records have been sorted in ascending order based on the Salary of the employees.
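
The expected ordering can be reproduced with a short Python sketch. The rows below are hypothetical; only the sort key (salary, ascending) comes from the procedure above.

```python
def sort_by_salary(rows, descending=False):
    """Return a new list of rows ordered by SAL (ascending by default)."""
    return sorted(rows, key=lambda row: row["SAL"], reverse=descending)


# Hypothetical rows, for illustration only.
rows = [
    {"ENAME": "Asha", "SAL": 65000},
    {"ENAME": "Kiran", "SAL": 18000},
    {"ENAME": "Ravi", "SAL": 42000},
]

ordered = sort_by_salary(rows)
print([row["ENAME"] for row in ordered])
```

Python's sorted is stable, so rows with equal salaries keep their original relative order.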


Aim: To apply Rank Transformation on a Flat File DataBase to display the top-ranked student record based
on the marks secured.
Theory: Rank transformation is an active and connected transformation. It is used to select the top or
bottom ranked rows of data, that is, the rows with the largest or smallest numeric/string values in the
chosen port. The integration service caches the input data and then performs the rank calculations.
Procedure:
Step 1: Create the source and target flat files (text files), with the first row being column name and
subsequent rows being records, separated by ‘,’ as a delimiter.

Step 1

Step 2: Start up Informatica Services and Launch Informatica Developer.

Step 3: Import your flat files (both source and target files) into mirmo5 project.

Step 3


Step 3

Click on Next and then Next again. In the next dialog box that opens, check the “Import Column Names
from the first line” checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.

Step 3

Step 4: Once the files have been imported, create a new mapping and name it Rank_Mapping.

Step 5: Import the source file with read mode and target file as write mode and drag and drop the Rank
Transformation from the Palette into the workspace. Give the necessary connections.

Step 5

Step 6: Select the Rank Transformation box. Navigate to Properties → Ports in the Properties Tab and
place a check mark on the column that you want to be ranked as the key. I chose to rank the table based
on the marks secured by the student in one of the three subjects (m1).

Step 6

Step 7: You can choose whether to rank from the lowest or the highest. You can also choose the number
of ranks to display. These settings can be configured by navigating to Properties → Advanced in the
Properties Tab, when the Rank Transformation box is selected in the workspace.

Step 7

Step 8: Make sure that the connections are intact. Now Run the Mapping. Then navigate to
“F:\Informatica\PCExpress\tomcat\bin\target” (that’s where Informatica is installed on my system, F:
Drive), and you can find your output target file. Open it up to see the output.

Output:

Output

The topmost record based on m1 marks has been displayed.
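
Selecting the top-ranked rows can be sketched in Python with the standard heapq module. The student rows and the name field are hypothetical; the rank port m1 and the choice of showing a single top record follow the procedure above.

```python
import heapq


def top_ranked(rows, port="m1", count=1):
    """Return the `count` rows with the largest values in the given port."""
    return heapq.nlargest(count, rows, key=lambda row: row[port])


# Hypothetical student rows, for illustration only.
students = [
    {"name": "Anil", "m1": 78, "m2": 66, "m3": 81},
    {"name": "Latha", "m1": 92, "m2": 70, "m3": 64},
    {"name": "Suresh", "m1": 85, "m2": 90, "m3": 77},
]

best = top_ranked(students)
print(best[0]["name"], best[0]["m1"])
```

Passing count=2 would return the two highest m1 scores, which corresponds to raising the number of ranks in Step 7; heapq.nsmallest would give the bottom ranks instead.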


Aim: To apply Joiner Transformation on a Flat File Database to perform Full Outer Join.
Theory: The joiner transformation provides you the option to create joins in Informatica. The joins
created using joiner transformation are similar to the joins in databases. The advantage of joiner
transformation is that joins can be created for heterogeneous systems (different databases).
Procedure:
Step 1: Create the following flat files Reserves(sid,bid,day), Sailors(sid,sname,age,rating) and populate
them with values.
Step 2: Start up Informatica Services and Launch Informatica Developer.
Step 3: Import your flat files into mirmo2 project.

Step 3

Click on Next and then Next again. In the next dialog box that opens, check the “Import Column Names
from the first line” checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.

Step 3
Step 4: Once the files have been imported, create a new mapping and name it Joiner_Mapping.
Step 5: Drag all the files in read mode and place them in a vertical column.


Step 5
Step 6: Now drag joiner from the palette and load the tables into it. (Both tables must have a common
column to join on.)

Step 6
Step 7: Navigate to Properties → Join in the Properties Tab. Specify the join type. In this case, we use
Full Outer Join. Then specify the join condition sid of Reserves = sid of Sailors, i.e., Master SID =
Detail SID1.

Step 7.1

Navigate to Properties → Advanced of the Properties Tab and put a check mark on the Sorted Input
option. This tells the Joiner that the input is already sorted on the join column; it does not sort the
data itself.

Step 7.2


Step 8: Now connect the output ports to the input ports of the Target File.

Step 8
Step 9: Now Run the Mapping. Then navigate to “F:\Informatica\PCExpress\tomcat\bin\target” (that’s
where Informatica is installed on my system, F: Drive), and you can find your output target file. Open it
up to see the output.

Output:

Output
Full Outer Join has been successfully performed.
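
The behaviour of a full outer join can be illustrated with a small Python sketch. The column names follow Reserves(sid,bid,day) and Sailors(sid,sname,age,rating) from Step 1; the data values and the function itself are assumptions made for illustration. Unmatched rows from either side are kept, with None in the columns of the other table.

```python
def full_outer_join(master, detail, key):
    """Join two lists of dict rows on `key`, keeping unmatched rows from both sides."""
    master_cols = sorted({c for row in master for c in row} - {key})
    detail_cols = sorted({c for row in detail for c in row} - {key})
    joined = []
    matched_detail = set()
    for m in master:
        matches = [i for i, d in enumerate(detail) if d[key] == m[key]]
        if matches:
            for i in matches:
                matched_detail.add(i)
                joined.append({**m, **detail[i]})
        else:
            # Master row without a partner: pad the detail columns with None.
            joined.append({**m, **{c: None for c in detail_cols}})
    for i, d in enumerate(detail):
        if i not in matched_detail:
            # Detail row without a partner: pad the master columns with None.
            joined.append({**{c: None for c in master_cols}, **d})
    return joined


# Hypothetical sample data.
sailors = [
    {"sid": 1, "sname": "Dustin", "age": 45, "rating": 7},
    {"sid": 2, "sname": "Lubber", "age": 55, "rating": 8},
]
reserves = [
    {"sid": 1, "bid": 101, "day": "2023-10-10"},
    {"sid": 4, "bid": 103, "day": "2023-11-12"},
]

joined = full_outer_join(sailors, reserves, "sid")
for row in joined:
    print(row)
```

With this data the result has three rows: one matched row for sid 1, one sailor without a reservation (sid 2) and one reservation without a sailor (sid 4).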

CLEMENTINE

1. Using BASKETS1n dataset select the data as given below

a) Customer age < 35 and count the customers who buy dairy and VEG products

b) Find the AVG income of customers who buy at least 5 products

c) Derive the field whose homeown is 'YES' and Age > 30 and sort data w.r.t. income in Ascending order, and
output only the item fields.

d) Find the mean value of salary (the income field) w.r.t age={Young, Middle, Senior}.

Input data set is applicable to all exercises in given problem statement: BASKETS1n

SOLUTION 1a)

Expected output:


Output dataset:

Procedure:

1. Specify the name of the file. You can enter a filename or click the ellipsis button (...) to select a file. The file path
is shown once you have selected a file, and its contents are displayed with delimiters in the panel below it.

2. Select the Var. File node from the Sources palette, then browse to C:\Program Files (x86)\SPSS
Clementine\11.1\Demos\BASKETS1n to load the baskets file.

3. From the Field Ops palette, add a Derive node of type Flag with the condition dairy = 'T' and cannedveg = 'T'
and fruitveg = 'T', and click OK. The derived flag (DnV_T) is true for the records that satisfy the condition.


4. From the Record Ops palette, add a Select node and give the condition as [ (age < 35) and DnV_T = 'T' ]

5. Only the records for which both conditions are true are passed on to be counted.

6. Add an Aggregate node to retrieve the sum and max of the records.


7. Select the output table and connect it to the Aggregate node.

8. Right-click on the output table and execute it.

9. Then click on Run; the output will now be displayed.
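
The count produced by this stream can be verified with a short Python sketch. The field names (age, dairy, cannedveg, fruitveg) and both conditions are taken from the steps above; the sample records are hypothetical.

```python
def count_young_dairy_veg(records, max_age=35):
    """Count customers younger than max_age who bought dairy and both veg items."""
    count = 0
    for rec in records:
        # Same flag as the Derive node: all three item fields must be 'T'.
        dnv_t = rec["dairy"] == "T" and rec["cannedveg"] == "T" and rec["fruitveg"] == "T"
        if rec["age"] < max_age and dnv_t:
            count += 1
    return count


# Hypothetical records, for illustration only.
records = [
    {"age": 28, "dairy": "T", "cannedveg": "T", "fruitveg": "T"},
    {"age": 31, "dairy": "T", "cannedveg": "F", "fruitveg": "T"},
    {"age": 47, "dairy": "T", "cannedveg": "T", "fruitveg": "T"},
    {"age": 22, "dairy": "T", "cannedveg": "T", "fruitveg": "T"},
]

young_count = count_young_dairy_veg(records)
print(young_count)
```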

SOLUTION 1b)

Expected output:

Output dataset:

Procedure:

1. Specify the name of the file. You can enter a filename or click the ellipsis button (...) to select a file. The file path
is shown once you have selected a file, and its contents are displayed with delimiters in the panel below it.

2. Select the Var. File node from the Sources palette, then browse to C:\Program Files (x86)\SPSS
Clementine\11.1\Demos\BASKETS1n to load the baskets file.

3. From the Record Ops palette, add a Select node and give the condition as shown in the figure.

4. Only the records for which the condition is true are passed on to be counted.

5. Add an Aggregate node to retrieve income_Mean and the count of the records.


6. Select the output table and connect it to the Aggregate node.

7. Right-click on the output table and execute it.

8. Then click on Run; the output will now be displayed.
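
The Select condition for this exercise appears only in the figure, so the following Python sketch works from the problem statement instead (average income of customers who buy at least 5 products). The list of item field names is an assumption about the BASKETS1n columns, and the records are hypothetical.

```python
# Assumed item columns of BASKETS1n; adjust to match the actual file.
ITEM_FIELDS = [
    "fruitveg", "freshmeat", "dairy", "cannedveg", "cannedmeat", "frozenmeal",
    "beer", "wine", "softdrink", "fish", "confectionery",
]


def mean_income_of_big_baskets(records, min_items=5):
    """Return the mean income of customers who bought at least min_items products."""
    incomes = [
        rec["income"]
        for rec in records
        if sum(rec.get(field) == "T" for field in ITEM_FIELDS) >= min_items
    ]
    return sum(incomes) / len(incomes) if incomes else None


def make_record(income, bought):
    """Build a record with 'T' for the bought items and 'F' for the rest."""
    rec = {field: ("T" if field in bought else "F") for field in ITEM_FIELDS}
    rec["income"] = income
    return rec


# Hypothetical records: 5 items, 6 items and 2 items bought.
records = [
    make_record(20000, ITEM_FIELDS[:5]),
    make_record(30000, ITEM_FIELDS[:6]),
    make_record(99999, ITEM_FIELDS[:2]),
]

average = mean_income_of_big_baskets(records)
print(average)
```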

SOLUTION 1c)

Expected output:

Output dataset:


Procedure:

1. Specify the name of the file. You can enter a filename or click the ellipsis button (...) to select a file. The file path
is shown once you have selected a file, and its contents are displayed with delimiters in the panel below it.

2. Select the Var. File node from the Sources palette, then browse to C:\Program Files (x86)\SPSS
Clementine\11.1\Demos\BASKETS1n to load the baskets file.

3. From the Field Ops palette, add a Derive node and give the condition as shown in the figure.

4. Connect Derive to a Select node with the condition shown in the figure.


5. Connect Select to a Sort node to sort Income in ascending order.

6. Connect Sort to a Filter node to filter out all the non-item fields.


7. Select the output table and connect it to the Filter node.

8. Right-click on the output table and execute it.

9. Then click on Run; the output will now be displayed.

SOLUTION 1d)

Expected output:

Output dataset:


Procedure:

1. Specify the name of the file. You can enter a filename or click the ellipsis button (...) to select a file. The file path
is shown once you have selected a file, and its contents are displayed with delimiters in the panel below it.

2. Select the Var. File node from the Sources palette, then browse to C:\Program Files (x86)\SPSS
Clementine\11.1\Demos\BASKETS1n to load the baskets file.

3. From the Field Ops palette, add a Binning node and give the settings as shown in the figure.


4. Connect Binning to a Type node to read the data types and values, as shown in the figure.

5. Connect Type to a Reclassify node to reclassify the binned age values 1, 2, 3 as Young, Middle, Senior respectively.


6. Connect Reclassify to an Aggregate node to get the income_Mean w.r.t. the different age categories.


7. Select the output table and connect it to the Aggregate node.

8. Right-click on the output table and execute it.

9. Then click on Run; the output will now be displayed.
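
The bin, reclassify and aggregate sequence above can be sketched in Python. The Binning node settings are shown only in the figure, so this sketch assumes three equal-width bins over the observed age range; that choice, the helper names and the sample records are all assumptions. The mapping of bins 1, 2, 3 to Young, Middle, Senior follows step 5.

```python
from collections import defaultdict

# Reclassify step: bin index to age category.
LABELS = {1: "Young", 2: "Middle", 3: "Senior"}


def bin_age(age, lowest, highest, bins=3):
    """Return the 1-based index of the equal-width bin that contains age."""
    width = (highest - lowest) / bins
    if width == 0:
        return 1
    # The maximum age would fall just past the last bin, so clamp it.
    return min(int((age - lowest) / width), bins - 1) + 1


def mean_income_by_age_group(records):
    """Bin ages, relabel the bins, and return the mean income per label."""
    ages = [rec["age"] for rec in records]
    lowest, highest = min(ages), max(ages)
    incomes = defaultdict(list)
    for rec in records:
        label = LABELS[bin_age(rec["age"], lowest, highest)]
        incomes[label].append(rec["income"])
    return {label: sum(vals) / len(vals) for label, vals in incomes.items()}


# Hypothetical records, for illustration only.
records = [
    {"age": 20, "income": 10000}, {"age": 25, "income": 20000},
    {"age": 40, "income": 30000}, {"age": 45, "income": 50000},
    {"age": 60, "income": 40000}, {"age": 70, "income": 60000},
]

means = mean_income_by_age_group(records)
print(means)
```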


2. Using DRUG3n and DRUG4n datasets select the data as given below

a) Select 50% of the records of the drug type with the maximum number of records, with no restrictions on the
remaining drugs, and plot a histogram of age w.r.t BP

b) Take the equal number of samples of each drug and calculate the Std. Dev. of age w.r.t drug and compare it
with complete data Std. Dev. of age w.r.t drug and give a conclusion statement.

c) List 5 strong associations of attribute values, and derive and display the data.

d) Append DRUG2n dataset to given datasets and consider distinct values of Age.

e) Using the above 3 datasets (DRUG2n, DRUG3n, DRUG4n) perform the following

i) Young_Age <=30, Middle_Age >30 and <=50, Senior_Age >50

ii) Multi plot the above Age categories with Na and K and drug

Input data set is applicable to all exercises in given problem statement: DRUG3n and DRUG4n (for exercises d
and e, DRUG2n is also used)


SOLUTION 2a)

Expected output:

Output dataset/ graph:


Procedure: The following are the nodes used for this exercise with respective settings.

APPEND:


Aggregate: to get the count of all Drug types as shown in following output table.

Table: from the following output we can identify that ‘drugY’ has maximum number of records when compared to
remaining Drug types.


Balance: for selecting 50% of records for ‘drugY’


Histogram: The following Histogram node gives the output of Age w.r.t. BP

Table: It is an output of the records after selecting 50% of ‘drugY’ and no restrictions on remaining Drugs


SOLUTION 2b)

Expected output:

Output dataset:

When the above sample data output and complete data output are compared, the Standard Deviation of Age w.r.t.
each drug type is almost the same. There is a slight difference in the Standard Deviation of Age w.r.t. drugX,
drugY and drugC in the sample data, whereas in the complete data the Standard Deviation of Age for these drug
types shows only a minor difference.

Procedure: The following are the nodes used for this exercise with respective settings.

APPEND


SELECT and SAMPLE: The same procedure is followed for the remaining drug types, so that an equal sample of 20
records is selected for each drug type

APPEND: Appending all samples

Sample Aggregate: Aggregate of Sample data


Complete Aggregate: Aggregate of Complete data
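
The comparison between the sample and the complete data can be sketched in Python as follows. The records are synthetic, the field names Drug and Age are taken from the exercise, and the sample standard deviation from the statistics module is used here; whether the Aggregate node uses the sample or the population formula is not stated in this record, so treat that as an assumption.

```python
import random
import statistics
from collections import Counter, defaultdict


def stdev_by_drug(records):
    """Return the sample standard deviation of Age for each drug type."""
    ages = defaultdict(list)
    for rec in records:
        ages[rec["Drug"]].append(rec["Age"])
    return {drug: statistics.stdev(vals) for drug, vals in ages.items() if len(vals) > 1}


def equal_sample(records, per_drug, seed=0):
    """Draw the same number of records at random from every drug type."""
    rng = random.Random(seed)
    by_drug = defaultdict(list)
    for rec in records:
        by_drug[rec["Drug"]].append(rec)
    sample = []
    for recs in by_drug.values():
        sample.extend(rng.sample(recs, min(per_drug, len(recs))))
    return sample


# Synthetic data: 12 records per drug type with random ages.
data_rng = random.Random(42)
records = [
    {"Drug": drug, "Age": data_rng.randint(15, 74)}
    for drug in ("drugX", "drugY", "drugC")
    for _ in range(12)
]

sample = equal_sample(records, per_drug=5)
sample_counts = Counter(rec["Drug"] for rec in sample)
sample_sd = stdev_by_drug(sample)
complete_sd = stdev_by_drug(records)
for drug in complete_sd:
    print(drug, round(sample_sd[drug], 2), round(complete_sd[drug], 2))
```

In the exercise itself 20 records are sampled per drug type; the smaller number here only keeps the synthetic example short.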


SOLUTION 2c)

Expected output:


Output dataset: The following is the output for Sex = ‘M’ and Cholesterol = ‘High’

Procedure: The following are the nodes used for this exercise with respective settings.

APPEND

WEB: Plotting the web for Sex, BP, Cholesterol and Drug to get 5 strong associations

When the web is created showing the 5 strong links, we derive a node for every link by right-clicking on the
link and generating a Derive node for it.


The following is the configuration of a Derive node generated from a link.

SOLUTION 2d)

Expected output:

Output dataset: The following output is showing the records with distinct value of ages.

Procedure: The following are the nodes used for this exercise with respective settings.

Append:

The appended data sets are arranged by age and exported to the output graphs.

Distinct Age:


SOLUTION 2e)

Expected output:

Output dataset/graph:

Above: Multi plot of the Young_Age category with Na, K and Drug


Above: Multi plot of the Middle_Age category with Na, K and Drug

Above: Multi plot of the Senior_Age category with Na, K and Drug


Procedure: The following are the nodes used for this exercise with respective settings.

Append:

The appended data sets are arranged by age and exported to the output graphs.

Select Age = Young:


Select Age = Middle:

Select Age = Senior:


For each Select of age we use Multi Plot as shown in output.
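
The three Select conditions correspond to the age boundaries given in the problem statement (Young_Age <= 30, Middle_Age > 30 and <= 50, Senior_Age > 50). A minimal Python sketch of that categorisation is given below; the function name and sample ages are assumptions.

```python
def age_category(age):
    """Map an age to Young_Age, Middle_Age or Senior_Age using the exercise boundaries."""
    if age <= 30:
        return "Young_Age"
    if age <= 50:
        return "Middle_Age"
    return "Senior_Age"


# Hypothetical ages, chosen to include the boundary values.
for age in (23, 30, 31, 50, 51, 68):
    print(age, age_category(age))
```

The boundary values 30 and 50 fall into the lower category in each case, which matches the "<=" in the problem statement.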

Exercise 3

Using BASKETS1N

a) Find the association rules only for items using Apriori model with minimum support 3% and confidence
90%.


b) Compare the usage of GRI and Apriori with support 22% and confidence 90% (prepare a sample data set in a
spreadsheet)

c) Determine the importance of Age, Cholesterol and BP for the Drug (Drug4n) and compare the C5.0 and Neural
Net models

d) Determine the importance of the attributes using K-Means from Drug3n and Drug4n datasets

SOLUTION 3a)

Expected output:

Input data set:

Output dataset:

Procedure: The following are the nodes used for this exercise with respective settings.

Type: Using this node we read the values and type of each attribute. The non-item attributes are given the
direction None, and all item attributes are given the direction Both (input and output) for the Apriori Model.

Model : The below diagram represents the settings for the Apriori Model.

When we execute this, the Apriori Model is built; once we browse the model, we can see the resultant rules as output.
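
To make the support and confidence thresholds concrete, the sketch below mines rules from a tiny hypothetical set of baskets in Python. It enumerates itemsets by brute force rather than using Apriori's level-wise candidate pruning, so it only illustrates the thresholds, not the algorithm inside the tool. Support is computed here as the fraction of baskets containing every item of the rule; some tools report support for the antecedent only, so the numbers need not match the model browser.

```python
from itertools import combinations


def mine_rules(baskets, min_support, min_confidence, max_size=3):
    """Return (antecedent, consequent, support, confidence) for rules that pass both thresholds."""
    n = len(baskets)
    items = sorted(set().union(*baskets))

    def support(itemset):
        # Fraction of baskets that contain every item in the itemset.
        return sum(1 for basket in baskets if itemset <= basket) / n

    rules = []
    for size in range(2, max_size + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            sup = support(itemset)
            if sup == 0 or sup < min_support:
                continue
            for consequent in combo:
                antecedent = itemset - {consequent}
                conf = sup / support(antecedent)
                if conf >= min_confidence:
                    rules.append((tuple(sorted(antecedent)), consequent, sup, conf))
    return rules


# Hypothetical baskets, for illustration only.
baskets = [
    {"beer", "cannedveg", "frozenmeal"},
    {"beer", "cannedveg", "frozenmeal"},
    {"beer", "cannedveg"},
    {"fish", "fruitveg"},
    {"fish"},
]

rules = mine_rules(baskets, min_support=0.3, min_confidence=0.9)
for antecedent, consequent, sup, conf in rules:
    print(antecedent, "->", consequent, round(sup, 2), round(conf, 2))
```

For example, the rule cannedveg -> beer passes because every basket with cannedveg also contains beer, while beer -> frozenmeal fails the 90% confidence threshold because only two of the three beer baskets contain frozenmeal.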


SOLUTION 3b)

Expected output:

Input data set: An Excel file is prepared as shown below


Output dataset:

The outputs below show that there is no difference in the rules generated by Apriori and GRI (Generalized
Rule Induction), but their order differs. In Apriori, the rules for the low-level frequent itemsets are generated
first, followed by the rules for the next level of frequent itemsets. In GRI, the rules are generated item by
item: first the largest rule, then the smallest rule for one item, then the same for the subsequent items.

Procedure: The following are the nodes used for this exercise with respective settings.

Settings for Apriori

Settings for GRI


SOLUTION 3c)

Expected output:

Input data set:


Output dataset:

Output of Neural Net Model


Output of C5.0 Model

Output of C5.0 Model viewer which is shown in tree format


Comparison: Attribute importance is as follows

Neural Net Model: BP - 0.2928, Age - 0.1070, Cholesterol - 0.1023

C5.0: A 2-level decision tree is built with BP as the root attribute (level 1) and the Age and
Cholesterol attributes at level 2

** As a result, the Neural Net and C5.0 models give the same information.


Procedure: The following are the nodes used for this exercise with respective settings.

Settings for C5.0 Model


Settings for Neural Net Model


SOLUTION 3d)

Expected output:

Input data set:


Output dataset:

The following output shows which attributes are important. (The unimportant attributes are Sex and Age)


Procedure: The following are the nodes used for this exercise with respective settings.

Append


Setting for K-Means

Weka

A) To open the dataset, follow this path:


Open file → Local Disk C → Program Files → weka3.8.6 → data → weather.nominal

B) Next, click on Choose under Filter, expand unsupervised and select instance.
Now choose RemoveWithValues and do the following steps.

Then click on OK and then Apply.


C) Open iris dataset from Openfile option in WEKA.
