DWDM Complete Record
Step 2: Once the batch file runs its course, navigate to localhost:7009 in your favourite browser. This is the login page for the Administrator Server. Log in with the credentials that were supplied during the install process.
Step 3: Once the server page starts up, we are ready to launch Informatica Developer from the Search Bar.
Theory: The Filter Transformation takes one table as input and writes into another table all the tuples that match a certain condition. Tuples that do not meet the condition are not written to the target; the source is left untouched.
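To make the semantics concrete outside Informatica, here is a minimal pandas sketch of the same row-dropping behaviour (the column names and the SAL > 2000 condition are hypothetical, purely for illustration):

import pandas as pd

# Hypothetical employee records standing in for the source flat file.
src = pd.DataFrame({
    "EMPNO": [101, 102, 103],
    "SAL":   [1500, 3200, 2700],
})

# A filter keeps only the rows that satisfy the condition;
# the rest never reach the target.
target = src[src["SAL"] > 2000]
print(target)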
Procedure:
Step 1: Create the source and target flat files (text files), with the first row being column name and
subsequent rows being records, separated by ‘,’ as a delimiter.
Step 3: Import your flat files (both source and target files) into mirmo1 project. Click on Next and then
Next again. In the next dialog box that opens, check the “Import Column Names from the first line”
checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.
Step 4: Once the files have been imported, create a new mapping and name it Filter_Mapping.
Step 5: Import the source file in Read mode and the target file in Write mode, then drag and drop the Filter Transformation from the Palette into the workspace. Make the necessary connections.
Step 6: Make sure that Filter is selected. Then navigate to Properties → Filter and specify the condition in the Filter Condition box. Click on Validate Expression at the right corner of the Properties Tab to check the validity of the Filter Condition.
Step 8: Make sure that the connections are intact. Now Run the Mapping. Then navigate to “F:\
Informatica\PCExpress\tomcat\bin\target” (that’s where Informatica is installed on my system, F:
Drive), and you can find your output target file. Open it up to see the output.
Output:
Theory: A Filter Transformation routes only the tuples matching a single condition to one table, leaving the remaining tuples behind. A Router Transformation routes data to multiple tables based on several conditions.
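As a rough pandas sketch of the routing idea (the DEPT-based group conditions are hypothetical, not the exercise's actual ones):

import pandas as pd

src = pd.DataFrame({
    "EMPNO": [101, 102, 103, 104],
    "DEPT":  ["HR", "IT", "HR", "SALES"],
})

# Each group has its own condition; rows matching no group
# fall through to the Default group.
tgt1 = src[src["DEPT"] == "HR"]                  # Group
tgt2 = src[src["DEPT"] == "IT"]                  # Group1
tgt3 = src[~src["DEPT"].isin(["HR", "IT"])]      # Default
print(tgt1, tgt2, tgt3, sep="\n")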
Procedure:
Step 1: Create the source and target flat files (text files), with the first row being column name and
subsequent rows being records, separated by ‘,’ as a delimiter.
Step 3: Import your flat files (both source and target files) into mirmo2 project.
Click on Next and then Next again. In the next dialog box that opens, check the “Import Column Names
from the first line” checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.
Step 4: Once the files have been imported, create a new mapping and name it Mapping_Router.
Step 5: Drag the SOURCE table from the Object Explorer into the Mapping_Router tab. Set Physical Data Object Access to Read, as we use this table only to take inputs. Then drag the TGT1, TGT2 and TGT3 tables and set Physical Data Object Access to Write, as we use these tables to store the output of the Router transformation. Select the Router transformation from the Palette and drag it onto the Mapping tab. Select all entries from Read_SOURCE and drop them onto the Router.
Step 6: Make sure Router is selected in the Mapping_Router tab. Navigate to Properties → Groups in the Properties Bar and click the New button to add a new group. Then, in the GROUP FILTER CONDITION for “Group”, click on the small arrow in the corner and give the filter condition for the group. Create as many groups as required.
Step 7: Select “Group” (Ctrl + A) and map it to “Write_TGT1”. Similarly, map “Group1” to “Write_TGT2” and “Default” to “Write_TGT3”.
Step 8: Make sure Router is selected. Then click on Run in the menu bar and select Run Mapping.
Output:
Step 1: Create the source and target flat files (text files), with the first row being column name and subsequent rows being records, separated by ‘,’ as a delimiter.
Step 3: Import your flat files (both source and target files) into the project.
Click on Next and then Next again. In the next dialog box that opens, check the “Import Column Names
from the first line” checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.
Step 4: Once the files have been imported, create a new mapping and name it Expression_Mapping.
Step 5: Drag the source (EMPSRC) table into the workspace (in read mode) and the target (EMPTGT1)
table (in write mode). Also drag the Expression Transformation from the palette.
Step 6: Drag and drop the values of the target table into the Expression Transformation. Also link up the input table and the output table via the Expression Transformation appropriately. Make sure that the Transformation is selected. Then navigate to Properties → Ports in the Properties Bar. Unselect the Input mark for TOTALSAL. This ensures that it can only act as an output port after evaluation of the expression specified.
Step 6
Step 7: In the Expression Section of the entry, click on the small arrow head to open up the Expression Dialog Box and enter the following expression: “IIF(ISNULL(COMM), SAL, SAL + COMM)”. If COMM is null, TOTALSAL falls back to SAL alone; otherwise it is the sum of SAL and COMM.
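For reference, a small pandas/NumPy sketch of the same IIF(ISNULL(...)) logic, with made-up SAL and COMM values:

import numpy as np
import pandas as pd

emp = pd.DataFrame({
    "SAL":  [3000, 2500, 4000],
    "COMM": [500, None, None],
})

# IIF(ISNULL(COMM), SAL, SAL + COMM): fall back to SAL alone
# whenever COMM is null.
emp["TOTALSAL"] = np.where(emp["COMM"].isnull(), emp["SAL"], emp["SAL"] + emp["COMM"])
print(emp)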
Step 8: Now Run the Mapping by navigating to Run → Run Mapping in the Menu Bar. Then navigate to “F:\Informatica\PCExpress\tomcat\bin\target” (that’s where Informatica is installed on my system, F: Drive), and you can find your output target file. Open it up to see the output.
Aim: To apply an Aggregator Transformation on a Data Set to calculate Minimum and Maximum Salary
in the table.
Theory: The Aggregator Transformation is an active transformation used to perform calculations such as sums, averages and counts on groups of data. The Integration Service stores the group and row data in the aggregate cache. The Aggregator Transformation offers advantages over plain SQL aggregation; for example, you can use conditional clauses to filter rows.
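Outside Informatica, the MIN_SAL, MAX_SAL and COUNT ports built below can be sketched with a pandas group-by (the DEPT grouping and salary values are made up for illustration):

import pandas as pd

emp = pd.DataFrame({
    "DEPT": ["HR", "HR", "IT", "IT"],
    "SAL":  [2000, 3500, 2800, 4100],
})

# Group-wise aggregates corresponding to the MIN(SAL), MAX(SAL)
# and COUNT expressions.
agg = emp.groupby("DEPT")["SAL"].agg(MIN_SAL="min", MAX_SAL="max", COUNT="count")
print(agg)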
Procedure:
Step 1: Create the source and target flat files (text files), with the first row being column name and
subsequent rows being records, separated by ‘,’ as a delimiter.
Step 3: Import your flat files (both source and target files) into the project.
Click on Next and then Next again. In the next dialog box that opens, check the “Import Column Names
from the first line” checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.
Step 4: Once the files have been imported, create a new mapping and name it Aggregator_Mapping.
Step 5: Drag the source (EMPSRC) table into the workspace (in read mode) and the target
(AGGREGATOR_TARGET) table (in write mode). Delete all the non-required entries from the tables. Also
drag the Aggregator Transformation from the palette. Drag and drop the values of the target table into
the Aggregator Transformation. Link up the input table and the output table via the Aggregator
Transformation appropriately. Add MIN_SAL, MAX_SAL and COUNT entries to the Transformation as
they are required for the evaluation.
Step 6: Make sure that the Transformation is selected. Then navigate to Properties → Ports in the Properties Bar. Unselect the Input mark for MIN_SAL and MAX_SAL. This ensures that they can only act as output ports after evaluation of the expressions specified.
Step 7: In the Expression Section of the MIN_SAL entry, click on the small arrow head to open up the Expression Dialog Box, navigate to Functions → Aggregate and select the MIN function. This function finds the least value among all the values and returns it. Then navigate to Ports and select SAL, so that the expression reads “MIN(SAL)”. Do the same for MAX_SAL with the MAX function, so that its expression reads “MAX(SAL)”. Don’t forget to validate the expressions for errors.
Step 8: Now Run the Mapping by navigating to Run → Run Mapping in the Menu Bar.
Output:
Aggregator Transformation was successful. Max and Min salaries for the table were obtained.
Aim: To apply a Sorter Transformation on a Flat File Database to display Employee Salaries in Ascending Order.
Theory: The Sorter Transformation is used to sort data from relational or flat file sources. It can also perform case-sensitive sorting and can be set to make the output rows distinct.
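A minimal pandas sketch of the same sort-plus-distinct behaviour (made-up employee rows):

import pandas as pd

emp = pd.DataFrame({
    "ENAME": ["Asha", "Ravi", "Kiran"],
    "SAL":   [3200, 1500, 2700],
})

# Sort key = SAL, ascending; drop_duplicates mirrors the Distinct option.
sorted_emp = emp.sort_values("SAL", ascending=True).drop_duplicates()
print(sorted_emp)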
Procedure:
Step 1: Create the source and target flat files (text files), with the first row being column name and
subsequent rows being records, separated by ‘,’ as a delimiter.
Step 3: Create a new project (mirmo5) and import your flat files (both source and target files).
Click on Next and then Next again. In the next dialog box that opens, check the “Import Column Names
from the first line” checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.
Step 4: Once the files have been imported, create a new mapping and name it Sorter_Mapping.
Step 5: Import the source file in Read mode and the target file in Write mode, then drag and drop the Sorter Transformation from the Palette into the workspace. Make the necessary connections.
Step 6: Select the Sorter Transformation box. Navigate to Properties → Ports in the Properties Tab and place a check mark on the column that you want to use as the sort key. I chose to sort the table in ascending order based on the salary.
Output:
Records have been sorted in ascending order based on the Salary of the employees.
Aim: To apply a Rank Transformation on a Flat File Database to display the top students as per the marks secured.
Theory: The Rank Transformation is an active and connected transformation used to select the top or bottom rank of data, i.e., the smallest or largest numeric/string values. The Integration Service caches the input data and then performs the rank calculations.
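A small pandas sketch of the top-N idea behind the Rank transformation (hypothetical student marks; nlargest stands in for the rank calculation on the m1 port):

import pandas as pd

marks = pd.DataFrame({
    "SNAME": ["Anu", "Bala", "Chitra", "Dev"],
    "M1":    [78, 92, 85, 67],
})

# Top 3 rows by the rank key M1, like a Rank transformation
# configured to keep the top 3.
top3 = marks.nlargest(3, "M1")
print(top3)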
Procedure:
Step 1: Create the source and target flat files (text files), with the first row being column name and
subsequent rows being records, separated by ‘,’ as a delimiter.
Step 3: Import your flat files (both source and target files) into mirmo5 project.
Click on Next and then Next again. In the next dialog box that opens, check the “Import Column Names
from the first line” checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.
Step 5: Import the source file in Read mode and the target file in Write mode, then drag and drop the Rank Transformation from the Palette into the workspace. Make the necessary connections.
Step 6: Select the Rank Transformation box. Navigate to Properties → Ports in the Properties Tab and place a check mark on the column that you want to use as the rank key. I chose to rank the table based on the marks secured by the student in one of the three subjects (m1).
Step 8: Make sure that the connections are intact. Now Run the Mapping. Then navigate to “F:\
Informatica\PCExpress\tomcat\bin\target” (that’s where Informatica is installed on my system, F:
Drive), and you can find your output target file. Open it up to see the output.
Output:
Aim: To apply Joiner Transformation on a Flat File Database to perform Full Outer Join.
Theory: The Joiner Transformation lets you create joins in Informatica. The joins created using the Joiner Transformation are similar to joins in databases. Its advantage is that joins can be created across heterogeneous systems (different databases).
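The full outer join built in the steps below can be sketched in pandas using the Reserves/Sailors schemas from Step 1 (the row values are made up):

import pandas as pd

reserves = pd.DataFrame({"sid": [1, 2], "bid": [101, 102], "day": ["Mon", "Tue"]})
sailors = pd.DataFrame({"sid": [2, 3], "sname": ["Lubber", "Rusty"],
                        "age": [25, 35], "rating": [8, 10]})

# Full outer join on the common column sid (Master SID = Detail SID):
# unmatched rows from both sides are kept, with nulls filled in.
joined = reserves.merge(sailors, on="sid", how="outer")
print(joined)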
Procedure:
Step 1: Create the following flat files Reserves(sid,bid,day), Sailors(sid,sname,age,rating) and populate
them with values.
Step 2: Start up Informatica Services and Launch Informatica Developer.
Step 3: Import your flat files into mirmo2 project.
Click on Next and then Next again. In the next dialog box that opens, check the “Import Column Names
from the first line” checkbox and make sure the delimiter selected is ‘,’. Then click on Finish.
Step 4: Once the files have been imported, create a new mapping and name it Joiner_Mapping.
Step 5: Drag all the files in read mode and place them in a vertical column.
Step 6: Now drag the Joiner from the Palette and load the tables into it. (Both tables must have a common column.)
Step 7: Navigate to Properties → Join in the Properties Tab. Specify the join type; in this case, we use Full Outer Join. Then specify the join condition sid of Reserves = sid of Sailors, i.e., Master SID = Detail SID1.
Navigate to Properties → Advanced in the Properties Tab and put a check mark on the Sorted Input option so that the input is sorted.
Step 8: Now connect the output ports to the input ports of the Target File.
Step 9: Now Run the Mapping. Then navigate to “F:\Informatica\PCExpress\tomcat\bin\target” (that’s
where Informatica is installed on my system, F: Drive), and you can find your output target file. Open it
up to see the output.
Output:
Full Outer Join has been successfully performed.
Exercise 1
Using BASKETS1n
a) Select customers with age < 35 and count the customers who buy dairy and veg products
c) Derive the field whose homeown is 'YES' and Age > 30, sort the data w.r.t. income in ascending order, and output only the item fields.
Input data set is applicable to all exercises in given problem statement: BASKETS1n
SOLUTION 1a)
Expected output:
Output dataset:
Procedure:
1. Specify the name of the file. You can enter a filename or click the ellipsis button (...) to select a file. The file path
is shown once you have selected a file, and its contents are displayed with delimiters in the panel below it.
2. Select Var. File from Sources, then go to C:\Program Files (x86)\SPSS Clementine\11.1\Demos\BASKETS1n to get the BASKETS1n file.
3. Go to Field Ops and select Derive (flag), give the condition dairy = 'T' and cannedveg = 'T' and fruitveg = 'T', and click OK. Based on the conditions, the truth values are shown and records are selected.
4. Go to Record Ops, search for Select, and give the condition [ (age < 35) and DnV_T = 'T' ].
6. Select the Aggregate operation to retrieve the sum and max of the records.
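The derive/select/aggregate chain above can be mirrored in pandas (the basket rows are made up; DnV_T is the flag derived in step 3):

import pandas as pd

baskets = pd.DataFrame({
    "age":       [28, 41, 33, 30],
    "dairy":     ["T", "T", "F", "T"],
    "cannedveg": ["T", "F", "T", "T"],
    "fruitveg":  ["T", "T", "T", "T"],
})

# Derive flag: DnV_T = 'T' when all three product flags are 'T'.
flag = (baskets["dairy"] == "T") & (baskets["cannedveg"] == "T") & (baskets["fruitveg"] == "T")
baskets["DnV_T"] = flag.map({True: "T", False: "F"})

# Select: [ (age < 35) and DnV_T = 'T' ].
selected = baskets[(baskets["age"] < 35) & (baskets["DnV_T"] == "T")]

# Aggregate: count of the selected customers.
print(len(selected))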
SOLUTION 1b)
Expected output:
Procedure:
1. Specify the name of the file. You can enter a filename or click the ellipsis button (...) to select a file. The file path
is shown once you have selected a file, and its contents are displayed with delimiters in the panel below it.
2. Select Var. File from Sources, then go to C:\Program Files (x86)\SPSS Clementine\11.1\Demos\BASKETS1n to get the BASKETS1n file.
3. Go to Record Ops, search for Select, and give the condition as shown in the figure.
5. Select the Aggregate operation to retrieve income_Mean and the count of the records.
SOLUTION 1c)
Expected output:
Output dataset:
Procedure:
1. Specify the name of the file. You can enter a filename or click the ellipsis button (...) to select a file. The file path
is shown once you have selected a file, and its contents are displayed with delimiters in the panel below it.
2. Select Var. File from Sources, then go to C:\Program Files (x86)\SPSS Clementine\11.1\Demos\BASKETS1n to get the BASKETS1n file.
3. Go to Field Ops, search for Derive, and give the condition as shown in the figure.
SOLUTION 1d)
Expected output:
Output dataset:
Procedure:
1. Specify the name of the file. You can enter a filename or click the ellipsis button (...) to select a file. The file path
is shown once you have selected a file, and its contents are displayed with delimiters in the panel below it.
2. Select Var. File from Sources, then go to C:\Program Files (x86)\SPSS Clementine\11.1\Demos\BASKETS1n to get the BASKETS1n file.
3. Go to Field Ops, search for Binning, and give the condition as shown in the figure.
4. Connect Binning to Type to read the data types and values, as shown in the figure.
5. Connect Type to Reclassify to classify ages binned as 1, 2, 3 into Young, Middle and Senior respectively.
6. Connect Reclassify to Aggregate to get the income_Mean w.r.t. the different age categories.
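The Binning → Type → Reclassify → Aggregate chain can be mirrored in pandas (made-up ages and incomes; three equal-width bins stand in for the Binning node's settings, which are an assumption here):

import pandas as pd

customers = pd.DataFrame({
    "age":    [22, 45, 63, 31, 58],
    "income": [18000, 32000, 27000, 24000, 41000],
})

# Binning: three age bins labelled 1, 2, 3.
customers["age_bin"] = pd.cut(customers["age"], bins=3, labels=[1, 2, 3])

# Reclassify: bins 1, 2, 3 become Young, Middle, Senior.
customers["age_cat"] = customers["age_bin"].map({1: "Young", 2: "Middle", 3: "Senior"})

# Aggregate: mean income per age category.
print(customers.groupby("age_cat", observed=True)["income"].mean())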
Exercise 2
Using DRUG3n and DRUG4n datasets, select the data as given below
a) Select 50% of the records where the maximum-occurring drug type is present, with no restrictions on the remaining drugs, and use a histogram graph of age w.r.t. BP
b) Take an equal number of samples of each drug and calculate the Std. Dev. of age w.r.t. drug, compare it with the complete-data Std. Dev. of age w.r.t. drug, and give a conclusion statement.
c) List 5 strong associations of attribute values, and derive and display the data.
d) Append DRUG2n dataset to given datasets and consider distinct values of Age.
e) Using the above 3 datasets (DRUG2n, DRUG3n, DRUG4n) perform the following
ii) Multiplot the above Age categories with Na, K and drug
Input data set applicable to all exercises in the given problem statement: DRUG3n and DRUG4n (for exercises d and e, DRUG2n is also used)
SOLUTION 2a)
Expected output:
Procedure: The following are the nodes used for this exercise with respective settings.
APPEND:
Aggregate: to get the count of all Drug types as shown in following output table.
Table: from the following output we can identify that ‘drugY’ has the maximum number of records compared to the remaining drug types.
Histogram: The following Histogram node gives the output of Age w.r.t. BP
Table: this is the output of the records after selecting 50% of ‘drugY’ with no restrictions on the remaining drugs.
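A pandas sketch of the select-and-sample step (made-up rows; ‘drugY’ stands for the majority drug identified above):

import pandas as pd

drugs = pd.DataFrame({
    "Age":  [23, 47, 61, 35, 52, 29],
    "Drug": ["drugY", "drugY", "drugY", "drugY", "drugX", "drugC"],
})

# 50% random sample of the majority drug, all rows of the rest.
majority = drugs[drugs["Drug"] == "drugY"].sample(frac=0.5, random_state=0)
rest = drugs[drugs["Drug"] != "drugY"]
result = pd.concat([majority, rest])
print(result)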
SOLUTION 2b)
Expected output:
When the above sample data output and the complete data output are compared, the Standard Deviation of Age w.r.t. each drug type is almost similar. There is a slight difference in the Standard Deviation of Age w.r.t. drugX, drugY and drugC in the sample data, whereas in the complete data the Standard Deviation of Age for those drug types differs only minimally.
Procedure: The following are the nodes used for this exercise with respective settings.
APPEND
SELECT and SAMPLE: This procedure is followed for the remaining drug types, where 20 equal samples of each drug type are selected.
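A pandas sketch of the equal-sample comparison (made-up rows; n=2 per drug here instead of the exercise's 20):

import pandas as pd

drugs = pd.DataFrame({
    "Age":  [23, 47, 61, 35, 52, 29, 44, 50],
    "Drug": ["drugY", "drugY", "drugX", "drugX", "drugC", "drugC", "drugY", "drugX"],
})

# Equal-sized sample per drug type.
sample = drugs.groupby("Drug", group_keys=False).sample(n=2, random_state=0)

# Std. Dev. of Age w.r.t. drug: sample vs. complete data.
print(sample.groupby("Drug")["Age"].std())
print(drugs.groupby("Drug")["Age"].std())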
SOLUTION 2c)
Expected output:
Output dataset: The following is the output for Sex = ‘M’ and Cholesterol = ‘High’
Procedure: The following are the nodes used for this exercise with respective settings.
WEB: Plotting the web for Sex, BP, Cholesterol and Drug to get 5 strong associations.
Once the web is created showing the 5 strong links, we generate a Derive node for every link by right-clicking on the link and choosing to generate a Derive node for it.
When a Derive node is created from a link, the following is the configuration of the Derive node.
SOLUTION 2d)
Expected output:
Output dataset: The following output is showing the records with distinct value of ages.
Append:
The appended data sets, taken in accordance with age, are exported to the output graphs.
Distinct Age:
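Outside Clementine, the append-then-distinct step looks like this in pandas (tiny made-up stand-ins for the three data sets):

import pandas as pd

drug2n = pd.DataFrame({"Age": [23, 35], "Drug": ["drugX", "drugY"]})
drug3n = pd.DataFrame({"Age": [23, 47], "Drug": ["drugC", "drugX"]})
drug4n = pd.DataFrame({"Age": [52, 35], "Drug": ["drugY", "drugA"]})

# Append the three data sets, then keep one record per distinct Age.
appended = pd.concat([drug2n, drug3n, drug4n], ignore_index=True)
distinct_age = appended.drop_duplicates(subset="Age")
print(distinct_age)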
SOLUTION 2e)
Expected output:
Output dataset/graph:
Above is the multiplot of the Young_Age category with Na, K and drug.
Above is the multiplot of the Middle_Age category with Na, K and drug.
Above is the multiplot of the Senior_Age category with Na, K and drug.
Procedure: The following are the nodes used for this exercise with respective settings.
Append:
The appended data sets, taken in accordance with age, are exported to the output graphs.
Exercise 3
Using BASKETS1n
a) Find the association rules only for items using Apriori model with minimum support 3% and confidence
90%.
b) Compare the GRI and Apriori usage, having support 22% and confidence 90% (prepare a sample data set in a spreadsheet)
c) Determine the drugs' (DRUG4n) importance w.r.t. Age, Cholesterol and BP, and compare the C5.0 and Neural Net
d) Determine the importance of the attributes using K-Means from Drug3n and Drug4n datasets
SOLUTION 3a)
Expected output:
Procedure: The following are the nodes used for this exercise with respective settings.
Type: Using this node we read the values and type of each attribute; the non-item attributes are given the direction None, and all item-based attributes are given the direction Both (input and output) for the Apriori model.
As we execute this, the Apriori model is built; once we browse the model, we can see the resultant rules as output.
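For comparison outside Clementine, the same support/confidence thresholds can be tried with the mlxtend library (assuming it is installed; the one-hot basket matrix below is made up):

import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

baskets = pd.DataFrame({
    "dairy":     [True, True, False, True],
    "beer":      [True, False, True, True],
    "cannedveg": [True, True, True, False],
})

# Frequent itemsets at minimum support 3%, then rules at 90% confidence.
freq = apriori(baskets, min_support=0.03, use_colnames=True)
rules = association_rules(freq, metric="confidence", min_threshold=0.9)
print(rules[["antecedents", "consequents", "support", "confidence"]])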
SOLUTION 3b)
Expected output:
Output dataset:
The outputs below show that there is no difference in the rules generated by Apriori and GRI (Generalized Rule Induction), but the order is changed. In Apriori, the rules for low-level frequent itemsets are generated first, followed by the rules for the next-level frequent itemsets. In GRI, the rules are generated item by item, i.e., first the largest rule, then the smallest rule size for one item, and then the same for subsequent items.
SOLUTION 3c)
Expected output:
Output dataset:
C5.0: a 2-level decision tree is prepared with BP as the root attribute (level 1) and the Age and Cholesterol attributes at level 2.
** As a result, the Neural Net and C5.0 models give the same information.
Procedure: The following are the nodes used for this exercise with respective settings.
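Clementine's C5.0 is not freely available; as a stand-in, a depth-2 CART tree from scikit-learn illustrates the same kind of 2-level structure and attribute importance (the DRUG4n-style rows are made up):

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

drugs = pd.DataFrame({
    "Age":         [23, 47, 61, 35, 52, 29],
    "BP":          ["HIGH", "LOW", "NORMAL", "HIGH", "LOW", "NORMAL"],
    "Cholesterol": ["HIGH", "HIGH", "NORMAL", "NORMAL", "HIGH", "HIGH"],
    "Drug":        ["drugY", "drugX", "drugX", "drugY", "drugC", "drugX"],
})

# One-hot encode the categorical attributes, then fit a 2-level tree.
X = pd.get_dummies(drugs[["Age", "BP", "Cholesterol"]])
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, drugs["Drug"])

# The printed tree and the feature importances hint at which
# attributes matter most for predicting the drug.
print(export_text(tree, feature_names=list(X.columns)))
print(dict(zip(X.columns, tree.feature_importances_)))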
SOLUTION 3d)
Expected output:
Output dataset:
The following output shows which attributes are important. (The unimportant attributes are Sex and Age)
Procedure: The following are the nodes used for this exercise with respective settings.
Append
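As a stand-in for Clementine's K-Means node, here is a scikit-learn sketch on made-up numeric fields; the spread of each attribute across the cluster centres is used as a rough proxy for the attribute importance the exercise asks about:

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

data = pd.DataFrame({
    "Age": [23, 47, 61, 35, 52, 29],
    "Na":  [0.79, 0.61, 0.52, 0.83, 0.55, 0.74],
    "K":   [0.03, 0.05, 0.07, 0.02, 0.06, 0.04],
})

# Standardise, then cluster into two groups.
X = StandardScaler().fit_transform(data)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Attributes whose centres differ most between clusters separate
# the clusters best.
centres = pd.DataFrame(km.cluster_centers_, columns=data.columns)
print(centres.max() - centres.min())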