
DataStage Essentials

Server Edition
(DS324SVR)

Lab Exercises

Copyright © 2003 Ascential Software, Inc.



Table of Contents

Module 2 Exercise: Administrator
  Task 2-1: Log in to DataStage Administrator
  Task 2-2: Create a new project
  Task 2-3: Enable project properties
  Task 2-4: Set the schedule user account (Windows only)
Module 3 Exercise: Manager
  Assumptions
  Task 3-1: Log in to DataStage Manager
  Task 3-2: Create a category
  Task 3-3: Import a DataStage export file
  Task 3-4: Move objects between categories
  Task 3-5: Perform usage analysis
Module 4 Exercise: Designer
  Assumptions
  Task 4-1: Log in to DataStage Designer
  Task 4-2: Customize the Designer’s layout and palette
  Task 4-3: Create a parameter in a job
  Task 4-4: Insert and link stages
  Task 4-5: Use a parameter in a job
  Task 4-6: Save, compile, and run a job
Module 5 Exercise: Director
  Assumptions
  Task 5-1: Run a job
  Task 5-2: Monitor a job
  Task 5-3: View a job log
  Task 5-4: Schedule a job
Module 6 Exercise: ODBC/Relational Stages
  Assumptions
  Task 6-1: Establish a Windows ODBC connection
  Task 6-2: Import ODBC Table Definitions
  Task 6-3: Create a job
  Task 6-4: Set up the ODBC stage for an inner join
  Task 6-5: Set up an ODBC stage to create and load a table
  Task 6-6: Use an ODBC stage to aggregate data
Module 7 Exercise: Intelligent Assistants
  Assumptions
  Task 7-1: Prepare the job template
  Task 7-2: Make a new template from a job
  Task 7-3: Make a new job from a template
  Task 7-4: Set up the srcCustomers job
  Task 7-5: Set up the remaining src jobs
Module 8 Exercise: Hashed Files
  Assumptions
  Task 8-1: Add hashed file stages to an existing job
Module 9 Exercise: Transforming Data
  Assumptions
  Task 9-1: Create a new template from a job
  Task 9-2: Create a new job from a template
  Task 9-3: Import metadata from the hashed files
  Task 9-4: Set up the source stage
  Task 9-5: Set up the hashed file stages
  Task 9-6: Set up the constraints
  Task 9-7: Define lookup expression and output column derivations
  Task 9-8: Set up the target ODBC stage
  Task 9-9: Repeat previous steps for the trnProducts job
  Task 9-10: Create the trnOrders job
Module 10 Exercise: DataStage BASIC
  Assumptions
  Task 10-1: Create the order detail tables
Module 11 Exercise: Repository Objects
  Assumptions
  Task 11-1: Examine date formats and locate SDK transforms
  Task 11-2: Modify dates using SDK transforms
Module 12 Exercise: Performance Enhancements
  Assumptions
  Task 12-1: Compare processing times with in-process row buffering enabled
  Task 12-2: Compare processing times with inter-process row buffering enabled
  Task 12-3: Use an IPC stage to create multiple processes
  Task 12-4: Partition and collect records in a job
Module 13 Exercise: Job Sequencer
  Assumptions
  Task 13-1: Create a sequence to control the source jobs
  Task 13-2: Create a sequence to control the lookup job
  Task 13-3: Create a sequence to control the dimension table transformation jobs
  Task 13-4: Create a sequence to control the fact table transformation job
  Task 13-5: Create a sequence that controls the transformation sequences
  Task 13-6: Create the master sequence that controls all of the sequences
  Task 13-7: Run the master sequence


Module 2 Exercise: Administrator


Assumptions:
 DataStage client is installed on the student’s PC.

Task 2-1: Log in to DataStage Administrator


Click Start, point to Programs, then point to Ascential DataStage, and click
DataStage Administrator. Complete the form as follows:

If the DataStage server is installed on UNIX:


1. Enter the server name or IP address provided by the instructor as the
Host System.
2. Enter “dsadm” as User.
3. Enter a password as provided by the instructor.

If the DataStage server is installed on Windows:


1. Enter “localhost” as the Host System.
2. Enter an account name that has administrator rights as User, which will be
provided by the instructor.
3. Enter a password as provided by the instructor.

Task 2-2: Create a new project


Create a new project named DS324.

1. On the Projects page click Add.


2. Enter DS324 as the new project Name.
3. For Location:
If the DataStage server is installed on UNIX:
Enter /app/dstage/projects/DS324.

If the DataStage server is installed on Windows:


Accept the default directory.

4. Click OK.

Task 2-3: Enable project properties


Enable the following properties: Enable job administration in Director and
Auto-purge of job log. Set the Auto-purge action to keep up to the previous 3
job runs.

1. On the Projects page select the DS324 project.


2. Click Properties.
3. Check Enable job administration in Director and Auto-purge of job log.
4. Click the Up to previous radio button in the Auto-purge action box.
5. Increase the job run(s) to 3.


Task 2-4: Set the schedule user account (Windows only)


Set the schedule user account as a user with local administrator privileges.

1. Click the Schedule tab.


2. Enter a valid username as provided by the instructor in the Username box.
3. Enter the password for that user in the Password box.
4. Click the Test button.
5. If the account fails the validation, select another username and repeat steps 1-4.
If the account validates successfully, move to the next step.
6. Click OK.
7. Click OK.
8. Click Close to close DataStage Administrator.


Module 3 Exercise: Manager


Assumptions:
 The DS324 project exists.
 Student has a work directory on the DataStage server with the data files in it.

Task 3-1: Log in to DataStage Manager


Click Start, point to Programs, then point to Ascential DataStage, and click
DataStage Manager. Log in with the student account information provided by the
instructor and select the DS324 project from the Project drop-down list.

Task 3-2: Create a category


Create the DS324 Warehouse category under the Jobs branch.

1. Select Jobs, right click, and choose New Category…


2. Enter DS324 Warehouse as the Current category.
3. Click OK.

Task 3-3: Import a DataStage export file


Import the DS324SvrProjectFiles.dsx export file into the project.

1. Click Import from the main toolbar and choose DataStage Components.
2. In the Import from file: box, browse to the location of the
DS324SvrProjectFiles.dsx file. It will be located in the directory where the data
files were copied.
3. Make sure that the Import All radio button is selected and the Perform
Usage Analysis box is checked.

Task 3-4: Move objects between categories


Move jobs into the required category. Main jobs will be moved into the DS324
Warehouse category, while a utility job will be moved into the Warehouse Utilities
category.

1. Select the Jobs branch on the left hand pane of the window.
2. Select the ClearWarehouseTables job from the right hand pane.
3. Drag the job on top of the Warehouse Utilities category in the left hand pane.
4. Select all of the remaining jobs on the right hand pane.
5. Drag the selection to DS324 Warehouse category on the left hand pane.

Task 3-5: Perform usage analysis


Use the Usage Analysis tool to see which objects are referenced by the Employees
table definition.

1. Expand the Table Definitions branch on the left hand pane.


2. Expand the Sequential category and then the DS324Files category.


3. Right click on the Employees table definition on the right hand pane and click
Usage Analysis.
4. Which objects are referenced by this table definition?

5. ___lkpEmployeeBonus___________________________________

___seqRecentEmployees__________________________________


Module 4 Exercise: Designer


Assumptions:
 Student has imported DS324SvrProjectFiles.dsx into the DS324 project.

Task 4-1: Log in to DataStage Designer


You can log in to the Designer with the same user information that was used in
the previous exercise, or you can open the Designer from the Tools menu of the
Manager.

1. Click Start, point to Programs, then point to Ascential DataStage, and


click DataStage Designer.
2. Supply the username and password of the account that was used in the
previous exercise.
To open the Designer from the Manager:
1. Click Tools from the main menu and click Run Designer.

Task 4-2: Customize the Designer’s layout and palette


Customize the Designer’s layout and palette by arranging the toolbars and windows
and moving some frequently accessed stages to the Favorites group in the palette.

1. Create a new Server job to open all of the display windows.


2. Arrange the windows and toolbars in the display area of the Designer as
necessary. Remember that many of the windows within the Designer are
floating and can be docked to either side, the top, or the bottom. Be sure
to leave the Repository and Palette windows open.
3. Right click in the Palette, point to Customization, and click Customize.
4. In Repository Items, located in the upper left pane, expand Stage Types,
expand Server, and then expand Database.
5. From the Database category, drag the ODBC stage into the Favorites
category of Current Palette groups and shortcut items in the right
panel. Contract the Database category.
6. From the File category under Server of the upper left hand pane, drag the
Sequential File stage and then the Hashed File stage into the Favorites
category of the right hand pane. Contract the File category.
7. From the All category of the upper left hand pane locate the Transformer
stage and drag it into the Favorites category of the right hand pane.

Task 4-3: Create a parameter in a job


Create a parameter in the seqRecentEmployees job.

1. Click on the Open icon on the Standard Toolbar.


2. Open the seqRecentEmployees job from within the DS324 Warehouse
category.
3. Click the Job Properties icon on the Standard Toolbar, or click Job
Properties on the Edit menu.
4. Click on the Parameters tab.
5. Click into the Parameter name field of the next blank row and enter
FILE_DIR.
6. Enter an appropriate prompt in the Prompt field.


7. Select Pathname from the Type drop-down list.


8. Enter a value in the Default Value field as provided by the instructor.
9. Click OK.

Task 4-4: Insert and link stages


Insert a Sequential File stage into the job as the target stage and reattach
the hanging link.

1. Click on the Favorites category in the palette.


2. Drag the Sequential File stage on to the canvas locating it over the output
link from the Transformer. Ensure that the link changes from red to black
indicating that the link has been attached. If the link stays red, drag the end
of the link on to the Sequential file stage.
3. Right click on the new stage and choose Rename.
4. Change the stage name to seqRecentEmployees.
5. The job should look like the screen shot in Fig. 4-1 at this point.

Fig. 4-1

Task 4-5: Use a parameter in a job


Insert the FILE_DIR parameter into the source stage. Load the Employees table
definition into the columns tab. View the data to verify that the file exists and
parameters and metadata are correct.

1. Double-click on the seqEmployees source stage.


2. Click on the Outputs tab.


3. Replace <enter parameter for file path here> with #FILE_DIR# in the
File name field.
4. Click on the Columns tab.
5. Click Load.
6. Expand DS324Files under the Sequential branch and select the
Employees table definition.
7. Click OK.
8. Click View Data. If an error message appears, note the error and check the
setup of the stage. Then try to view data again.
9. Click OK to accept the parameter values.
10. Verify the data is readable.
11. Click Close.
12. Click OK to return to the stage properties.
13. Click OK to accept the changes to the stage and return to the canvas.

Use the FILE_DIR parameter to set up the target stage.

1. Double-click on the seqRecentEmployees sequential stage.


2. On the Input tab, enter #FILE_DIR#\RecentEmployees.txt as the File
Name. (If the DataStage server is installed on UNIX, use a forward slash:
#FILE_DIR#/RecentEmployees.txt.)
3. Click OK.

Task 4-6: Save, compile, and run a job


Save the job and compile it. Upon successful compilation, run the job from within
the Designer.

1. Click the Save button on the Standard toolbar.


2. Click the Compile button on the Standard toolbar. If an error appears, note
the error and check the setup of the job to resolve any issues.
3. When the compilation is completed successfully click Close.
4. Click the Run button on the Standard toolbar.
5. Click Run from the Job Run Options dialog box.
6. When the job run has finished, note below how many records were sent to
the target stage.

___________________________


Module 5 Exercise: Director


Assumptions:
 The DS324 project has been created.
 The DS324SvrProjectFiles.dsx file has been imported into the DS324 project.

Task 5-1: Run a job


From the Tools menu in the Designer, log in to the Director. Locate and run
the hashCreateJOBS job.

1. Click Tools from the main menu and select Run Director.
2. Select hashCreateJOBS job in the right hand pane.
3. Click the Run button from the Standard Tool Bar.
4. Supply any empty parameter values.
5. Click Run in the Job Run Options dialog box.

Task 5-2: Monitor a job


Open a monitor and note the number of records that were loaded into the target file.

1. Select hashCreateJOBS.
2. Click Tools from the main menu and select New Monitor.
3. Expand xFilter and note the number of records that were loaded into the
target file.

_____________________
4. Click Close.

Task 5-3: View a job log


Open the log for the job and locate the entry detailing the creation of the hashed
file.

1. Click on the Log icon of the Standard Toolbar.


2. Locate the event detail that references the creation of a file.
3. Note the name of the hashed file and what type of file it is.

_____________________

Task 5-4: Schedule a job


Return to the status view and locate the lkpEmployeeBonus job. Schedule the
job to run in 30 minutes. Supply any necessary parameter values.

1. Click on the Status button of the Standard Toolbar.


2. Right click on lkpEmployeeBonus and select Add to Schedule.
3. Check Today for the Run Job type.
4. Set the Time to 30 minutes from now.
5. Click OK.


6. Supply any necessary parameter values.


7. Click Schedule.


Module 6 Exercise: ODBC/Relational Stages


Assumptions:
 The DS324.mdb Access database exists in the DS324Files directory.
 The Employees and Jobs tables are already loaded into the DS324 database.

Task 6-1: Establish a Windows ODBC connection


Create a data source name and define the ODBC connection to the DS324 database.
This DSN will be used for this and future labs.

1. Click the Start button on the Windows taskbar and click Run…
2. In the Open box, type odbcad32.exe.
3. Click OK. The ODBC Data Source Administrator dialog box appears.
4. Click on the System DSN tab.
5. Click Add…
6. Select the Microsoft Access Driver from the list and click Finish.
7. Enter DS324 as the Data Source Name.
8. Click Select…
9. Browse to the DS324Files directory and select ds324.mdb as the database.
10. Click OK to accept the new database.
11. Click OK to accept the new data source.
12. Click OK to close the ODBC Administrator.

Task 6-2: Import ODBC Table Definitions


Import metadata from the Employees and Jobs tables in the DS324 database into the
project repository as table definitions.

1. Log in to DataStage Designer.


2. Right click on Table Definitions in the Repository, point to Import and
click on ODBC Table Definitions.
3. Select DS324 from the DSN drop down list.
4. Click OK on the Import Meta Data (ODBC) window.
5. Select all of the tables from the list.
6. Click Import to create the table definitions and be returned to the Designer
window.

Task 6-3: Create a job


Create a job to select recent employees based on the same criteria (hire date) as in
the previous lab.

1. Click the New icon on the Standard Toolbar to create a new object.
2. Select New Server Job.
3. Click OK.
4. Click on the Favorites category of the Palette.
5. Drag the ODBC stage to the canvas twice, once for a source and once for a
target.
6. Drag the Transformer stage to the canvas.


7. Either use the Link object in the General category of the Palette, or right
click on the source stage and drag a link to the Transformer. Then create a
link from the Transformer to the target stage.
8. Label stages and links as below in Fig. 6-1:

Fig. 6-1

9. Click the Save button on the Standard Toolbar.


10. Save the job as odbcEmployeeBonus.


Task 6-4: Set up the ODBC stage for an inner join

1. Double-click on the odbcRecentEmployees stage to open the Properties
page.
2. Select DS324 from the Data source name drop down list.
3. Click on Get SQLInfo.
4. Click on the Outputs tab.
5. Select Employees from the Available tables drop down list and click Add.
6. Select Jobs from the Available tables drop down list and click Add.
7. Click the Columns tab.
8. Click Load.
9. Select the Employees table definition.
10. Click OK.
11. Select the job_id, job_lvl, pub_id, and hire_date columns from the Selected
columns box.
12. Click the Remove columns button (<).
13. Click OK.
14. Double-click in the row below lname in the Column name field.
15. Enter the following information for the new column’s metadata:
Column Name: bonus
Derivation: max_lvl * 10
SQL Type: Integer
Length: 10
16. Click the Selection tab.
17. Enter Employees.Job_ID = Jobs.Job_ID and Employees.Hire_Date>='90-01-01'
in the Where clause field.
18. Click View Data to verify data.
19. Click Close.
20. Click OK.
21. Right click on xfmMap, point to Propagate Columns and select
RecentEmployeesIn.
22. Click on RecentEmployeesOut to map columns from the input to the output
of the transformer stage.

Task 6-5: Set up an ODBC stage to create and load a table

1. Double click on the odbcREC_EMP_BONUS target stage.


2. Select DS324 from the Data source name field.
3. Click Get SQLInfo.
4. Click on the Inputs tab.
5. Enter REC_EMP_BONUS in the Table name field.
6. From the Update action drop-down list, select Clear table then insert
rows.
7. Check Create table in target database.
8. Click the Edit DDL tab.
9. Click Create DDL, which generates the CREATE statement.
10. Click on the Transaction Handling tab.
11. Set Rows per transaction to 25.
12. Click OK.
13. Click the Save button from the Standard Toolbar.
14. Click the Compile button.


15. Click the Run button.


16. Click Run on the Job Run Options window.
17. Verify that the number of records loaded is the same number of records
loaded in Exercise 5.
18. When the job has successfully completed, import the metadata from the new
table as a Table Definition in your repository.

Task 6-6: Use an ODBC stage to aggregate data


Use the record set (REC_EMP_BONUS table) to aggregate the total dollar amount of
the bonuses and average dollar amount of bonus per employee and load records
into another table (REC_EMP_BONUS_TOTALS).

1. Create a new server job.


2. Insert two ODBC stages on the canvas and link them together.
3. Label the stages and links as shown below in Fig. 6-2.

Fig. 6-2

4. Double-click on the odbcREC_EMP_BONUS stage to open the Properties page.


5. Select DS324 from the Data source name drop down list on the Stage
page.
6. Click on Get SQLInfo.
7. Click on the Outputs tab.
8. On the General page, select REC_EMP_BONUS from the Available tables
drop down list and click Add.
9. Click the Columns tab.
10. Create two new columns, entering values described below:
Column name: total_bonus
Derivation: sum(bonus)
SQL type: Double
Length: 15

Column name: avg_bonus
Derivation: avg(bonus)
SQL type: Double
Length: 15

11. Click OK.


12. Double click on the odbcREC_EMP_BONUS_TOTALS target stage.
13. Select DS324 from the Data source name field.
14. Click Get SQLInfo.
15. Click on the Inputs tab.
16. Enter REC_EMP_BONUS_TOTALS in the Table name field.
17. Select Insert rows without clearing from the Update action field.
18. Check Create table in target database.
19. Click the Edit DDL tab.
20. Click Create DDL, which populates the CREATE statement field.
21. Click OK.
22. Click the Save button from the Standard Toolbar.
23. Name the job odbcEmployeeBonusTotals.
24. Click the Compile button.
25. Click the Run button.
26. Click Run on the Job Run Options window.
27. Note the values generated.

_________________________________________________


Module 7 Exercise: Intelligent Assistants


Assumptions:
 The srcTemplate job exists in the DS324 project.
 The raw data files are located in the \DS324Files\Raw directory.
 The \DS324Files\Source directory exists.

Task 7-1: Prepare the job template


Log in to the Designer and open the srcTemplate job. Examine the layout of the
job. Locate the source files that will be used and examine the structure of
each file. Modify the layout in each of the sequential file stages to match the
layout of the source files. Save and close the job.

1. View the raw data files on the operating system and note the layout of
the files.
2. If not already logged into the DataStage Designer, login.
3. Open the srcTemplate job.
4. Open the source Sequential stage.
5. Go to the Outputs tab.
6. Modify the format to match the file layout.
7. Click Save on the Standard Toolbar.
8. Close the job.

Task 7-2: Make a new template from a job


Use the Intelligent Assistant to create a template from a job. The template will be
used later to create new ‘skeleton’ jobs.

1. From the File menu, click New.


2. Select New Template from Job.
3. Use the srcTemplate job as the job to create a template from.
4. Name the new template Source Job Template and create it in the
Template Category of DS324 Warehouse Templates.
5. Optionally, enter a description for the template.

Task 7-3: Make a new job from a template


Use the Intelligent Assistant to create new jobs from the Source Job Template.
The jobs will serve as the preliminary jobs in the warehouse load sequence.

1. From the File menu, click New.


2. Select New Job from Template.
3. Select Source Job Template.
4. Name the new job srcCustomers with a Job Category of DS324
Warehouse\SRC.

Task 7-4: Set up the srcCustomers job


Fill in the missing properties and values from the newly created srcCustomers job.


1. Modify the stage names as shown in Fig. 7-1.

Fig. 7-1

2. Open the properties page of the source stage.


3. Click on the Outputs tab.
4. Replace filename with the raw data file name
(CUSTOMERS.980501.D01).
5. Load the CUSTOMERS_DETAIL table definition from the COBOL FD
category on the Columns page.
6. Close the properties page.
7. Right click on xfmFilter Transformer, point to Propagate columns,
point to the input link, and click on the output link.
8. Double-click target stage to setup the target stage’s properties.
9. Replace filename with the source data file name (Customers.src).
10. Leave the format of the file as it was set from the template. The file
should be written out with a pipe (|) delimiter and no quote character
(000).
11. Load the Customers table definition from the DS324Files category.
12. Close the properties page.
13. Open the xfmMap Transformer and click on the Column Auto-Match
button. Choose the location match. Verify that all of the column names
match up correctly.
14. Remember that since the source records are fixed-width, they must be
trimmed in order to load into the table correctly. In each of the
derivation cells of the output columns, apply the Trim function to the
selected input column (see the example after this list). The derivation
will look similar to this:
Trim(link_name.column_name)
15. Save, compile and run the job.
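For example, if the input link into xfmMap were named CustomersIn (the link
name and the CompanyName column here are illustrative assumptions; use the
names in your own job), two of the derivations would read:

   Trim(CustomersIn.CustomerID)
   Trim(CustomersIn.CompanyName)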

Task 7-5: Set up the remaining src jobs


Repeat the above process for the remaining two source streams, products and orders.
The file names and table definitions for the jobs are as follows:

Job: srcProducts
Raw data directory: \DS324Files\Raw


Raw data file name: PRODUCTS.980501.D01


Table definition: COBOL FD\PRODUCT_DETAIL
Source file directory: \DS324Files\Source
Source file name: Products.src
Table definition: Products

Job: srcOrders
Raw data directory: \DS324Files\Raw
Raw data file name: ORDERS.980506.D01
Table definition: COBOL FD\ORDER_DETAIL
Source file directory: \DS324Files\Source
Source file name: Orders.src
Table definition: DS324Files\Orders


Module 8 Exercise: Hashed Files


Assumptions:
 lkpSkeyLoad job exists in the project.

Task 8-1: Add hashed file stages to an existing job


Open the lkpSkeyLoad job. Modify the job to add hashed file stages for each of the
hanging links. Set the hashed file stages to create the file. Use parameters for the
project and file name.

1. Open the lkpSkeyLoad job.


2. Drag a Hashed File stage from the Favorites group and attach one to
each of the hanging links.
3. Rename all hashed file stages to HA_SKEY_{feed name}. The job should look
like Fig. 8-1.

Fig. 8-1

4. Open one of the hashed file stages.


5. On the Input tab verify that the Create file box is checked.
6. Verify that the hashed file name matches the stage name.


7. Verify on the Options page that the file type to be created is Type
30 (Dynamic).
8. Verify that write caching is enabled.
9. Verify that a key column has been selected and is valid.
10. Repeat steps 4-9 for the other hashed files.
11. Save, compile, and run the job.
12. Verify that no records were loaded into any of the hashed files, but that
the files exist. You can confirm that a file was created by right clicking
on the appropriate hashed file stage and clicking on View <link name> data.
If the file exists but has no records, you will see ‘Data source empty.’
This is perfectly acceptable. If the file does not exist, an error message
will appear noting that the engine could not locate the file name.


Module 9 Exercise: Transforming Data


Assumptions:
 The trnTemplate job exists in the project.

Task 9-1: Create a new template from a job


Create a template that will be used as a skeleton for building the transformation
jobs.

1. Create a new Template from job.


2. Use the trnTemplate job and name the template trnTemplate. Place the
template in the DS324 Warehouse Templates category.

Task 9-2: Create a new job from a template

1. Create the trnCustomers and trnProducts jobs from the trnTemplate.


2. Create the jobs in the DS324 Warehouse\TRN category.
3. Rename stages and links accordingly. The trnCustomers job is shown in
Fig. 9-1.

Fig. 9-1


Task 9-3: Import metadata from the hashed files


Before setting up the hashed file stages, import the metadata from the hashed
files themselves. Hashed file names, column names, and key structures are easy
to mistype, which can cause the data to be read incorrectly or not at all.

1. Right click on the Table Definitions branch in the repository window to


invoke the shortcut menu.
2. From the Import menu select UniVerse File Definitions…
3. From the DS324 project, select all of the surrogate key hashed files that
were created earlier.

Task 9-4: Set up the source stage

1. In the source stage of the trnCustomers job, load the Customers table
definition. Modify the format in the stage if necessary.
2. View data providing an appropriate Source file directory if one is not
present.
3. Close the Properties page.

Task 9-5: Set up the hashed file stages

1. Use the drop-down lists to select the project name and hashed file name in
each hashed file stage.
2. Load the HA_SKEY_CUSTOMERS table definition into both hashed file
stages.

Task 9-6: Set up the constraints


Usually, in order to send on only those records that match in the lookup file,
we would set the constraint to filter out the unmatched records and reject
them. In this case, records are inserted into the table when a match does not
occur and updated in the table when a match does occur. We also send records
to the target hashed file when a match does not occur; this allows the hashed
file to be updated with new records before the next record is read into the
Transformer.

1. Open the Transformer.


2. Click the Constraints button on the Standard Toolbar.
3. Open Expression editor by double clicking in the Constraint cell of the
insert link.
4. Click the browse button to bring up the Suggest Operand Menu.
5. Select Function from the Operand Menu.
6. Expand the Null handling category and select IsNull.
7. Click the browse button to bring up the Suggest Operand Menu.
8. Select Input columns from the menu.
9. Expand the lkpHA_SKEY_CUSTOMERS link and select the key column of
CustomerID.
10. Press enter to accept the constraint.
11. Repeat steps 3-10 for the rejHA_SKEY_CUSTOMERS link.
12. Repeat steps 3-10 for the update link with an extra Operand of Not. The
constraints should look like those in Fig. 9-2.


Fig. 9-2
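In case the figure does not reproduce well, the three constraints should read
approximately as follows (BASIC expressions using this job's link and column
names; a sketch of the figure's contents, not a screenshot):

   insert:                IsNull(lkpHA_SKEY_CUSTOMERS.CustomerID)
   rejHA_SKEY_CUSTOMERS:  IsNull(lkpHA_SKEY_CUSTOMERS.CustomerID)
   update:                Not(IsNull(lkpHA_SKEY_CUSTOMERS.CustomerID))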

13. Click OK to close the Constraints window.

Task 9-7: Define lookup expression and output column derivations

1. Drag the CustomerID column of srcIn to the Key Expression field of the
CustomerID column in the lkpHA_SKEY_CUSTOMERS link.
2. Click into the insert link and click the Load Column Definitions button
from the Standard Toolbar.
3. Select the ODBC\DS324\Customers table definition and click OK.
4. Repeat steps 2 and 3 for the update link.
5. Select the Auto-match button from the Standard Toolbar.
6. Select srcIn for the Input link and insert for the Output link, then click OK.
7. Repeat steps 5 and 6 choosing update for the Output link.
8. Repeat steps 5 and 6 choosing rejHA_SKEY_CUSTOMERS for the Output
link.
9. Open the Expression editor by double clicking in the Derivation cell of the
CustomerKey column of the insert link.
10. Click the browse button to bring up the Suggest Operand Menu.
11. Select System Variable.
12. Select @INROWNUM from the list of system variables.
13. Repeat steps 9-12 for the rejHA_SKEY_CUSTOMERS link.
14. Drag the CustomerKey from lkpHA_SKEY_CUSTOMERS to the Derivation
field of the column CustomerKey in the update link.
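Summarizing the surrogate key derivations set in steps 9-14 (a sketch):

   insert.CustomerKey:                @INROWNUM
   rejHA_SKEY_CUSTOMERS.CustomerKey:  @INROWNUM
   update.CustomerKey:                lkpHA_SKEY_CUSTOMERS.CustomerKey

@INROWNUM is the number of the input row being processed, so unmatched records
receive a fresh key while matched records keep the key found in the lookup.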


Task 9-8: Set up the target ODBC stage


In this step, we define the properties in the ODBC stage for the insert and
update actions. Ascential Software's recommendation is to write the inserts to
a sequential file and load them with a bulk loader, using a separate stage for
the updates only. Because class time is short, we will instead perform the
inserts and updates on separate links directly against the table.

1. Double-click on the ODBC stage.


2. Click the Get SQLInfo button.
3. Click the Inputs tab.
4. Select insert from the Input name drop-down list.
5. Enter Customers into the Table name field.
6. Select update from the Input name drop-down list.
7. Enter Customers into the Table name field.
8. Close Properties.
9. Save, compile, & run the job.
10. Record number of records inserted and updated.

_____91____ Inserted

_______0__ Updated

Task 9-9: Repeat previous steps for the trnProducts job


Repeat Task 9-4 through Task 9-8 for the trnProducts job that was created from
the template.

Task 9-10: Create the trnOrders job


Create one more job from the trnTemplate template, saving it as trnOrders. Add
another transformer to the job to perform the record validation step. Move the
link between the source stage and the original transformer onto the new
transformer, then create a new link between the two transformers. Add the three
dimension lookup files to validate the incoming records and reject any invalid
records to a separate file (as shown in Fig. 9-3). Perform the necessary
business rule transformations to the records and
pass them to the next transformer stage. Perform the surrogate key validation
(lookup into the HA_SKEY_ORDERS hashed file), setting up the constraints to either
insert or update the record.

1. Create trnOrders job from the trnTemplate.


2. Detach the link from the source file to the transformer at the transformer
end.
3. Add a Transformer and rename it to xfmFilter.
4. Attach the existing link to xfmFilter.
5. Add a link from the new transformer to the original transformer and name it
ValidOrdersOut.
6. Add three Hashed File stages as reference lookups, and one Sequential File
stage as an output.
7. Link and rename all components as shown in Fig. 9-3.


Fig. 9-3

8. Select the project and hashed file name from the drop-down lists and load the
table definitions for the new hashed file stages (HA_SKEY_CUSTOMERS,
HA_SKEY_PRODUCTS, HA_SKEY_TIMES).
9. Set up the HA_SKEY_ORDERS and HA_SKEY_ORDERS2 hashed file stages
and the ODBC target stage as you did in the trnCustomers and trnProducts
jobs.
10. Load the Orders sequential file table definition for the rejects output file stage
and the source input file stage. In addition to loading the table definition in
the source file stage, you must also modify the file name and format of the
file. Removing the link from the stage removed the previously saved
information.
11. Open the xfmFilter transformer.
12. Set up the lookup expressions by mapping the columns from the stream input
link that correspond to the key columns in the hashed file lookup links.
13. Set the constraints to reflect a valid lookup in each hashed file stage.
Don't forget to connect the tests with an ‘And’, not an ‘Or’ (a sketch
follows the expression in step 14).
14. Use the ODBC\Orders table definition for the columns in the
ValidOrdersOut link. All of the columns will be used except the OrderKey
column, which you can delete. The remaining columns each have a
corresponding column in the input links, with the exception of the
SalesAmount column. The derivation for that column comes from the
Quantity, UnitPrice, and Discount columns. It should look similar to the
following expression:


(Quantity * UnitPrice) * (1 - Discount)
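For the xfmFilter constraint in step 13, the combined test would look roughly
like this (the lookup link names and the key column names in the PRODUCTS and
TIMES hashed files are assumptions; check the table definitions you loaded):

   Not(IsNull(lkpHA_SKEY_CUSTOMERS.CustomerID)) And
   Not(IsNull(lkpHA_SKEY_PRODUCTS.ProductID)) And
   Not(IsNull(lkpHA_SKEY_TIMES.TimeID))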


15. Open the xfmMap transformer.
16. Map the column from the input link that corresponds with the key column in
the hashed file lookup link.
17. Set the constraints as you did in the previous transformation jobs.
Remember that records are inserted when a match does not occur, and
records are updated when a match does occur.
18. Save, compile and run the job.
19. Record number of records inserted and updated.

_________ Inserted

_________ Updated


Module 10 Exercise: DataStage BASIC


Assumptions:
 Orders.src file has been created from a previous job run.

Task 10-1: Create the order detail tables


Build a job that uses the Orders.src file to load shipped order detail records into the
SHIPPED_ORDERS table and unshipped orders to the UNSHIPPED_ORDERS table.
Only send records to the SHIPPED_ORDERS table that have a value in the
ShippedDate column.

1. Create a new server job.


2. Insert a sequential file stage for the source file, a transformer, and an ODBC
stage as the target table. Link the stages together and label the links and
stages as shown in Fig. 10-1. Save the job as odbcOrderDetails.

Fig. 10-1

3. Define the source file stage using Orders.src as the file name. Also define
the format of the file and load the column definitions.
4. Open the xfmOrderDetails transformer and create the following columns in
the ShippedOrdersOut output link:

Target Column    Data Type      Derivation
OrderID          Integer / Key  Straight map
OrderDate        Timestamp      OrderDate transformed to ODBC format
OrderYear        Char(4)        Extracted year in format "YYYY"
OrderQuarter     Integer        Extracted quarter as an integer 1 to 4
RequiredDate     Timestamp      RequiredDate transformed to ODBC format
ShippedDate      Timestamp      ShippedDate transformed to ODBC format
DaysToShip       Integer        Number of days between order date and shipped date
DaysToRequired   Integer        Number of days between required date and shipped date
WasLate          Char(3)        Was it shipped after the required date? "Yes" or "No"

5. Create three new stage variables to hold the date values. Create derivations
for each of the dates that convert the date strings into the internal day
number format (a sketch follows this list).
6. Define a constraint to send records with a valid ShippedDate to the
ShippedOrdersOut link. Reject any records that do not have a ship date.
7. Utilize the stage variables to create the proper derivations for each of the
output columns based on the rules listed in the table above.
8. Map all of the input columns directly to the UnshippedOrdersOut link.
These records can be processed by another job.
9. Name the output tables in the ODBC stage SHIPPED_ORDERS and
UNSHIPPED_ORDERS. On each input link in the ODBC stage, have the stage
create the table before the records are loaded.
10. Save, compile, & run the job.
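A sketch of the stage variables and derivations in DataStage BASIC. The input
link name (OrdersIn) and the "D/YMD[4,2,2]" format code are assumptions; use
the link name in your job and whichever D conversion code matches the raw date
format you see in Orders.src:

   svOrderDate    = Iconv(OrdersIn.OrderDate, "D/YMD[4,2,2]")     ;* internal day number
   svRequiredDate = Iconv(OrdersIn.RequiredDate, "D/YMD[4,2,2]")
   svShippedDate  = Iconv(OrdersIn.ShippedDate, "D/YMD[4,2,2]")

   OrderDate    = Oconv(svOrderDate, "D-YMD[4,2,2]")    ;* ODBC-style YYYY-MM-DD
   OrderYear    = Oconv(svOrderDate, "DY")              ;* four-digit year
   OrderQuarter = Oconv(svOrderDate, "DQ")              ;* quarter, 1 to 4
   DaysToShip   = svShippedDate - svOrderDate           ;* day numbers subtract directly
   WasLate      = If svShippedDate > svRequiredDate Then "Yes" Else "No"

A constraint such as Len(Trim(OrdersIn.ShippedDate)) > 0 on the
ShippedOrdersOut link sends only shipped records down that link.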


Module 11 Exercise: Repository Objects


Assumptions:
 The odbcOrderDetails job has been created.

Task 11-1: Examine date formats and locate SDK transforms

Examine the Orders.src file and verify the format of the three dates. Locate
transforms in the SDK that can convert the dates into a proper format for
loading into a table.

1. Open the Orders.src file in a text editor or a data browser within a job.
2. Note the format of the three date columns:
OrderDate ________________________
RequiredDate _____________________
ShippedDate _______________________

3. Expand the Transforms branch of the repository. Expand the SDK category
and browse through the Date categories. Locate possible transforms that
can read dates in the form of the three dates above. Pay particular attention
to transforms that can output a date in an ODBC format. Note the transforms
that may be suitable for the dates.
OrderDate ________________________
RequiredDate _____________________
ShippedDate _______________________
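For orientation, the SDK date transforms are themselves small DataStage BASIC
expressions built on Iconv and Oconv. A transform that reformats a date string
into ODBC format does roughly the following (a sketch of the pattern, not the
literal SDK source; Arg1 stands for the transform's input argument, and the
input format code is an assumption):

   Oconv(Iconv(Arg1, "D/MDY[2,2,2]"), "D-YMD[4,2,2]")

Opening a transform's definition in the Manager shows the exact expression and
the input format it expects.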

Task 11-2: Modify dates using SDK transforms


Open the odbcOrderDetails job and modify the dates in the transformer stage
using the appropriate SDK transforms.

1. Open the odbcOrderDetails job and save it as odbcOrderDetailsSDK.


2. Modify the existing derivations for the dates and apply the appropriate SDK
transforms.
3. Save, compile and run the job.


Module 12 Exercise: Performance Enhancements


Assumptions:
 trnProducts and trnOrders jobs exist.

Task 12-1: Compare processing times with in-process row buffering enabled

Open the trnProducts job and enable in-process row buffering. Save the job as
trnProductsRowBuff. Run the job and compare the processing times between the
jobs.

1. Open the trnProducts job and save it as trnProductsRowBuff.


2. On the Job Properties page, go to the Performance page.
3. Uncheck Use Project Defaults and check Enable Row Buffer.
4. Select In Process.
5. Save, compile and run the job.
6. Compare the run times of this job with the run times of the trnProducts job.

Task 12-2: Compare processing times with inter-process row buffering enabled

Open the trnOrders job and enable inter-process row buffering. Save the job as
trnOrdersRowBuff. Run the job and compare the processing times between the
jobs.

1. Open the trnOrders job and save it as trnOrdersRowBuff.


2. On the Job Properties page, go to the Performance page.
3. Uncheck Use Project Defaults and check Enable Row Buffer.
4. Select Inter Process.
5. Save, compile and run the job.
6. Compare the run times of this job with the run times of the trnOrders job.

Task 12-3: Use an IPC stage to create multiple processes


Using the trnOrdersRowBuff job, insert an IPC stage between the two transformer
stages. Save the job as trnOrdersIPC. Run the job and compare the processing
times between the two jobs (row buffering and IPC).

1. Open the trnOrdersRowBuff job and save it as trnOrdersIPC.


2. Drag an IPC stage on to the canvas and place it between the two transformer
stages.
3. Drag the end of the ValidOrdersOut link off of the xfmMap stage and drop
it onto the IPC stage. Make a new link from the IPC stage to the xfmMap
stage.
4. Rename the new stage and links as shown in Fig. 12-1.


Fig. 12-1

5. Open the properties page of the IPC stage and go to the Outputs page.
6. On the columns page, load the ODBC\Orders table definition.
7. Open the xfmMap stage and re-map the columns if necessary.
8. Save, compile and run the job.
9. Compare the run times of this job with the run times of the
trnOrdersRowBuff job.

Task 12-4: Partition and collect records in a job


Using the trnOrdersRowBuff job, insert a link partitioner, a link collector,
and two more transformers. Connect the necessary links and propagate columns
along all links so that column metadata exists on all links. Set the
partitioner and collector methods to Round Robin. Save the job as
trnOrdersPart. Run the job and compare the processing times between the two
jobs (trnOrdersIPC and trnOrdersPart).

1. Open the trnOrdersRowBuff job and save it as trnOrdersPart.


2. Drag a link partitioner stage, a link collector stage, and two transformers onto
the canvas and place them between the two transformer stages. You may
need to zoom out to be able to see everything on the canvas.
3. Drag the end of the ValidOrdersOut link off of the xfmMap stage and drop
it onto the link partitioner stage. Make two new links from the link partitioner


to the transformer stages. Link the transformers to the link collector stage.
Make a new link from the link collector stage to the xfmMap stage.
4. Rename the new stages and links as shown in Fig. 12-2.

Fig. 12-2

5. Beginning with the link partitioner stage, propagate the columns along each
of the links. Finish propagating columns with the link collector stage to the
xfmMap stage.
6. Open the properties page of the link partitioner stage and go to the Outputs
page.
7. On the columns page, load the ODBC\Orders table definition.
8. Open the xfmMap stage and re-map the columns if necessary.
9. Save, compile and run the job.
10. Compare the run times of this job with the run times of the
trnOrdersRowBuff job.


Module 13 Exercise: Job Sequencer


Assumptions:
 All processing jobs necessary for the warehouse load are created in the DS324
Warehouse category.

Task 13-1: Create a sequence to control the source jobs


Create a new sequence (seqSrc) for the source jobs (srcProducts, srcOrders,
srcCustomers, and srcTimes). Set up the necessary parameters in the sequence
and pass them directly to each of the jobs.

Fig. 13-1

Task 13-2: Create a sequence to control the lookup job


Create a new sequence (seqLkp) for the lookup job (lkpSkeyLoad). Set up the
necessary parameters in the sequence and pass them directly to the job.


Fig. 13-2

Task 13-3: Create a sequence to control the dimension table transformation jobs

Create a new sequence (seqTrnDim) for the dimension transformation jobs
(trnProducts, trnCustomers, and trnTimes). Set up the necessary parameters in
the sequence and pass them directly to each of the jobs.


Fig. 13-3

Task 13-4: Create a sequence to control the fact table transformation job

Create a new sequence (seqTrnFact) for the fact transformation job (trnOrders).
Set up the necessary parameters in the sequence and pass them directly to the job.

Fig. 13-4


Task 13-5: Create a sequence that controls the transformation sequences

Create a new sequence (seqTrn) for the transformation sequences (seqTrnDim and
seqTrnFact). Insert the job activity stages directly from the repository. Set
up the parameters in the sequence and pass them directly to each of the
sequences.

Fig. 13-5

Task 13-6: Create the master sequence that controls all of the sequences

Create a new sequence (seqMaster) for all of the sequences (seqSrc, seqLkp,
and seqTrn). Set up the necessary parameters in the sequence and pass them
directly to each of the sequences.

Task 13-7: Run the master sequence


Run the ClearWarehouseTables utility job to clear all hashed files and tables.
Run the master sequence and verify that all of the jobs ran successfully.


Fig. 13-7
