

Performance Tuning


What other performance tuning have you done in your last project to increase the performance of slowly running jobs?


Staged the data coming from ODBC/OCI/DB2UDB stages, or any database, on the server using hashed/sequential files, both for optimum performance and for data recovery in case the job aborts.
Tuned the 'Array Size' and 'Rows per Transaction' values of the OCI stage for faster inserts, updates and selects.
Tuned the 'Project Tunables' in Administrator for better performance.
Used sorted data for Aggregator.
Sorted the data as much as possible in the database and reduced the use of the DataStage Sort stage for better job performance.
Removed unused data from the source as early as possible in the job.
Worked with the DBA to create appropriate indexes on tables for better performance of DataStage queries.
Converted some of the complex joins/business logic in DataStage to stored procedures called from DataStage for faster execution of the jobs.
If an input file has an excessive number of rows and can be split up, then use standard logic to run jobs in parallel.
Before writing a routine or a transform, make sure that the required functionality is not already available in one of the standard routines supplied in the SDK or DS Utilities categories.
Constraints are generally CPU intensive and take a significant amount of time to
process. This may be the case if the constraint calls routines or external macros but if it
is inline code then the overhead will be minimal.
Try to have the constraints in the 'Selection' criteria of the job itself. This eliminates unnecessary records before joins are made.
Tuning should occur on a job-by-job basis.
Use the power of DBMS.
Try not to use a Sort stage when you can use an ORDER BY clause in the database.
Using a constraint to filter a record set is much slower than performing a SELECT … WHERE… (a sketch of this appears after this list).
Make every attempt to use the bulk loader for your particular database. Bulk loaders are
generally faster than using ODBC or OLE.
Minimize the use of the Transformer stage (use Copy, Modify, Filter, or Row Generator instead).
Use SQL code while extracting the data.
Handle nulls explicitly.
Minimize the warnings.
Reduce the number of lookups in a job design.
Use no more than 20 stages in a job.
Use an IPC stage between two passive stages; this reduces processing time.
Drop indexes before loading data and recreate them after loading data into the tables (see the second sketch after this list).
Generally we cannot avoid lookups if the requirements make them compulsory. There is no hard limit on the number of stages (such as 20 or 30), but we can break the job into smaller jobs and then use Data Set stages to pass the data between them.
The IPC stage is provided in server jobs, not in parallel jobs.
Check the write cache of the hash file. If the same hash file is used both for lookup and as a target, disable this option.
If the hash file is used only for lookup, then enable "Preload to memory". This will improve the performance. Also, check the order of execution of the routines.
Don't use more than 7 lookups in the same Transformer; introduce new Transformers if the number exceeds 7.
Use the 'Preload to memory' option on the hash file output.
Use the write cache option on the hash file input.
Write into the error tables only after all the transformer stages.
Reduce the width of the input record - remove the columns that you will not use.
Cache the hash files you are reading from and writing into. Make sure your cache is big enough to hold the hash files.
Use ANALYZE.FILE or HASH.HELP to determine the optimal settings for your hash
files.
This would also minimize overflow on the hash file.
If possible, break the input into multiple threads and run multiple instances of the job.
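
As referenced in the list above, here is a minimal sketch of pushing the filter and the sort into the source extraction SQL instead of using a Transformer constraint and a Sort stage. The table and column names are hypothetical:

    -- Illustrative only: hypothetical table and columns.
    -- The WHERE clause replaces a downstream constraint and the
    -- ORDER BY replaces a Sort stage feeding a sorted Aggregator.
    SELECT   order_id, customer_id, order_amount
    FROM     sales_orders
    WHERE    order_status = 'OPEN'
    ORDER BY customer_id;

The job then receives only the rows it needs, already ordered on the grouping key.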
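And a minimal sketch of the 'drop indexes before loading' tip, assuming a hypothetical index and table; the statements would typically go in the before-load and after-load SQL or a DBA script:

    -- Illustrative only: hypothetical index and table names.
    -- Before the bulk load:
    DROP INDEX idx_orders_customer;
    -- ... bulk load into sales_orders runs here ...
    -- After the load completes:
    CREATE INDEX idx_orders_customer ON sales_orders (customer_id);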

How can we improve the performance of DataStage jobs?


Performance and tuning of DS jobs:
1. Establish baselines.
2. Avoid using only one flow for tuning/performance testing.
3. Work in increments.
4. Evaluate data skew (a quick check is sketched below this list).
5. Isolate and solve.
6. Distribute file systems to eliminate bottlenecks.
7. Do not involve the RDBMS in initial testing.
8. Understand and evaluate the tuning knobs available.
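
For step 4, one quick way to evaluate data skew is to count rows per candidate partitioning or join key in the source database. This is only a sketch; the table and column names are hypothetical:

    -- Illustrative only: if a few keys hold most of the rows, the data is
    -- skewed and those keys will create hot partitions or slow lookups.
    SELECT   customer_id, COUNT(*) AS row_count
    FROM     source_orders
    GROUP BY customer_id
    ORDER BY row_count DESC;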

What is the meaning of the performance tuning technique? Example?


Performance tuning means taking some action to increase the performance of a slowly running job, for example:
1) Use the Link Partitioner and Link Collector stages to speed up performance.
2) Use sorted data for aggregation.
3) Use a sorter on the source side and aggregation on the target side.
4) Tune the OCI stage's 'Array Size' and 'Rows per Transaction' values for faster inserts, updates and selects.
5) Do not use an IPC stage on the target side.
Is this only related to server jobs? In Parallel Extender these things are taken care of by the stages.

ENVIRONMENT VARIABLE: CREATION

Here is the full FAQ on this topic:

Creating project specific environment variables
- Start up DataStage Administrator.
- Choose the project and click the "Properties" button.
- On the General tab click the "Environment..." button.
- Click on the "User Defined" folder to see the list of job specific environment variables.
There are two types of variables: string and encrypted. If you create an encrypted environment variable it will appear as the string "*******" in the Administrator tool and will appear as junk text when saved to the DSParams file or when displayed in a job log. This provides robust security of the value.

Migrating Project Specific Job Parameters
It is possible to set or copy job specific environment variables directly to the DSParams file in the project directory. There is also a DSParams.keep file in this directory, and if you make manual changes to the DSParams file you will find Administrator can roll back those changes to DSParams.keep. It is possible to copy project specific parameters between projects by overwriting the DSParams and DSParams.keep files. It may be safer to just replace the User Defined section of these files and not the General and Parallel sections.

Environment Variables as Job Parameters
- Open up a job.
- Go to Job Properties and move to the Parameters tab.
- Click on the "Add Environment Variables..." button.
- Set the Default value of the new parameter to "$PROJDEF".
When the job parameter is first created it has a default value the same as the Value entered in the Administrator. By changing this value to $PROJDEF you instruct DataStage to retrieve the latest Value for this variable at job run time. If you have an encrypted environment variable it should also be an encrypted job parameter. Set the value of these encrypted job parameters to $PROJDEF; you will need to type it in twice in the password entry box.

Using Environment Variable Job Parameters
These job parameters are used just like normal parameters by adding them to stages in your job enclosed by the # symbol:
Database=#$DW_DB_NAME#
Password=#$DW_DB_PASSWORD#
File=#$PROJECT_PATH#/#SOURCE_DIR#/Customers_#PROCESS_DATE#.csv
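
As a further illustration of the same #-delimited substitution, a job parameter can also be referenced inside user-defined SQL in a database stage. This is only a sketch; the table and column names are hypothetical, and #PROCESS_DATE# is the ordinary job parameter from the example above:

    -- Illustrative only: #...# parameters are substituted at run time.
    SELECT customer_id, customer_name
    FROM   customers
    WHERE  load_date = '#PROCESS_DATE#';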

These are variables used at the project or job level. We can use them to configure the job, i.e. we can associate the configuration file (without this you cannot run your job) or increase the sequential file or data set read/write buffers.

ex: $APT_CONFIG_FILE

Like the above, there are many environment variables. Go to Job Properties, click on the Parameters tab, then click on "Add Environment Variable" to see most of the environment variables.
