ETL Testing

Uploaded by

jayamalapv
Copyright
© © All Rights Reserved
ETL Testing Tutorial

This ETL testing tutorial covers basic and advanced concepts of ETL testing. It is designed for beginners and professionals.
ETL tools extract the data from all the different data sources, transform it (by joining fields, applying calculations, removing incorrect data fields, etc.), and load it into a data warehouse.
ETL testing is done to ensure that the data loaded from a source to the destination after business transformation is accurate. It also involves verifying the data at the various stages used between source and destination.

ETL (Extraction, Transformation and Loading) Testing


ETL testing is done before data is moved to a production data warehouse system. It is also called table balancing or production reconciliation. ETL testing differs from database testing in its scope and in the steps followed during testing.
The aim of ETL testing is to confirm that the data loaded from a source to the destination after transformation is accurate. This means verifying the data at each stage used between source and destination.

Process for ETL testing


Like other testing processes, ETL testing goes through a series of stages.
ETL testing is performed in five stages:
1. Identify data sources and requirements.
2. Data acquisition.
3. Implement dimensional modeling and business logic.
4. Build and populate data.
5. Build reports.
Types of ETL testing
The types of ETL testing are:
1. New Data Warehouse Testing: The data warehouse is built and verified from the ground up. The input is taken from the customer's requirements and the different data sources, and the new data warehouse is built and verified with the help of ETL tools.
Here are the responsibilities of the different groups:
o Business Analyst: gathers and documents the requirements.
o Infrastructure People: set up the test environment.
o QA Testers: develop the test plans and test scripts, and then execute them.
o Developers: perform the unit test for each module.
o Database Administrator: tests for performance and also for stress.
o Users: do functional testing, which includes UAT (User Acceptance Testing).
2. Production Validation Testing: This testing is done on data as it is moved to production systems. The Informatica Data Validation Option provides ETL testing automation and management capabilities to help ensure that the data does not compromise production systems.
3. Source to Target Testing (Validation): This type of testing is done to validate that the transformed data values match the expected data values.
4. Application Upgrade: This type of ETL test is generated automatically, which saves test development time. It checks that the data extracted from an older application is precisely the same as the data in the new application.
5. Metadata Testing: Metadata testing includes checking the data type, the length of data, and the indexes/constraints.
6. Data Accuracy Testing: This testing is done to ensure that the data is accurately loaded and transformed as expected.
7. Data Transformation Testing: In many cases, data transformation testing cannot be achieved by writing one source SQL query and comparing the output with the target. Multiple SQL queries need to be run for each row to verify the transformation rules.
8. Data Quality Testing: Data quality tests include syntax tests and reference tests. Data quality testing is done to avoid errors caused by a date or an order number during the business process.
Syntax tests: report dirty data based on invalid characters, character pattern, incorrect upper or lower case order, etc.
Reference tests: check the data according to the data model.
For example, data quality testing of a Customer ID includes a number check, precision check, date check, etc.
9. Incremental ETL Testing: This testing is done to check the data integrity of old and new data when new data is added. Incremental testing verifies that the system still processes data correctly after inserts and updates during an incremental ETL process.
10. GUI/Navigation Testing: This testing is done to check the navigation and GUI aspects of the front-end reports.
11. Migration Testing: In this testing, the customer has an existing data warehouse and ETL is already performing the job, but the customer is looking for tools to improve efficiency. It includes these steps:
o Designing the validation tests
o Setting up the test environment
o Executing the validation tests
o Reporting the bugs
12. Change Requests: In this case, data is added to an existing data warehouse. A situation may arise where the customer needs to change the present business rule or integrate a new rule.
13. Report Testing: Reports are the final result of the data warehouse. They should be tested by validating the data and the layout in the report. Reports are an essential resource for making vital business decisions.
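As an illustration of metadata testing (type 5 above), the following minimal sketch compares the declared column types of a source and a target table. It assumes an in-memory SQLite database, and the table names `src_customer` and `tgt_customer` are hypothetical; a real warehouse would expose this information through its own catalog views.

```python
import sqlite3

def column_metadata(conn, table):
    """Return {column_name: declared_type} using SQLite's PRAGMA table_info."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk)
    return {row[1]: row[2].upper() for row in rows}

def metadata_mismatches(conn, source, target):
    """List (column, source_type, target_type) where the target differs or lacks the column."""
    src = column_metadata(conn, source)
    tgt = column_metadata(conn, target)
    issues = []
    for name, src_type in src.items():
        if name not in tgt:
            issues.append((name, src_type, None))
        elif tgt[name] != src_type:
            issues.append((name, src_type, tgt[name]))
    return issues

# Hypothetical tables: the target declares a shorter cust_name column
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_customer (cust_id INTEGER, cust_name VARCHAR(50), city VARCHAR(30))")
conn.execute("CREATE TABLE tgt_customer (cust_id INTEGER, cust_name VARCHAR(20), city VARCHAR(30))")

mismatches = metadata_mismatches(conn, "src_customer", "tgt_customer")
print(mismatches)
```

Any row in `mismatches` points to a column whose type or length should be reviewed against the mapping document.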
Tasks performed in ETL Testing
Tasks involved in ETL testing are:
o Understanding the data used for reporting
o Reviewing the data model
o Mapping the source to the target
o Checking the data in the source
o Validating packages and schema
o Verifying the data in the target system
o Verifying aggregation rules and data transformation calculations
o Comparing data between the target system and the data source
o Examining quality and data integrity in the target system
o Performance testing of data
Differences between the ETL and the Database Testing
ETL testing and database testing both involve data validation, but they are not the same. ETL testing is usually performed on data in a data warehouse, whereas database testing is performed on transactional systems. Data comes into the transactional database from different applications.
Operations performed in ETL Testing
ETL testing involves the following operations:
o Validation of data movement from the source to the target system.
o Data count verification in the source and target system.
o Verification that extraction and transformation happen as per requirement and expectation.
o Verification that table relations, joins, and keys are preserved during transformation.
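The count verification and key preservation checks listed above can be sketched as follows. This is a minimal example that assumes an in-memory SQLite database; the tables `src_orders` and `tgt_orders` and their contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE tgt_orders (order_id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO src_orders VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO tgt_orders VALUES (1, 10.0), (2, 20.0);
""")

def row_count(conn, table):
    """Count the rows in a table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

source_count = row_count(conn, "src_orders")
target_count = row_count(conn, "tgt_orders")

# Keys that exist in the source but never arrived in the target
missing_keys = [row[0] for row in conn.execute("""
    SELECT s.order_id
    FROM src_orders s
    LEFT JOIN tgt_orders t ON t.order_id = s.order_id
    WHERE t.order_id IS NULL
    ORDER BY s.order_id
""")]

print(source_count, target_count, missing_keys)
```

A count mismatch only says that something is missing; the left join identifies which keys were lost.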
Operations performed in Database Testing
Database testing focuses on data accuracy, the correctness of data, and valid values.
Database testing performs the following operations:
o It verifies that the columns in a table hold valid values.
o It verifies whether primary and foreign keys are maintained.
o It verifies whether data is missing in a column. Here, we check whether there are any null values in columns that should have a valid value.
o It verifies the accuracy of the data in columns.
For example, a "number of months" column shouldn't have a value greater than 12.
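The null check and the valid-value check described above might look like this in practice. The sketch assumes an in-memory SQLite database and a hypothetical `subscription` table; the upper bound of 12 for the month column follows the example in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subscription (
        sub_id INTEGER PRIMARY KEY,
        customer_name TEXT,
        num_months INTEGER
    );
    INSERT INTO subscription VALUES (1, 'Asha', 6), (2, NULL, 12), (3, 'Ravi', 14);
""")

# Null check: a column that should always carry a value
null_names = conn.execute(
    "SELECT COUNT(*) FROM subscription WHERE customer_name IS NULL"
).fetchone()[0]

# Valid-value check: the number of months must lie between 1 and 12
bad_months = [row[0] for row in conn.execute(
    "SELECT sub_id FROM subscription WHERE num_months NOT BETWEEN 1 AND 12 ORDER BY sub_id"
)]

print(null_names, bad_months)
```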

Function: ETL Testing compared with Database Testing

Primary Goal
o ETL Testing: performed for data extraction, transformation and loading for BI reporting.
o Database Testing: performed to validate and integrate the data.

Business Need
o ETL Testing: used for information, forecasting, and analytical reporting.
o Database Testing: used to integrate the data from multiple applications and servers.

Applicable System
o ETL Testing: contains historic data that cannot be used in the business flow environment.
o Database Testing: contains the transactional system where the flow of business occurs.

Modeling
o ETL Testing: the multidimensional method is used.
o Database Testing: the ER method is used.

Database Type
o ETL Testing: applied to OLAP systems.
o Database Testing: applied to OLTP systems.

Data Type
o ETL Testing: uses de-normalized data with fewer joins, more indexes, and aggregations.
o Database Testing: uses normalized data with joins.

Common Tools
o ETL Testing: QuerySurge, Informatica, etc.
o Database Testing: QTP, Selenium, etc.
ETL Performance Testing
ETL performance testing is used to check whether an ETL system can handle the expected load of multiple users and transactions. Performance testing involves the server-side workload on the ETL system.
How to perform ETL performance testing?
The following steps are followed to test the performance of the ETL system:
Step 1: Find the load that was transformed in production.
Step 2: Create new data of the same load, or move it from production to a local server.
Step 3: Disable the ETL until the required code is generated.
Step 4: Count the needed data from the database table.
Step 5: Note down the last run of the ETL and enable the ETL. It will then transform the entire load that was created.
Step 6: After the ETL completes, count the created data.
Essential performance measures to note:
o The total time taken to transform the load.
o Whether performance has improved or dropped.
o Whether the entire expected load was extracted and transferred.
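Steps 4 to 6 and the measures above can be sketched in a few lines. This is only an illustration under stated assumptions: `run_etl` is a toy stand-in for a real ETL job, the tables `stg_sales` and `tgt_sales` are hypothetical, and the data lives in an in-memory SQLite database.

```python
import sqlite3
import time

def run_etl(conn):
    """Toy stand-in for a real ETL job: convert staged amounts to cents and load them."""
    conn.execute("""
        INSERT INTO tgt_sales (sale_id, amount_cents)
        SELECT sale_id, CAST(ROUND(amount * 100) AS INTEGER) FROM stg_sales
    """)
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_sales (sale_id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("CREATE TABLE tgt_sales (sale_id INTEGER PRIMARY KEY, amount_cents INTEGER)")
conn.executemany(
    "INSERT INTO stg_sales VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1, 10001)],
)

# Count before the run, time the run, then count again
expected_load = conn.execute("SELECT COUNT(*) FROM stg_sales").fetchone()[0]
count_before = conn.execute("SELECT COUNT(*) FROM tgt_sales").fetchone()[0]

start = time.perf_counter()
run_etl(conn)
elapsed = time.perf_counter() - start

count_after = conn.execute("SELECT COUNT(*) FROM tgt_sales").fetchone()[0]
rows_loaded = count_after - count_before

print(f"Loaded {rows_loaded} of {expected_load} rows in {elapsed:.4f} s")
```

Recording `elapsed` for each run makes it possible to see whether performance has improved or dropped between builds.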
Data Accuracy in ETL Testing
In ETL testing, we focus on data accuracy to ensure that the data is accurately loaded into the target system as per our expectations.
The following steps are used to check data accuracy:
Value Comparison: In value comparison, we compare the data in the source and target systems with minimum or no transformation. This can be done with various ETL tools, for example the Source Qualifier Transformation in Informatica.
An Expression Transformation can also be used in data accuracy testing. Set operators can be used in SQL statements to check data accuracy between the source and target systems.
Check the columns of critical data: Critical data columns can be checked by comparing the distinct values in the source and the target system.
SELECT cust_name, order_id, city, COUNT(*) FROM customer GROUP BY cust_name, order_id, city;
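One way to use the query above is to run it against both systems and compare the results as sets. The sketch below assumes an in-memory SQLite database; the source and target copies of the customer table are given the hypothetical names `src_customer` and `tgt_customer`.

```python
import sqlite3

# The grouping query from the text, parameterized by table name
QUERY = """
    SELECT cust_name, order_id, city, COUNT(*)
    FROM {table}
    GROUP BY cust_name, order_id, city
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (cust_name TEXT, order_id INTEGER, city TEXT);
    CREATE TABLE tgt_customer (cust_name TEXT, order_id INTEGER, city TEXT);
    INSERT INTO src_customer VALUES ('Asha', 1, 'Pune'), ('Ravi', 2, 'Delhi');
    INSERT INTO tgt_customer VALUES ('Asha', 1, 'Pune'), ('Ravi', 2, 'Delhi'), ('Ravi', 2, 'Delhi');
""")

def grouped_counts(conn, table):
    """Return the grouped rows of a table as a set of tuples."""
    return set(conn.execute(QUERY.format(table=table)).fetchall())

source_rows = grouped_counts(conn, "src_customer")
target_rows = grouped_counts(conn, "tgt_customer")

# Set difference plays the role of a SQL EXCEPT / MINUS set operator
only_in_source = sorted(source_rows - target_rows)
only_in_target = sorted(target_rows - source_rows)

print(only_in_source)
print(only_in_target)
```

Here the target loaded one order twice, so the same key appears on both sides with different counts.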
ETL testing in data transformation
Testing data transformation is quite complex because it cannot be achieved by writing a single SQL query and comparing the output with the target. To test a data transformation, we have to write multiple SQL queries for each row to verify the transformation rules.
To test data transformation successfully, we have to pick sufficient sample data from the source system and apply the transformation rules to it.
The significant steps to perform ETL testing for data transformation are:
Step 1. Create scenarios of input data and expected results, and validate them with the business customer. This is a good way to gather requirements during design, and the scenarios can be used as part of testing.
Step 2. Create the test data according to the scenarios. The ETL developer can automate the entire process of populating the datasets from a scenario spreadsheet, which permits versatility and mobility because the scenarios are likely to change.
Step 3. Use data profiling to compare the range and distribution of values in each field between the source and the target data.
Step 4. Validate the accurate processing of ETL-generated fields, for example surrogate keys.
Step 5. Validate that the data types within the warehouse are the same as those specified in the data model or design.
Step 6. Create data scenarios between tables that test referential integrity.
Step 7. Validate the parent-to-child relationships in the data.
Step 8. Finally, test the lookup transformation. The lookup query should be straightforward, without any aggregation, and is expected to return only one value per row of the source table. The lookup table can be joined directly in the source qualifier. If that is not the case, write a query that joins the lookup table with the main table in the source and compare the data in the corresponding column in the target.
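The lookup check in Step 8 can be sketched as a join between the source table, the lookup table, and the target table. The example assumes an in-memory SQLite database, and the tables `src_order`, `lkp_country`, and `tgt_order` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_order (order_id INTEGER PRIMARY KEY, country_code TEXT);
    CREATE TABLE lkp_country (country_code TEXT PRIMARY KEY, country_name TEXT);
    CREATE TABLE tgt_order (order_id INTEGER PRIMARY KEY, country_name TEXT);
    INSERT INTO src_order VALUES (1, 'IN'), (2, 'US'), (3, 'DE');
    INSERT INTO lkp_country VALUES ('IN', 'India'), ('US', 'United States'), ('DE', 'Germany');
    INSERT INTO tgt_order VALUES (1, 'India'), (2, 'United States'), (3, 'France');
""")

# Re-derive the expected lookup value from the source and compare it with the target
mismatches = conn.execute("""
    SELECT s.order_id, l.country_name AS expected, t.country_name AS actual
    FROM src_order s
    JOIN lkp_country l ON l.country_code = s.country_code
    JOIN tgt_order t ON t.order_id = s.order_id
    WHERE l.country_name <> t.country_name
    ORDER BY s.order_id
""").fetchall()

print(mismatches)
```

Because the lookup key is a primary key here, each source row matches at most one lookup row, which is the single-value behavior the step expects.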
ETL Test Cases
The objective of ETL testing is to assure that the data loaded from source to destination after business transformation is accurate.
ETL testing applies to different tools and databases in the information management industry.
During ETL testing, two documents are always used by the ETL tester:
1. ETL mapping sheets: ETL mapping sheets contain all the information about the source and destination tables, including every column and its lookup in the reference tables. An ETL tester needs to be comfortable with SQL, because ETL testing may involve writing large queries with multiple joins to validate the data at any stage of the ETL. Mapping sheets are a significant help when writing queries for data verification.
2. DB Schema of Source (Target): It should be kept accessible to verify any detail in the mapping sheet.
ETL Test Scenarios and Test Cases:

ETL Test Scenario: ETL Test Cases

Mapping doc validation
o Verify whether the mapping document provides the ETL information. A change log should be maintained in every mapping doc.

Validation
o Validate the source and target table structure against the corresponding mapping doc.
o The data type of the source and target tables should be the same.
o The length of the data types of both source and target should be the same.
o Verify that the data field type and format are specified.
o Check the length of the source data type against the length of the target data type.

Constraint Validation
o Constraints should be defined for a specific table as per expectation.

Data Consistency Issues
o The data type and length of a particular attribute may vary across files or tables though the semantic definition is the same.
o Misuse of integrity constraints.

Completeness Issues
o Ensure that all the expected data is loaded into the target table.
o Compare the record counts between source and target.
o Check the rejected records.
o Data should not be truncated in the columns of the target table.
o Perform boundary value analysis.
o Compare the unique values of critical fields between the data loaded in the warehouse and the source data.

Correctness Issues
o This scenario covers data that is inaccurately recorded.
o It also covers data that is null, non-unique, or out of range.

Transformation
o This scenario is used to check the transformation.

Data Quality
o Number check: check the number and validate it.
o Date check: dates should follow the date format, and the format should be the same for all the records.
o Precision check
o Data check
o Null check

Null Validate
o Verify the null values where "Not Null" is specified for a specific column.

Duplicate Check
o Validate that the unique key, the primary key, and any other column that should be unique as per the business requirement have no duplicate rows.
o Check whether duplicate values exist in any column that is extracted from multiple source columns and combined into one column.
o As per the client requirements, ensure that there are no duplicates in a combination of multiple columns within the target only.

Date Validation
o Date values are used in many areas of ETL development, for example to know the row creation date.
o Identify the existing records from the ETL development perspective.
o Sometimes updates and inserts are generated based on the date values.

Data Cleanness
o Unnecessary columns should be removed before loading into the staging area.
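Two of the test cases above, the duplicate check and the truncation check, can be expressed as short queries. This is a minimal sketch that assumes an in-memory SQLite database; the tables `src_customer` and `tgt_customer` and their rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (cust_id INTEGER, cust_name TEXT);
    CREATE TABLE tgt_customer (cust_id INTEGER, cust_name TEXT);
    INSERT INTO src_customer VALUES (1, 'Alexandria'), (2, 'Bo');
    INSERT INTO tgt_customer VALUES (1, 'Alexan'), (2, 'Bo'), (2, 'Bo');
""")

# Duplicate check: keys that occur more than once in the target
duplicates = conn.execute("""
    SELECT cust_id, COUNT(*)
    FROM tgt_customer
    GROUP BY cust_id
    HAVING COUNT(*) > 1
""").fetchall()

# Truncation check: target values that are shorter than their source values
truncated = conn.execute("""
    SELECT DISTINCT s.cust_id, s.cust_name, t.cust_name
    FROM src_customer s
    JOIN tgt_customer t ON t.cust_id = s.cust_id
    WHERE LENGTH(t.cust_name) < LENGTH(s.cust_name)
""").fetchall()

print(duplicates)
print(truncated)
```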
Types of ETL Bugs
Type of ETL Bug: Description

User Interface Bugs: These bugs are related to the Graphical User Interface of an application, such as color, font style, navigation, etc.

Input-Output Bugs: In this type of bug, the application starts taking invalid values, and the valid values are rejected.

Boundary Value Analysis Bugs: These bugs concern the minimum and maximum values.

Calculation Bugs: Calculation bugs are mathematical errors, and most of the time the final output is wrong.

Load Condition Bugs: These bugs prevent multiple users from working at once, or prevent the system from handling the load the user expects.

Race Condition Bugs: In this kind of bug, the system does not run properly and starts crashing or hanging.

Equivalence Class Partitioning Bugs: This type of bug results in invalid types.

Version Control Bugs: These bugs usually occur in regression testing, and no version information is available.

Hardware Bugs: In this type of bug, the device does not respond to the application as expected.

Help Source Bugs: These bugs are mistakes in the help documents.
Responsibility of ETL tester
An ETL tester is responsible for validating the data sources, the extraction of data, the application of transformation logic, and the loading of data into the target table.
The responsibilities of an ETL tester are:
Verify the table in the source system. It involves the following types of checks:
o Count check
o Data type check
o Reconcile records with the source data
o Ensure no spam data is loaded
o Remove duplicate data
o Check that all the keys are in place
Apply Transformation Logic
Transformation logic is applied before loading the data. It involves the following operations:
o Check the record count before and after the transformation logic is applied.
o Validate the data flow from the staging area to the intermediate tables.
o Check the data threshold validation; for example, the age value shouldn't be more than 100.
o Check the surrogate key.
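The threshold and surrogate key checks above can be sketched as follows. The example assumes an in-memory SQLite database and a hypothetical dimension table `dim_person` whose surrogate key column is `person_sk`; the age limit of 100 follows the example in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_person (person_sk INTEGER, person_id TEXT, age INTEGER);
    INSERT INTO dim_person VALUES
        (1, 'P-1', 34),
        (2, 'P-2', 130),
        (2, 'P-3', 45),
        (NULL, 'P-4', 28);
""")

# Threshold validation: age must not be more than 100
age_violations = [row[0] for row in conn.execute(
    "SELECT person_id FROM dim_person WHERE age > 100 ORDER BY person_id"
)]

# Surrogate key check: every row needs a key, and no key may repeat
duplicate_keys = [row[0] for row in conn.execute("""
    SELECT person_sk
    FROM dim_person
    WHERE person_sk IS NOT NULL
    GROUP BY person_sk
    HAVING COUNT(*) > 1
""")]
missing_keys = conn.execute(
    "SELECT COUNT(*) FROM dim_person WHERE person_sk IS NULL"
).fetchone()[0]

print(age_violations, duplicate_keys, missing_keys)
```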
Data Loading
Data is loaded from the staging area to the target systems. It involves the following operations:
o Check that the aggregate values and calculated measures are loaded in the fact table.
o During the loading of the data, check the modeling views based on the target table.
o Check whether CDC (change data capture) has been applied to the incremental load table.
o Check the data in the dimension table and review the history of the table.
o Check that the BI reports based on the loaded fact and dimension tables match the expected results.
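The aggregate check in the list above can be sketched by recomputing the totals from the staging data and comparing them with what was loaded. The example assumes an in-memory SQLite database; `stg_sales` and `agg_sales` are hypothetical staging and aggregate tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_sales (region TEXT, amount INTEGER);
    CREATE TABLE agg_sales (region TEXT PRIMARY KEY, total_amount INTEGER);
    INSERT INTO stg_sales VALUES ('North', 100), ('North', 50), ('South', 70);
    INSERT INTO agg_sales VALUES ('North', 150), ('South', 60);
""")

# Recompute each total from staging and report regions where the loaded value differs
aggregate_mismatches = conn.execute("""
    SELECT s.region, s.expected_total, a.total_amount
    FROM (
        SELECT region, SUM(amount) AS expected_total
        FROM stg_sales
        GROUP BY region
    ) s
    LEFT JOIN agg_sales a ON a.region = s.region
    WHERE a.total_amount IS NULL OR a.total_amount <> s.expected_total
    ORDER BY s.region
""").fetchall()

print(aggregate_mismatches)
```

The left join also surfaces regions that are present in staging but missing from the aggregate table.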
Testing of ETL Tools
ETL testers are required to test the tools as well as the test cases. It involves the following operations:
o Test the ETL tool and its functions
o Test the ETL Data Warehouse system
o Create, design and execute the test cases and test plan
o Test the flat file data transfer
Advantages of ETL Testing
Benefits of ETL testing are given below:
1. ETL can extract or receive data from multiple data sources at the same time.
2. ETL can load the data from heterogeneous sources into a single generalized target or into different targets at the same time.
3. ETL can load different types of targets at the same time.
4. ETL can extract the required business data from various sources and load it into different targets in the desired format.
5. ETL can perform any data transformation according to the business need.
Disadvantages of ETL Testing
Disadvantages of ETL testing are given below:
1. One of the main disadvantages is that the user must be a developer or database analyst to use it.
2. It is not ideal for real-time or on-demand access, where a fast response is needed.
3. An ETL process can take months to put in place.
4. It is challenging to keep up with changing requirements.
Future Scope of ETL Testing
The outlook for ETL testing is good. ETL tools such as Informatica PowerCenter, Oracle Data Integrator, Microsoft SQL Server Integration Services, SAS, and IBM InfoSphere Information Server are in high demand in the industry, so the scope of ETL testing is expected to increase in the future.
Conclusion
ETL testing is a type of business testing in which developers, business analysts, and DBAs are involved. ETL testing requires knowledge of the SDLC, and the tester should know how to write SQL queries. Many businesses see it as a challenge, but it is beneficial for the business. It is essential for protecting the data from loss, and for keeping the data up to date to meet the requirements of the market.
