Posted: Jun 13, 2020
What is ETL?
ETL (Extract, Transform, Load) is the process in which data is fetched from source systems, transformed according to business rules, and loaded into a target system referred to as a data warehouse. A data warehouse is a large collection of integrated data that aids the business decision-making process. It is a part of business intelligence.
What is ETL Testing?
ETL testing is the process of validating the data loaded into data warehouse (DWH) systems, using queries and the contents of source files.
Why do organizations need a Data Warehouse?
Organizations with well-organized IT practices are looking to make the next level of technology transformation. They are now trying to make themselves much more operational with data that is easy to interoperate.
Data is the most important part of any organization, whether it is everyday data or historical data. Data is the backbone of any report, and reports are the baseline on which all vital management decisions are taken.
Most companies are taking a step forward by building a data warehouse to store and monitor real-time data as well as historical data. Building an efficient data warehouse is not an easy job. Many organizations have distributed departments with different applications running on distributed technology.
An ETL tool is used to integrate the different data sources from these departments. The ETL tool works as an integrator: it extracts data from the different sources, transforms it into the preferred format based on the business transformation rules, and loads it into a cohesive database known as the data warehouse.
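The extract, transform, and load steps described above can be sketched in a few lines. This is only an illustration: the `dw_sales` table, the field names, and the business rule (trim and upper-case the name, convert cents to a decimal amount) are assumptions made for the example, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3


def extract(source_rows):
    """Extract: a real job would read from source systems; here the source is a list of dicts."""
    return list(source_rows)


def transform(rows):
    """Transform: apply the assumed business rule (clean the name, convert cents to units)."""
    return [
        {
            "id": r["id"],
            "name": r["name"].strip().upper(),
            "amount": r["amount_cents"] / 100,
        }
        for r in rows
    ]


def load(conn, rows):
    """Load: write the transformed rows into the hypothetical dw_sales target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dw_sales (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO dw_sales (id, name, amount) VALUES (:id, :name, :amount)", rows
    )
    conn.commit()


def run_etl(conn, source_rows):
    """Run the three steps in order: extract, then transform, then load."""
    load(conn, transform(extract(source_rows)))
```

Calling `run_etl(sqlite3.connect(":memory:"), rows)` with a list of source dicts fills the target table. Keeping the three steps as separate functions makes it possible to test the transformation rules on their own.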
ETL Testing Techniques
- Data transformation Testing: Verify that data is transformed correctly according to various business requirements and rules.
- Source to Target Count Testing: Make sure that the count of records loaded into the target matches the expected count.
- Source to Target Data Testing: Make sure that all projected data is loaded into the data warehouse without any data loss or truncation.
- Data Quality Testing: Make sure that the ETL application appropriately rejects, replaces with default values, and reports invalid data.
- Performance Testing: Make sure that data is loaded into the data warehouse within the prescribed and expected time frames to confirm improved performance and scalability.
- Production Validation Testing: Validate the data in the production system and compare it against the source data.
- Data Integration Testing: Make sure that the data from various sources has been loaded properly to the target system and all the threshold values are checked.
- Application Migration Testing: Ensure that the ETL application still works correctly after it is moved to a new box or platform.
- Data and Constraint Check: Test the datatype, length, index, constraints, etc.
- Duplicate Data Check: Test if there is any duplicate data present in the target systems. Duplicate data can lead to wrong analytical reports.
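Several of the techniques above reduce to simple SQL queries. The sketch below shows one possible form of the source to target count check, the duplicate data check, and the source to target data check, using Python's built-in `sqlite3` module. The helper names and the idea that source and target tables sit in the same database are assumptions for illustration; in a real project the source and target usually live in different systems.

```python
import sqlite3

# Note: table and column names are inserted into the SQL text directly,
# so they must come from trusted test configuration, never from user input.


def count_matches(conn, source_table, target_table):
    """Source to target count check: True when both tables hold the same number of rows."""
    src = conn.execute(f"SELECT COUNT(*) FROM {source_table}").fetchone()[0]
    tgt = conn.execute(f"SELECT COUNT(*) FROM {target_table}").fetchone()[0]
    return src == tgt


def find_duplicates(conn, table, key_column):
    """Duplicate data check: return (key, occurrences) for every key loaded more than once."""
    query = (
        f"SELECT {key_column}, COUNT(*) FROM {table} "
        f"GROUP BY {key_column} HAVING COUNT(*) > 1 ORDER BY {key_column}"
    )
    return conn.execute(query).fetchall()


def missing_in_target(conn, source_table, target_table, columns):
    """Source to target data check: rows present in the source but absent from the target."""
    cols = ", ".join(columns)
    query = (
        f"SELECT {cols} FROM {source_table} "
        f"EXCEPT SELECT {cols} FROM {target_table}"
    )
    return conn.execute(query).fetchall()
```

A count check alone can pass while the data is still wrong, for example when one record is dropped and another is loaded twice. That is why the count check is normally combined with the duplicate and data comparison checks.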
Difference between Database and Data Warehouse Testing
There is a popular misunderstanding that database testing and data warehouse testing are similar, while in fact the two take different directions.
Database testing is done using a smaller scale of data, normally with OLTP (online transaction processing) databases, while data warehouse testing is done with large volumes of data involving OLAP (online analytical processing) databases.
In database testing, data is normally injected consistently from uniform sources, while in data warehouse testing most of the data comes from different kinds of data sources, which are sequentially inconsistent.
We generally perform only CRUD (create, read, update, and delete) operations in database testing, while in data warehouse testing we use read-only (SELECT) operations.
Normalized databases are used in database testing, while denormalized databases are used in data warehouse testing.
There are a number of universal verifications that have to be carried out for any kind of data warehouse testing.
Below is the list of checks that are treated as essential for validation in this testing:
- Verify that data transformation from source to destination works as expected
- Verify that expected data is added to the target system
- Verify that all DB fields and field data are loaded without any truncation
- Verify data checksum for record count match
- Verify that for rejected data proper error logs are generated with all details
- Verify NULL value fields
- Verify that duplicate data is not loaded
- Verify data integrity
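As a rough illustration, three of these verifications (NULL fields, truncation, and checksum) could be scripted as follows. The function names are hypothetical, SHA-256 is just one reasonable choice of checksum, and the sketch assumes the source and target tables are reachable through the same SQLite connection and share a key column.

```python
import hashlib
import sqlite3

# Note: table and column names are inserted into the SQL text directly,
# so they must come from trusted test configuration, never from user input.


def null_count(conn, table, column):
    """NULL check: number of rows where the given column is NULL."""
    query = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(query).fetchone()[0]


def truncated_rows(conn, source_table, target_table, key, column):
    """Truncation check: keys whose target value is shorter than the source value."""
    query = (
        f"SELECT s.{key} FROM {source_table} s "
        f"JOIN {target_table} t ON s.{key} = t.{key} "
        f"WHERE LENGTH(t.{column}) < LENGTH(s.{column}) "
        f"ORDER BY s.{key}"
    )
    return [row[0] for row in conn.execute(query)]


def table_checksum(conn, table, columns, key):
    """Checksum: SHA-256 over the selected columns, read in key order so the result is repeatable."""
    cols = ", ".join(columns)
    digest = hashlib.sha256()
    for row in conn.execute(f"SELECT {cols} FROM {table} ORDER BY {key}"):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()
```

Comparing `table_checksum` for the source and the target gives a quick yes/no answer on whether the loaded data is identical; when the checksums differ, the NULL and truncation checks help narrow down which rows are affected.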
ETL Testing Challenges
This testing is quite different from conventional testing, and we faced many challenges while performing data warehouse testing.
Here are a few challenges I experienced on my project:
- Incompatible and duplicate data
- Loss of data during the ETL process
- Unavailability of an inclusive test bed
- Testers have no privileges to execute ETL jobs on their own
- Very large volume and complexity of data
- Faults in business processes and procedures
- Trouble acquiring and building test data
- Unstable testing environment
- Missing business flow information
Data is important for businesses to make critical business decisions. ETL testing plays a significant role in validating and ensuring that the business information is exact, consistent, and reliable. It also minimizes the risk of data loss in production.
Hope these tips will help ensure that your ETL process is accurate and that the data warehouse built by it is a competitive advantage for your business.
Ranga - Working as an ETL Tester at TCS with more than 5 years of experience