Getting the most out of your company data is one of the key requirements when embarking on a BI project. However, extracting this data and making sense of it is a different challenge altogether.
Altus Viljoen, product manager at Bytes Business Solutions, looks at extract, transform and load (ETL) development and why it is such a critical component in the quest to get the most out of data.
Enormous amounts of data, disparate legacy systems and non-standardised business terminology are just a few of the challenges organisations must overcome when building a data warehouse.
These obstacles can only be addressed if companies start their BI endeavours with extract, transform and load (ETL) development.
Indeed, ETL is the heart and soul of any business intelligence (BI) project. ETL processes bring together and combine data from multiple, different source systems, enabling all users to work off a single, integrated set of data - a single version of the truth.
The result is that an organisation no longer has to frantically collect data or argue about whether it is correct. Instead, it can use the information as a key process enabler and a competitive weapon.
In these organisations, data warehouses and BI solutions are not merely nice-to-have but a necessity. These systems are no longer standalone and separate from operational processing - they are fully integrated with the enterprise business processes.
An effective BI environment is, therefore, based on integrated data that enables its users to make strategic, tactical and operational decisions that drive their business.
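To make the idea of a single, integrated set of data concrete, the following is a minimal sketch of the three ETL steps, not a production design. It assumes two hypothetical source systems (a CRM and a billing system) that use different field names for the same customer attributes. All table names, field names and sample values are invented for illustration, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Extract: hypothetical rows pulled from two source systems that describe
# the same kind of record using different field names (illustrative only).
CRM_ROWS = [{"cust_id": "C001", "cust_name": " Acme Ltd ", "region": "gauteng"}]
BILLING_ROWS = [{"CustomerNo": "c002", "Name": "Beta CC", "Province": "Western Cape"}]

# Assumed mapping from each source's terminology to one shared vocabulary.
FIELD_MAPS = {
    "crm": {"cust_id": "customer_id", "cust_name": "name", "region": "province"},
    "billing": {"CustomerNo": "customer_id", "Name": "name", "Province": "province"},
}


def transform(row, source):
    """Rename source fields to the shared vocabulary and tidy the values."""
    renamed = {FIELD_MAPS[source][key]: value for key, value in row.items()}
    return {
        "customer_id": renamed["customer_id"].strip().upper(),
        "name": renamed["name"].strip(),
        "province": renamed["province"].strip().title(),
        "source_system": source,  # keep track of where each row came from
    }


def load(conn, rows):
    """Write the standardised rows into one integrated warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dim_customer ("
        "customer_id TEXT PRIMARY KEY, name TEXT, province TEXT, source_system TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO dim_customer "
        "VALUES (:customer_id, :name, :province, :source_system)",
        rows,
    )
    conn.commit()


def run_etl(conn):
    """Run extract, transform and load once; return the number of rows staged."""
    staged = [transform(row, "crm") for row in CRM_ROWS]
    staged += [transform(row, "billing") for row in BILLING_ROWS]
    load(conn, staged)
    return len(staged)


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")
    run_etl(warehouse)
    for record in warehouse.execute("SELECT * FROM dim_customer ORDER BY customer_id"):
        print(record)
```

Even in this toy form, most of the code deals with reconciling terminology and cleaning values rather than moving data, which is where the bulk of real ETL effort tends to go.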
Getting it right the first time
Because ETL components lie at the core of most data warehousing projects, it is not difficult to understand why companies' greatest concerns and frustrations are directly related to these tasks.
Project leaders of data warehousing implementations often ask:
* Will our ETL process complete in the allocated timeframe and meet service level requirements?
* Even if our processes handle current data volumes, will they scale to handle the growing flow of data being generated by our transactional systems and Web sites?
* How can we streamline the ETL processes of our production data warehouse, which is failing to handle new performance requirements?
In fact, many implementers estimate that ETL design takes up between 60% and 80% of an entire BI project.
The reason for this is quite obvious. ETL is so time-consuming because it has the unenviable task of re-integrating an enterprise's data from scratch.
Over the span of many years, organisations have allowed their business processes to disintegrate into dozens or even hundreds of processes, each managed by, for example, a single business unit or department.
Equipped with ETL and modelling tools, BI teams are now expected to swoop in and rescue an organisation from this information chaos.
ETL is instrumental
Although an ETL tool will never transform the database administrator into the local miracle worker, it is one of the most critical instruments in a BI team's toolbox.
The bottom line is that a good ETL tool in the hands of an experienced ETL designer can speed development, minimise the impact of systems changes and new requirements, and mitigate project risk.
However, a weak ETL tool in the hands of an untrained developer can wreak havoc on BI projects and budgets.
ETL: the next phase
BI is becoming more prominent, which in turn requires ETL tools to be more advanced.
Organisations now require ETL vendors to deliver more complete BI "solutions". This means handling additional back-end data management and processing responsibilities such as data profiling, data cleansing and enterprise metadata management utilities.
Also, users want ETL tools to increase throughput and performance to handle exploding volumes of data and shrinking batch windows.
Near real-time data is also becoming a key requirement. ETL tools need to feed data warehouses more quickly with more up-to-date information. This in turn will enable integrated data to be delivered on a timely basis to business users, equipping them to make critical operational decisions without delay.