
Realising an information vision

Franc Trivella, SAS Institute's marketing manager, examines the need for a comprehensive data warehousing methodology, which is becoming vital as more companies realise how important an information vision is to their competitiveness.
Johannesburg, 25 Jun 1999

Data warehousing has grown in popularity as organisations realise what an information vision can do for their competitiveness. This holds across industry sectors and for organisations of all sizes.

One of the challenges that SA in particular faces is the lack of skilled staff who can implement a warehouse, hence the need for a comprehensive warehousing methodology. It's a critical issue, since most business solutions that lead to competitive advantage are built on data warehouse infrastructures.

Most organisations will admit that implementing a data warehouse is a complex undertaking. Such projects require business and technical objectives and resources to be aligned, and they are often costly. A methodology offers an organisation the benefit of previous experiences and a systematic approach to a complex task.

Data warehousing projects usually require an approach that differs from conventional systems development, largely because of the exploratory and evolutionary business requirements that the data warehouse has to meet. What is required is an iterative, business-oriented methodology, developed specifically for data warehousing and based on experience of past data warehouse projects.

Essential features for a warehousing methodology are based on the following six areas:

Iterative development

Because data warehouses have to deliver information in today's shifting economic climate, it is difficult to define every characteristic of the warehouse in advance. Data queries are often ad hoc and exploratory. It is essential that the warehousing methodology remains flexible enough to accommodate unexpected changes. With an iterative approach, project activities can be conducted in a cyclical fashion and phases can run in parallel. It also allows a project to be split into smaller pieces, with the large data warehouse emerging as the end product of those pieces.

Scalable approach to warehouse architecture

You have to identify the key components that will affect the scalability of the data warehousing solution. This will guide you in planning and constructing scalable warehouse solutions. You can consider independent data marts, which means that you don't have to commit to a large-scale integrated warehouse environment immediately.

This approach often delivers higher business value than a big bang approach. Some companies have attempted to build a monolithic data warehouse and have been disappointed. Such projects are known to consume vast amounts of time and resources while delivering little business benefit. The motto is to start small, identify problems early on and then grow step by step.

The best of both worlds is a combined or evolutionary approach to an integrated data warehouse environment. If you take this approach, you will avoid the problems inherent in a monolithic warehouse while still reaping the benefits of a smaller data mart project. In this case an enterprise subject model can be created, which outlines the data elements across the organisation and provides a plan for how data will be integrated in future iterations.

A blueprint is created at a high level. This way you can reduce the up-front investment in an enterprise warehouse and eliminate redundant extracts of data, and central IT resources can be used to support the warehouse infrastructure.

Business focus and involvement

A key success factor is keeping your data warehousing solution focused on business requirements at all times, and ensuring that the business community is involved. Business users have roles to play in project sponsorship, definition of business requirements, testing and validation of data, ongoing feedback and workshop participation.

Effective use of workshops

A workshop is an efficient means of gathering high-level business requirements and feedback, through surveying, interviewing, prototyping and modelling. It is especially effective where input from multiple business units is required. In the early phases, workshops can establish objectives, scope and priorities. They are useful in drawing in participants from different business areas, and in reducing the time needed to gather information from a wide group of people.

Emphasis on quality

Quality control is integral to any data warehousing project. Protocols for ensuring data integrity should be drawn up and refined while the warehouse is being designed, populated and maintained.
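As an illustration only, a data integrity protocol of this kind could be written down as a set of named rules that every incoming row must pass. The sketch below assumes a hypothetical extract with `customer_id` and `amount` fields; the rule names, field names and sample rows are invented for the example and do not come from any particular product or methodology.

```python
from dataclasses import dataclass
from typing import Callable

# A row is a plain dictionary, as it might arrive from a source-system extract.
Row = dict


@dataclass
class Rule:
    name: str
    check: Callable[[Row], bool]  # returns True when the row passes


def validate(rows, rules):
    """Return (row_index, rule_name) for every check that a row fails."""
    failures = []
    for index, row in enumerate(rows):
        for rule in rules:
            if not rule.check(row):
                failures.append((index, rule.name))
    return failures


# Hypothetical rules; a real project would agree these with business users.
RULES = [
    Rule("customer_id present", lambda r: bool(r.get("customer_id"))),
    Rule(
        "amount non-negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]

# Hypothetical sample data: the second and third rows each break one rule.
SAMPLE_ROWS = [
    {"customer_id": "C001", "amount": 120.0},
    {"customer_id": "", "amount": 35.5},
    {"customer_id": "C003", "amount": -10.0},
]

failures = validate(SAMPLE_ROWS, RULES)
for index, rule_name in failures:
    print(f"row {index} failed: {rule_name}")
```

Keeping the rules in a single list means they can be reviewed and extended as the warehouse moves from design through population to maintenance, without changing the checking code.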

Clearly defined project responsibilities are necessary to ensure that the project follows the fastest path of development, with input from all the right personnel.

There are seven phases that each data warehousing project should follow: assessment, requirements, design, construction, deployment, review, and maintenance and administration.

The assessment phase is crucial for ascertaining whether the organisation is ready to undertake a data warehousing project and what the scope should be.

During the requirements phase the high-level needs of the entire warehouse environment are addressed. This helps to avoid the paralysis that many waterfall approaches are known to induce.

The design phase concentrates on building up one project at a time. This phase includes the detailed analysis and requirements for the build, detailed logical designs for the model, and specifications for extraction, transformation and loading.
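To make the idea of an extraction, transformation and loading specification concrete, here is a minimal sketch of the three steps. It assumes a hypothetical CSV extract of sales orders and uses Python's built-in `sqlite3` module as a stand-in for the warehouse database; the table name `sales_fact` and all column names are invented for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source extract, with inconsistent spacing and capitalisation.
SOURCE_CSV = """order_id,region,amount
1, north ,100.50
2,South,20
3,north,79.50
"""


def extract(text):
    """Extraction: read the raw rows from the source extract."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: convert types and standardise the region values."""
    return [
        (int(r["order_id"]), r["region"].strip().lower(), float(r["amount"]))
        for r in rows
    ]


def load(conn, rows):
    """Loading: write the cleaned rows into the (stand-in) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
load(conn, transform(extract(SOURCE_CSV)))

# A simple aggregate of the kind an end-user report might request.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales_fact GROUP BY region")
)
print(totals)
```

Separating the three steps into their own functions mirrors the way a design specification would describe them, and lets each be tested and changed independently in later iterations.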

During the construction phase implementation teams will code and populate the warehouse with data and develop the applications for end-use analysis and reporting. Business users and IT managers will test the data warehouse rigorously.

The deployment phase is essentially the roll-out of the data warehouse and its applications to end-users and IT staff.

During this phase the users should be well trained, and applications and data should be readily accessible. The faster business users gain some benefit from the data warehouse the more likely they will be to support future developments and enhancement efforts.

In the review phase there should be three consecutive reviews, held at intervals of three to six months.

Finally, in the maintenance and administration phase the data warehouse should be refreshed constantly to prevent lethargy setting in. This includes adding new users, delivering user training, and adding queries and aggregate data. Continual data management should be undertaken, and the warehouse should be monitored in terms of capacity requirements, data updates and application availability.

  • Next week: We'll examine some data warehousing case studies.
