Lawrence Corr of DecisionOne Consulting visits SA again this September to run data warehouse design training, hosted by Sagent Technology. In this release, he emphasises the importance of requirements gathering and recommends an interview-driven approach.
One particular phrase I read about data warehouse design last year has stuck in my mind. It was: "dimensional modelling really only fits the data mart environment, where requirements for processing are known before the infrastructure is built".
It comes from the conclusion of an article entitled, not surprisingly, 'The Problem with Dimensional Modelling'. I strongly disagreed with the article while agreeing with the statement above, because I simply didn't see it as a problem, as I will explain later. What I did take exception to was the implied opposite: that dimensional modelling cannot be used to build a data warehouse because the requirements for a data warehouse are unknown in advance.
This line of thinking follows from the early comparison: "Traditional projects start with requirements and end with data. Data warehousing projects start with data and end with requirements." From this comes the view of data warehousing as a technical endeavour to corral all of an organisation's operational data into a new database, which will be able to support any analytical requirement once complete. This has led to the "Build it and they will come" approach to data warehousing. As a data-driven, IT-led solution it truly conjures up the title of the 1989 Kevin Costner movie "Field of Dreams" (from which it paraphrases a line) and has been the greatest source of data warehousing failures.
It is very true that business user analytical requirements can never be fully articulated until a data warehouse exists and we simply can't know all the ad hoc requests the users will make in advance. We must not, however, use this as an excuse for not seeking out the requirements that do exist now or we run the risk of satisfying no requirements.
Unfortunately, talking to the business users to gather the requirements is often way outside the comfort zone of many of us IT people. We are much happier poring over data models, examining data, talking to DBAs and even learning new modelling techniques. Hence the "Can't meet the users (they're too busy), won't meet the users (don't want to)" attitude I have seen at several organisations. Regardless of the modelling technique used, this leads to inappropriate data warehouse designs that do not reflect how users view the business, and to poorly prioritised implementations that do not provide answers to the most pressing business questions.
Why gather requirements?
Let us be clear firstly about why data warehouses are built. For one reason alone: to answer business questions. Only a small percentage of those questions are known in advance. To be able to answer those and the countless others that haven't yet been asked, the data warehouse must be designed to reflect the way business people think about their business. We need to talk to these people to gain this insight into the business processes as these are what the data warehouse will allow the users to measure. Once we have identified and prioritised the significant business processes worth measuring we have set the scope for the data warehouse or an iteration of its development. We must approach the relevant business people to:
* Discover how business processes work in detail - identify the dimensions;
* Determine how they are measured - identify the relevant facts;
* Identify the level of detail (granularity) at which these measurements exist;
* Gain an understanding of how this information will be used; and
* Learn what the users' success criteria are and how they might be measured.
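As a rough illustration of how the answers to these questions might be written up after an interview, the sketch below records one business process with its dimensions, facts and grain. It is only a sketch: the class name, the "retail sales" process and every dimension, measure and success criterion shown are hypothetical examples, not findings from a real interview.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessRequirement:
    """Interview findings for one business process (hypothetical structure)."""
    process: str
    grain: str  # level of detail at which the measurements exist
    dimensions: list[str] = field(default_factory=list)  # how users describe the process
    facts: list[str] = field(default_factory=list)       # how users measure the process
    success_criteria: list[str] = field(default_factory=list)

    def is_complete(self) -> bool:
        """True only when grain, dimensions and facts have all been identified."""
        return bool(self.grain and self.dimensions and self.facts)


# Hypothetical example values, for illustration only.
sales = ProcessRequirement(
    process="retail sales",
    grain="one row per product per store per day",
    dimensions=["date", "product", "store"],
    facts=["units_sold", "sales_value"],
    success_criteria=["weekly sales figures available every Monday morning"],
)
```

A record that fails `is_complete()` is a prompt to go back to the business users rather than to guess at the missing grain, dimensions or facts.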
During requirements gathering we are primarily in listening mode, but it is also our chance to discuss the project goals and manage expectations as well as learn about the business and identify analytical requirements. Managing expectations is why it is also important to meet with other IT staff to learn about the data we are identifying. We need a data audit or, put bluntly, a technical reality check to see if it is possible to provide this information.
How to gather requirements
Once you have agreed that you are going to have to actively gather requirements rather than just take part in a data re-modelling exercise, you have to decide how you are going to do it. Here you have two choices - interviews or facilitated workshops. An interview is with an individual or a very small group; a workshop can usually be run with up to 12 participants. Interviews have the advantage of taking up a limited amount of the interviewees' time, usually an hour to an hour and a half, and are far easier to schedule. Workshops may appeal as they can reduce the elapsed time for requirements gathering, ie "Let's do it all in one hit". I prefer interviews as they encourage greater participation and generate far more detail. It is very difficult to tease out all the detail you need in a large facilitated session and keep the interest of all parties present. I also think it is easier to have a learning conversation with one or two people. Good facilitation is a higher art form, in which you must learn from and control a group of people with mixed interests and agendas at the same time. For most of us it is better to use facilitated sessions later in the requirements gathering process to feed back our understanding of the requirements and to agree priorities and next actions.
There are two particularly good sources of help on interview technique for data warehouse designers:
* Chapter 15 'Interviewing' of 'Data Warehouse Design Solutions' by Christopher Adamson and Mike Venerable, which encourages a technique known as active listening.
* Chapter 4 'Collecting the Requirements' of 'The Data Warehouse Lifecycle Toolkit' by Ralph Kimball et al. The CD that accompanies this book contains useful templates for prepared question crib sheets, interview write-ups and requirements finding documents.
Going back to the quote at the beginning, why don't I strongly disagree with it? I am prepared to let it ride that dimensional modelling only fits data mart design because I believe that data warehouses are best implemented one subject area at a time, as a set of data marts within a conformed dimensional framework. I agree that dimensional modelling works best when requirements are known because I believe data warehouses can only be successful when business requirements have been exposed.
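To make the idea of a conformed dimensional framework a little more concrete, the sketch below lists the dimensions that a set of data marts have in common. It is a deliberate simplification: it treats a dimension as conformed when the same name appears in every mart, whereas in practice the shared dimensions must also carry the same keys and meaning. The mart names and dimension names are hypothetical.

```python
def shared_dimensions(marts: dict[str, set[str]]) -> set[str]:
    """Return the dimension names that appear in every data mart."""
    dimension_sets = list(marts.values())
    if not dimension_sets:
        return set()
    return set.intersection(*dimension_sets)


# Hypothetical subject areas, each delivered as its own data mart.
marts = {
    "sales": {"date", "product", "store"},
    "inventory": {"date", "product", "warehouse"},
}

print(sorted(shared_dimensions(marts)))  # ['date', 'product']
```

Here "date" and "product" are the candidates for conformed dimensions, the ones that let users compare sales with inventory as each new subject area is added.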
Lawrence Corr (lcorr@decisionone.co.uk) of DecisionOne Consulting is a global authority on data warehousing and specialises in dimensional design. He has taught data warehousing classes in Europe and SA for Kimball University and reviews data warehouse designs for clients worldwide.