Data warehousing a BI essential
In 1981, when someone mentioned data, people weren't too sure what they were talking about.
Speaking at an ITWeb event in Bryanston yesterday, data warehousing thought leader and teacher, Ralph Kimball, noted that while we may be more accustomed to talking about data today, the essential characteristics of data warehouse systems have not changed much in the last few decades. The difference today is that we are becoming overrun with data assets, he said.
Data assets have undergone a remarkable explosion in the last few years, stressed Kimball, adding that businesses should be marshalling their data assets, which involves integrating hundreds of internal and external sources. "We have to get this under control. There is now no physical limit of how much data is out there. If you have hundreds of sources but they are not compatible, it is chaos."
The data warehouse mission is remarkably durable, said Kimball. "We should focus on integration and marshalling assets, before presenting them to decision-makers."
According to Kimball, the data warehouse must be a trustworthy and reliable way of communicating the data to end-users. "Success will be measured by whether the users show up. This will only happen if you deliver in an effective way."
Business intelligence (BI) signals the shift of responsibility from data warehouse architects to the business itself, he said, adding that the business has a responsibility to be a thoughtful and sophisticated observer of business development. According to Kimball, in order for this to be successful, it is essential for business people to have a clear understanding of the roles and functions of IT. "If your IT is over in another building, you are doing things wrong. IT has to live in the end-user environment."
BI focuses on how the data is brought into the business and how it is deployed, Kimball said. In order for data warehouse architects to best support decision-making, they must understand the cycle of BI, while also understanding the user, he went on to say.
A happy data warehouse career rests on three things, Kimball suggested: an interest in the business, staying up to date with technology and, in his words, "you actually have to like end-users".
"You have to be fundamentally forgiving when they don't read the manual or when they don't like your system," he said.
"When looking at design, you must count the clicks. One click to get to the result you want is fantastic. Two clicks is pretty good. Three clicks is acceptable, but any more than three clicks means the system should be improved. Every click is a distraction from the real task," Kimball said.
He advised engineers not to become too obsessed with hardware. "Hardware is like tissue paper; you throw it away when you are done with it. It is software, techniques and data structures that are durable," he noted.
And solutions should be built to capitalise on user expertise, while minimising costs, Kimball continued. "Maybe the biggest thing we need to do is reduce the cost of lost opportunities. It is hard to quantify the value of a decision not made."
Star schemas and simplicity
Kimball highlighted simplicity and performance as two non-negotiable design essentials, adding that both should be measured by the end-user. "This is challenging because we as data architects are genetically predisposed to make things complicated," he joked.
To tackle a complex problem, it is best to decompose it into pieces and to choose a practical approach, while always keeping the original design in mind, he said.
Businesses should identify key performance indicators from a single source in order to break the data down into manageable chunks, he said, adding that consolidated sources are often tough to manage. Only once you have this single source can you move down the various layers of BI decision-making tools, he continued, describing this process as the root of the whole project.
He called on businesses to view data in dimensional models, commonly known as star schemas. "We should bring the data up to the business user in a dimensional way," he said, describing star schemas as one of the simplest relational models.
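For readers unfamiliar with the term, a star schema can be sketched as one central fact table of measurements surrounded by descriptive dimension tables. The example below is a minimal illustration only, not something shown at the event: the table names (fact_sales, dim_date, dim_product), the columns and the sample rows are all hypothetical, and it uses Python's built-in sqlite3 module with an in-memory database.

```python
import sqlite3


def build_star_schema():
    """Create a tiny, hypothetical star schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension tables hold descriptive attributes.
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            year     INTEGER,
            month    INTEGER
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            name        TEXT,
            category    TEXT
        );
        -- The fact table holds measurements plus one key per dimension.
        CREATE TABLE fact_sales (
            date_key    INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            quantity    INTEGER,
            amount      REAL
        );
        """
    )
    # Illustrative sample rows only.
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20240101, 2024, 1), (20240201, 2024, 2)],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Widget", "Hardware"), (2, "Manual", "Books")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [
            (20240101, 1, 3, 30.0),
            (20240101, 2, 1, 15.0),
            (20240201, 1, 2, 20.0),
        ],
    )
    return conn


def sales_by_category(conn):
    """Total sales amount per product category: one join from fact to dimension."""
    rows = conn.execute(
        """
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(sales_by_category(build_star_schema()))
```

In this sketch, every business question becomes a join from the fact table to one or more dimensions followed by a grouping, which is a large part of why the model is considered simple for end-users to query.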
He also encouraged data warehouse engineers to integrate all data models with business processes, using something he terms 'the business matrix'. He defined this as a vehicle of executive communication and planning. The challenge of integration is to make the data relevant across various sources, Kimball continued.
Dimensional design is about conforming dimensions, completing processes and designing consolidated fact tables, among other things, he said. Schemas are the end result of a complex background process but they are also the thing that designers can use to get their jobs done, he pointed out. "It is about drilling across and integrating, rather than just drilling down."
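The idea of drilling across can be illustrated with a small sketch, assuming two separate business processes (sales and returns) that share one conformed product dimension. Each fact set is first summarised to the same dimension attribute, and only then are the results lined up side by side. All names and figures here (product_dim, sales_facts, returns_facts, drill_across) are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Conformed dimension shared by both processes: product key -> category.
product_dim = {1: "Hardware", 2: "Books", 3: "Hardware"}

# Two separate fact sets, each recorded as (product_key, amount).
sales_facts = [(1, 30.0), (2, 15.0), (3, 20.0)]
returns_facts = [(1, 5.0), (3, 2.5)]


def summarise(facts, dim):
    """Aggregate one fact set to the category level of the shared dimension."""
    totals = defaultdict(float)
    for product_key, amount in facts:
        totals[dim[product_key]] += amount
    return dict(totals)


def drill_across(sales, returns, dim):
    """Summarise each process separately, then merge on the conformed attribute."""
    sales_by_category = summarise(sales, dim)
    returns_by_category = summarise(returns, dim)
    categories = sorted(set(sales_by_category) | set(returns_by_category))
    return [
        (c, sales_by_category.get(c, 0.0), returns_by_category.get(c, 0.0))
        for c in categories
    ]


if __name__ == "__main__":
    for category, sold, returned in drill_across(sales_facts, returns_facts, product_dim):
        print(category, sold, returned)
```

The merge only works because both processes describe products with the same category labels; without a conformed dimension, the two summaries would have no common rows to line up.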