Squaring up to cubes

By Charl Barnard, GM of business intelligence at Knowledge Integration Dynamics

Johannesburg, 09 Jun 2004

Cube analysis delivers the simplest form of analysis, allowing anybody to analyse data. It is used most often by power users and managers who have a deep interest in understanding the root causes underlying the data in reports, but who lack skills for full ad hoc investigation of databases.

Cube analysis lets people flip through a series of report views using the now-standard OLAP features: page-by, pivot, sort, filter, and drill up/down. These features, first introduced in the early 1990s, allow users to slice and dice a cube of data, or analysis cube, using simple mouse-clicks.

The term "cube" refers to a subset of highly inter-related data that is pre-organised to allow users to combine any attributes in the cube (stores, products, customers, suppliers) with any metrics in the cube (sales, profit, units, age) to create various two-dimensional views, or slices, that can be displayed on a computer screen.
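As a rough illustration of how attributes and metrics combine into a two-dimensional slice, the following minimal Python sketch builds such views from a handful of rows. The row data and the slice_view helper are hypothetical and exist only for this example; a real cube engine would pre-organise and index the data rather than scan it.

```python
from collections import defaultdict

# Hypothetical fact rows: attributes (store, product) plus metrics (sales, units).
ROWS = [
    {"store": "North", "product": "Tea", "sales": 120.0, "units": 12},
    {"store": "North", "product": "Coffee", "sales": 200.0, "units": 10},
    {"store": "South", "product": "Tea", "sales": 80.0, "units": 8},
    {"store": "South", "product": "Coffee", "sales": 150.0, "units": 6},
    {"store": "North", "product": "Tea", "sales": 30.0, "units": 3},
]


def slice_view(rows, row_attr, col_attr, metric):
    """Return a two-dimensional view: {row value: {column value: summed metric}}."""
    view = defaultdict(lambda: defaultdict(float))
    for row in rows:
        view[row[row_attr]][row[col_attr]] += row[metric]
    return {r: dict(cols) for r, cols in view.items()}


# One slice: stores down the side, products across the top, sales in the cells.
by_store = slice_view(ROWS, "store", "product", "sales")

# Pivoting swaps the axes; here a different metric is chosen as well.
pivoted = slice_view(ROWS, "product", "store", "units")

print(by_store)
print(pivoted)
```

Any attribute can be placed on either axis and paired with any metric, which is the property the term "cube" is meant to capture.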

To implement cube analysis functionality, most OLAP vendors use custom-made proprietary cube databases. This technique is known as multidimensional OLAP or MOLAP. Unfortunately, the cube databases have very small data capacities - less than 0.01% of real relational databases - because they lack the technical underpinnings of real relational databases. Nonetheless, this capacity limitation was not initially perceived as a problem because most early departmental BI applications only needed between 10MB and 100MB of detailed and summary data. Problems due to limited cube data capacities began occurring when companies found they needed to deploy hundreds of overlapping cube databases to cover all the combinations of data subsets, summarisation levels, and security privileges for different user groups across multiple applications.
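The growth in cube numbers is simple multiplication. The sketch below uses invented figures (six data subsets, four summarisation levels, five user groups, three applications) purely to show the arithmetic; none of these numbers comes from a real deployment.

```python
from itertools import product

# All values below are hypothetical, chosen only to illustrate the arithmetic.
data_subsets = ["retail", "wholesale", "online", "export", "returns", "promotions"]
summary_levels = ["daily", "weekly", "monthly", "quarterly"]
user_groups = ["executives", "finance", "sales", "marketing", "operations"]
applications = 3

# One physical cube per combination of subset, summarisation level and security group.
cubes_per_application = list(product(data_subsets, summary_levels, user_groups))
total_cubes = len(cubes_per_application) * applications

print(len(cubes_per_application))  # 6 * 4 * 5 = 120
print(total_cubes)                 # 120 * 3 = 360
```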

These ever-growing collections of cubes have become known as "cube farms". Cube farms impose an immense burden on the IT groups that have to generate the cubes, pre-calculate the summarisations, distribute them to users, and retire them when their data become outdated.

By contrast, modelling the relational database as a "virtual multidimensional cube", a technique known as relational OLAP or ROLAP, gives users the same OLAP functionality of page-by, pivot, sort, filter and drill, but does so against the entire relational database. With ROLAP, the data is always the very latest data - there is no limitation on what data can be analysed, and all users and security work uniformly against the database. The trade-off that early ROLAP users paid for the vastly greater range of data access was somewhat slower response times and the potential for overwhelming novice users by allowing them to analyse the entire database, rather than a simple subset.
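One way to picture the ROLAP approach is as a thin layer that translates each OLAP request into SQL against the live tables. The sketch below uses Python's built-in sqlite3 module with an invented sales table; the table, its columns and the rolap_view helper are assumptions made for illustration, not any vendor's actual implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, product TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("North", "Tea", "Gauteng", 120.0),
        ("North", "Coffee", "Gauteng", 200.0),
        ("North", "Tea", "Gauteng", 30.0),
        ("East", "Tea", "Gauteng", 50.0),
        ("South", "Tea", "Cape", 80.0),
        ("South", "Coffee", "Cape", 150.0),
    ],
)

ALLOWED = {"store", "product", "region"}  # attributes exposed by the virtual cube


def rolap_view(conn, group_by, filters=None):
    """Translate an OLAP request (attributes plus filters) into a GROUP BY query."""
    filters = filters or {}
    for col in list(group_by) + list(filters):
        if col not in ALLOWED:
            raise ValueError(f"unknown attribute: {col}")
    cols = ", ".join(group_by)
    sql = f"SELECT {cols}, SUM(amount) FROM sales"
    if filters:
        sql += " WHERE " + " AND ".join(f"{c} = ?" for c in filters)
    sql += f" GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql, list(filters.values())).fetchall()


# Summary view by region.
summary = rolap_view(conn, ["region"])

# Drill down from one region to its stores.
drill = rolap_view(conn, ["region", "store"], {"region": "Gauteng"})

# A new transaction arrives; the next request sees it without any cube rebuild.
conn.execute("INSERT INTO sales VALUES ('South', 'Tea', 'Cape', 20.0)")
refreshed = rolap_view(conn, ["region"])

print(summary, drill, refreshed)
```

Because every view is computed from the base table on request, there is no separate cube to regenerate, which is also why early ROLAP response times tended to be slower than those of pre-built cubes.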

Enter intelligent cubes: these provide the same OLAP functionality that small-scale MOLAP cubes provide, but with significant enhancements available only with a ROLAP underlying architecture:

1. Speed-of-thought report analysis and manipulations - Analysis of cubes with speed-of-thought performance and powerful slice-and-dice capabilities.

Intelligent cube technology lets users perform report manipulations on a multi-dimensional cache of data rather than a limited, proprietary cube database. These caches are instantly populated by the simple action of "running a cube" and remain in shared cache for as long as the data is valid and people are using it. This eliminates the entire IT overhead imposed by managing cube farms.
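The caching idea can be sketched in a few lines of Python. The SharedCubeCache class and the fake_warehouse function below are hypothetical stand-ins: the first request for a cube queries the warehouse, and later manipulations by any user work on the cached result.

```python
class SharedCubeCache:
    """Keeps the result of each cube in memory the first time it is run."""

    def __init__(self, load_from_warehouse):
        self._load = load_from_warehouse  # hypothetical callable that queries the warehouse
        self._cubes = {}
        self.warehouse_queries = 0

    def run(self, cube_name):
        # The first request populates the cache; later requests reuse it.
        if cube_name not in self._cubes:
            self._cubes[cube_name] = self._load(cube_name)
            self.warehouse_queries += 1
        return self._cubes[cube_name]


def fake_warehouse(cube_name):
    # Stand-in for a real query; returns (store, product, sales) rows.
    return [("North", "Tea", 150.0), ("South", "Tea", 80.0), ("North", "Coffee", 200.0)]


cache = SharedCubeCache(fake_warehouse)

# Two users manipulate the same cube: one sorts by sales, one filters on product.
sorted_view = sorted(cache.run("sales_by_store"), key=lambda r: r[2], reverse=True)
tea_only = [r for r in cache.run("sales_by_store") if r[1] == "Tea"]

print(cache.warehouse_queries)  # the warehouse was queried once for both views
```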

2. Ad hoc drilling from summary data to transactional details - Seamless drilling capabilities outside the cube domain to anywhere in the data warehouse.

Executives and managers can quickly assess summary-level KPIs on scorecards or other reports before drilling to analyse detailed and even transaction-level information. This drilling from KPIs to detailed level information generates the most valuable insight - for example, finding the root problem of a specific product's declining sales.

3. Cube sharing with personalised views and security - Transparent and secure sharing across the organisation of cubes with personalised views.

IT administrators struggle with maintaining and managing multiple cubes with duplicated data across user desktops, Web servers, file servers and other locations. The architectures of diverse BI software spread data across multiple systems within the organisation. Consequently, users are at risk of analysing outdated data on their desktops and using different analytical definitions of the same business terms due to decentralisation of information across physical cubes of data. Thus, cube-based BI users cannot reliably share insights because dissimilar numbers can show up on the same report definitions. Intelligent cubes easily share data with centralised metadata and server architecture, providing personalised, secure views.

4. Automatic creation and synchronisation of cubes - Creating cubes on the fly with automatic refresh of data for real-time analysis.

IT administrators spend most of their time managing and updating cubes based on business users' requirements. In cube-based environments, administrators maintain multiple cubes in different locations, such as user desktops, Web servers and file servers. Cube data must be duplicated across these various locations and can quickly become a nightmare to manage and keep synchronised with the latest data. By contrast, creating and populating automatically managed intelligent cubes in the centralised metadata for use across the entire organisation means cube data is always up-to-date and synchronised - without involving administrators.

Placing analytic flexibility where it belongs

While traditional MOLAP is often sufficient for limited departmental analysis domains, it fails to provide the power and flexibility required for true speed-of-thought analysis. Analytic flexibility must be placed in the hands of business users - this frees up the IT department from creating and maintaining the cube data. Intelligent cubes let users quickly add or remove report objects, modify report-filtering criteria, and create new metrics - all with easy drag-and-drop actions from any Web browser. They also allow users to perform report manipulations on cached data without accessing the database.

This analysis environment is simple enough for novice users, yet powerful enough for power users wishing to perform transaction-level analysis. In turn, IT administrators' workloads are minimised by eliminating the need to manage multiple cubes for different types of users, with varying security privileges and analysis environments. Intelligent cubes can be built once and shared across the organisation using centralised metadata.

Intelligent cubes do not inherit the disadvantages of pure MOLAP, cube-based BI tools, which offer limited data scalability, limited scope of analysis, and poor drill-through to transaction-level data. With intelligent cubes, companies can provide users with quick performance and full analytical power, while retaining the ability to report against terabytes of data at transaction-level detail.

Cube-based architecture forces IT administrators to pre-calculate and pre-build the cubes prior to end-user access, resulting in cube storage in multiple, unlinked locations, requiring duplication of data for each location. Intelligent cubes are created automatically without IT assistance when users run a report - and are stored in the unified metadata on the centralised server. This saves IT administrators' time and hardware costs. Organisations can consolidate separate cube, reporting and analysis systems into one integrated, scalable platform, and can distribute report and intelligent cube creation to power users, eliminating unnecessary maintenance work.

Administrators no longer have to spend substantial amounts of time manually creating, expiring and refreshing cubes - intelligent cubes are automatically expired and refreshed on event- and time-based schedules.
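To make the two kinds of schedule concrete, here is a minimal Python sketch of a cube store that reloads a cube either when it is older than a set age (time-based) or when a named event fires (event-based). The ScheduledCubes class, the event name and the one-hour limit are all assumptions for illustration; the product's own scheduling mechanism is not described here.

```python
import time


class ScheduledCubes:
    """Reloads a cube when it is too old or when a subscribed event has fired."""

    def __init__(self, loader, max_age_seconds, clock=time.monotonic):
        self._loader = loader
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries = {}        # cube name -> (time loaded, data)
        self._subscriptions = {}  # event name -> set of cube names

    def subscribe(self, cube_name, event):
        self._subscriptions.setdefault(event, set()).add(cube_name)

    def get(self, cube_name):
        now = self._clock()
        entry = self._entries.get(cube_name)
        # Time-based expiry: reload if the cube is missing or older than max_age.
        if entry is None or now - entry[0] >= self._max_age:
            entry = (now, self._loader(cube_name))
            self._entries[cube_name] = entry
        return entry[1]

    def on_event(self, event):
        # Event-based expiry: drop every cube subscribed to this event.
        for cube_name in self._subscriptions.get(event, ()):
            self._entries.pop(cube_name, None)


# A controllable clock and a counting loader make the behaviour easy to follow.
fake_now = [0.0]
loads = []


def loader(cube_name):
    loads.append(cube_name)
    return f"{cube_name} v{len(loads)}"


cubes = ScheduledCubes(loader, max_age_seconds=3600, clock=lambda: fake_now[0])
cubes.subscribe("sales", "nightly_load_finished")

first = cubes.get("sales")    # not cached yet, so it is loaded
second = cubes.get("sales")   # still fresh, served from the cache
fake_now[0] = 4000.0
third = cubes.get("sales")    # older than one hour, so it is reloaded
cubes.on_event("nightly_load_finished")
fourth = cubes.get("sales")   # expired by the event, so it is reloaded again
```

In this sketch a reload happens lazily on the next request after expiry; a scheduler that refreshes eagerly in the background would be an equally valid design.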
