Object technology and the data warehouse are butting heads. MERVYN MOOI, data warehousing consultant at Knowledge Integration Dynamics, outlines the problem and suggests solutions.
Two of today's hottest topics - data warehousing for business intelligence (MIS/EIS) and object technology - are at odds with each other, and the conflict could have negative implications for both.
It's instructive to revisit the original premises behind objects and data warehousing. Objects initially offered an easier, quicker, more productive environment in which to develop applications. Data warehousing offered a high-speed, enterprise-wide mechanism for accessing corporate data in a manner that allowed management to make meaningful, informed decisions. At the heart of the problem is a fundamental design and application clash between objects and data warehousing. On the one hand, objects are structured in such a way that their data is inherently inaccessible to data warehouses and other business intelligence applications.
Object data, along with its associated business logic, processes, procedures, rules and access mechanism, is encapsulated and hidden from the application, requiring special messaging to reveal its content. In contrast, a data warehouse holds structured data, typically in a relational, denormalised format, and is designed to collate and expose data for analysis. The data to be stored in a warehouse must therefore be readily available.
The key problem is that the data within objects is not readily accessible to data warehouses and other business intelligence applications.
But this object data must be made accessible, as object technology is becoming pervasive. The content of an object may include both its data part and its metadata (definitions, methods, rules, states), depending on what is required for analysis.
One solution that has been identified is to break out, or dismantle, and store the object data separately from its rules or methods in the data warehouse and then to make it available for analysis.
Breaking up an object may be complex, but it can be done using the same object definition standard, access mechanism and language used to build the object. The "parts" of each object instance, together with its unique identification, can then be time-stamped and stored in a structured manner in a warehouse.
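As a rough illustration of the "dismantle" idea, the sketch below uses Python and an in-memory SQLite database to stand in for the warehouse. The `Customer` class, the `object_parts` table and the `dismantle` function are hypothetical names chosen for this example, not part of any product or standard; the point is only that the data part of an object is separated from its methods and written out as time-stamped rows keyed by the object's identity.

```python
import sqlite3
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class Customer:
    """Hypothetical business object: data fields plus an encapsulated rule."""
    customer_id: int
    name: str
    credit_limit: float

    def can_order(self, amount: float) -> bool:
        # Business rule: stays with the object and is not stored as data.
        return amount <= self.credit_limit


def create_store(conn: sqlite3.Connection) -> None:
    """Create a simple structured table to hold the dismantled parts."""
    conn.execute(
        """CREATE TABLE object_parts (
               object_type TEXT, object_id TEXT,
               attribute TEXT, value TEXT, captured_at TEXT)"""
    )


def dismantle(conn, obj, object_id, captured_at=None) -> int:
    """Store each data field of obj as one time-stamped row; return row count."""
    captured_at = captured_at or datetime.now(timezone.utc).isoformat()
    rows = [
        (type(obj).__name__, str(object_id), attr, str(value), captured_at)
        for attr, value in asdict(obj).items()  # fields only, never methods
    ]
    conn.executemany("INSERT INTO object_parts VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)


# Usage: capture the same object at two points in time, then profile one field.
conn = sqlite3.connect(":memory:")
create_store(conn)
customer = Customer(42, "Acme", 1000.0)
dismantle(conn, customer, customer.customer_id, "1999-01-01T00:00:00")
customer.credit_limit = 2500.0
dismantle(conn, customer, customer.customer_id, "1999-02-01T00:00:00")
history = conn.execute(
    "SELECT captured_at, value FROM object_parts "
    "WHERE object_id = '42' AND attribute = 'credit_limit' "
    "ORDER BY captured_at"
).fetchall()
```

Once the parts sit in ordinary rows like this, a standard query tool can follow how an object's state changed over time without sending messages to the object itself.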
Profiling objects, or analysing their behaviour (states, changes, persistence, accesses or uses) and data content, then becomes a reality. This applies to any object, whether it is a presentation, service or business object.
The problem becomes more complex when unstructured data is to be considered for warehousing. Images, for example, take up a lot of storage. Unless the traditional database management systems used for warehouses support very advanced, very fast data compression and decompression algorithms, it may be impractical (and expensive) to store such data within a warehouse, for example as BLOBs (binary large objects).
One must remember that a typical query against structured data in a data warehouse may traverse thousands or even millions of rows of data, only to bring back a series of small result sets to reveal a profile or graphic, for example.
If images, which are stored within or referenced by objects, are to be traversed in the same manner (for example, to see the ageing of a person), then it may be necessary to sample only some frames (or instances of the object) rather than every one. If only the final result of a changing unstructured set of object data is to be revealed, then it is only necessary to store its reference, such as its URL or pointer.
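The two ideas in the paragraph above, sampling frames and storing references rather than the images themselves, can be sketched as follows. This is a minimal illustration in Python with SQLite, and everything named here (`sample_frames`, `image_refs`, the example.com URLs) is an assumption made for the example rather than a description of any particular warehouse product.

```python
import sqlite3


def sample_frames(frames, step):
    """Keep every step-th frame, always including the most recent one."""
    if step < 1:
        raise ValueError("step must be at least 1")
    sampled = frames[::step]
    if frames and sampled[-1] != frames[-1]:
        sampled.append(frames[-1])
    return sampled


def create_ref_store(conn):
    """Table of image references: the image bytes stay outside the warehouse."""
    conn.execute(
        "CREATE TABLE image_refs (object_id TEXT, captured_at TEXT, url TEXT)"
    )


def store_references(conn, object_id, frames):
    """Store (captured_at, url) pairs for one object; no BLOBs are written."""
    conn.executemany(
        "INSERT INTO image_refs VALUES (?, ?, ?)",
        [(str(object_id), captured_at, url) for captured_at, url in frames],
    )


def latest_reference(conn, object_id):
    """Return the URL of the most recent stored frame, or None if none exist."""
    row = conn.execute(
        "SELECT url FROM image_refs WHERE object_id = ? "
        "ORDER BY captured_at DESC LIMIT 1",
        (str(object_id),),
    ).fetchone()
    return row[0] if row else None


# Usage: ten yearly portraits of one person, sampled down before storing.
frames = [
    (f"19{90 + i}-01-01", f"https://example.com/portraits/{i}.jpg")
    for i in range(10)
]
conn = sqlite3.connect(":memory:")
create_ref_store(conn)
store_references(conn, "person-1", sample_frames(frames, 4))
final_url = latest_reference(conn, "person-1")
```

The sampling step is a trade-off: a larger step means fewer rows to traverse, at the cost of a coarser picture of how the object changed.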
Using the object "dismantle" method, data warehousing and objects can coexist, and object data can be interrogated alongside structured data, without compromising performance, and using industry-standard tools.

