Aim for the long haul with metadata

Metadata can make a data warehouse easier to manage, but just as clean data is critical in data warehouse creation, accurate metadata management is a make-or-break aspect in the long-term success of a data-warehousing project.
By Julian Field, MD of CenterField Software
Johannesburg, 26 Sept 2002

By now we are all sick and tired of the saying "the only constant is change". Our industry has been through some of the most dramatic changes imaginable - just think back to what was important to us in 1992, only 10 years ago. Enterprise warehouses, by their nature, are also highly dynamic as they keep changing to keep pace with evolving business (and especially e-business) needs.

Over time, the data stored in a warehouse inevitably becomes outdated or changes in various ways, and as a result the reporting and analysis generated from the warehouse becomes less accurate and reliable. Users find themselves groping for answers to such questions as: What does the data mean? What is its structure and format? Where did it come from? When was it loaded? Who owns it? Where is it used?

These seem to be simple questions, but I know a few data warehouse managers who would happily break their budgets and thumb their noses at the board if they could guarantee their warehouses would repeatedly provide accurate information of this type. When properly managed, metadata enables users to find answers.

Metadata, which we loosely define as "data about data", is generated whenever data is added to, deleted from, modified in, or moved to or from a data warehouse.

The best analogy I have found is that of the label on a can of baked beans. You can't see what's inside, but you can read the label, which will itemise the contents in terms of weight, attributes, date and dietary analysis.

Metadata can generally be found throughout the enterprise in a patchwork of repositories and proprietary metadata stores. The trick is to ensure that metadata stores and management applications across the enterprise speak the same language and can communicate seamlessly with each other.

We generally find two broad categories of metadata in the warehouse environment. Technical metadata provides a detailed blueprint IT can use to build and maintain the warehouse, including database implementation names, table and column sizes, data types and structural information such as database key attributes and indices. Business metadata includes those descriptions of data that are not related to software implementations - for example, the business name, business rules in relation to other data and the owner of the definition. Business metadata gives users a roadmap for navigating all of the data in the enterprise by documenting what information is available in the warehouse and, when accessed, it provides a context for interpreting the data.
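To make the distinction concrete, here is a minimal sketch in Python of how one warehouse column might carry both kinds of metadata. Every class name, field name and sample value is hypothetical and chosen only for illustration; real metadata repositories define their own models.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TechnicalMetadata:
    """Implementation details IT needs to build and maintain the warehouse."""
    table: str
    column: str
    data_type: str
    size: int
    is_key: bool = False
    indexed: bool = False


@dataclass
class BusinessMetadata:
    """Descriptions that do not depend on the software implementation."""
    business_name: str
    definition: str
    owner: str
    rules: List[str] = field(default_factory=list)


@dataclass
class CatalogEntry:
    """One catalogue record pairing both views, plus where and when it was loaded."""
    technical: TechnicalMetadata
    business: BusinessMetadata
    source_system: str
    loaded_at: str

    def describe(self) -> str:
        # Answers the users' questions: what it means, where it came from,
        # when it was loaded and who owns it.
        t, b = self.technical, self.business
        return (
            f"{b.business_name} ({t.table}.{t.column}, {t.data_type}({t.size})): "
            f"{b.definition} Owner: {b.owner}. "
            f"Source: {self.source_system}. Loaded: {self.loaded_at}."
        )


# Hypothetical sample record.
entry = CatalogEntry(
    technical=TechnicalMetadata("fact_sales", "net_rev", "DECIMAL", 12, indexed=True),
    business=BusinessMetadata(
        business_name="Net revenue",
        definition="Invoiced sales less returns and discounts.",
        owner="Finance",
        rules=["Must equal gross revenue minus returns and discounts"],
    ),
    source_system="ERP",
    loaded_at="2002-09-25",
)
print(entry.describe())
```

Keeping the two views in one record means a business user and a database administrator look up the same entry, each reading the half that matters to them.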

Although there isn't much hype about the unglamorous topic of metadata, it's difficult to overstate the advantages of managing it. The four primary benefits that immediately come to mind are:

* Consistency of definitions: One department refers to "revenues", another to "sales". Are they talking about the same activity? One subsidiary unit talks about "customers", another about "users" or "clients". Are these different classifications or different terms for the same classification? Effective metadata management can ensure that the same data language applies throughout the organisation. Here, a good analogy is that of a salesperson validating their sales figures by being able to point back to the sources of the data, should they be challenged in a corporate presentation.

* Clarity of relationships: Metadata management illuminates the associations and interactions among all components of the warehouse environment: business rules, tables, columns, transformations and user views of the data, to name a few. By clarifying relationships throughout the data warehouse environment, managed metadata enables warehouse managers and knowledge workers to see the bigger picture - to fully understand the meanings of data assets, and to accurately predict and manage the impact of changes.

* Availability of information: Metadata exists behind the scenes, revealing the origin of data, who defined it, when it was modified and more. Traditionally hidden, metadata must now be made visible to company knowledge workers on demand. Fortunately, standards and technologies such as XML and the Web create a perfect medium for delivery.

* Impact analysis: If someone wants to make a change to an enterprise resource planning or bespoke system, they can subsequently perform an analysis and follow an audit trail to see what systems have been impacted.
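The impact analysis described in the last point can be sketched as a walk over recorded data lineage. The example below is a minimal Python illustration, assuming the metadata store can list which targets are fed by each source; the `LINEAGE` mapping, the system names and the `impacted` function are all hypothetical.

```python
from collections import deque
from typing import Dict, List

# Hypothetical lineage metadata: each source maps to the targets it feeds.
LINEAGE: Dict[str, List[str]] = {
    "erp.orders": ["staging.orders"],
    "staging.orders": ["warehouse.fact_sales"],
    "warehouse.fact_sales": ["report.monthly_revenue", "mart.sales_by_region"],
    "crm.customers": ["warehouse.dim_customer"],
}


def impacted(lineage: Dict[str, List[str]], changed: str) -> List[str]:
    """Return every downstream object reachable from the changed one."""
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for target in lineage.get(node, []):
            # Skip objects already visited so cycles cannot loop forever.
            if target not in seen and target != changed:
                seen.add(target)
                queue.append(target)
    return sorted(seen)


print(impacted(LINEAGE, "erp.orders"))
# ['mart.sales_by_region', 'report.monthly_revenue', 'staging.orders', 'warehouse.fact_sales']
```

A breadth-first walk is used here only because it is simple; any traversal that follows the lineage links to the end would serve. The sketch also shows why the links must be kept accurate: the answer is only as complete as the lineage that was recorded.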

Rattling off the benefits of metadata management may convince you of the necessity of maintaining and managing it, but if it were easy we would all have implemented a management process years ago. One problem we face is that when building or extending a data warehouse, developers employ a variety of tools for system modelling, database design, data quality assessment, data movement, scheduling, analysis and reporting. Close collaboration among these tools is critical for business intelligence purposes.

A solution I have found to be most effective is a translation service that permits metadata to be shared among the tools in the warehouse suite. In addition, the ultimate metadata management tool enhances data availability by publishing metadata to the Web, and it is tightly integrated with the enterprise's business intelligence reporting and analytical tools. These enhanced capabilities will yield the consistency of definitions, clarity of relationships and availability of information that are unavailable through current approaches.

The bottom line is that as data warehouses and data marts continue to proliferate, businesses need metadata management more than ever to function effectively. They also need to be able to maintain metadata throughout warehouse development and deployment, and make it accessible by any number of tools used to create and maintain the business intelligence infrastructure.

A tall order to be sure, but with planning and foresight we can make sure our data warehouses deliver the expected ROI - now and in the future.
