Modelling data a vital discipline

Modelling an entity before creating it reduces risk and captures corporate knowledge.

By Michael de Andrade, MD of EnterpriseWorx.
Johannesburg, 16 Oct 2009

In the previous Industry Insight, I looked at some of the deadly sins of database design. Now I will consider data modelling.

It is widely accepted - but not universally applied - that modelling an entity before creating it is an excellent discipline. It reduces risk, increases the chances of success, and captures corporate knowledge in a way that can be passed down to future generations.

Very often, people are so keen to get going with prototyping, deployment or manufacture of an entity that they skip the modelling aspect. This almost always delivers negative consequences.

Data modelling is the process of exploring data-oriented structures, and data models serve a range of purposes, from high-level conceptual models down to physical data models.

In practice, most database administrators will come across three general styles of data model:

* Conceptual data models: Also known as domain models, they are used to explore domain issues with the stakeholders of a project. Conceptual data models are frequently created as a precursor to logical data models, or instead of them. Given that agile development is growing in popularity, it is useful to note that high-level conceptual data models are frequently built during initial requirements envisioning and used to explore business structures and concepts.

* Logical data models: These are used to implement the conceptual data models, and the implementation of one conceptual data model may require several logical data models. They define the logical entity types, the data attributes that describe those entities, and their inter-relationships. When it comes to agile projects, logical data models are seldom used.

* Physical data models: These are employed to help design the internal scheme of a database, showing the data tables, the columns in the tables, and relationships between the tables. Physical data models add a great deal of value to both agile and traditional software development teams. They are widely used and, in many senses, indispensable.

A physical data model builds significantly on any work done with a logical data model. It should reflect the company's database naming standards and specify data types for columns, along with lookup tables (also known as reference or description tables) that govern, for example, how an address is stored and the valid lists of cities, provinces and countries.
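To make this concrete, the following is a minimal sketch of a physical data model expressed as SQLite DDL, driven from Python. The schema, table names and columns (a hypothetical customer-address design with Country and Province lookup tables) are illustrative assumptions, not any particular company's standard:

```python
import sqlite3

# Illustrative physical data model: tables, column data types and
# relationships, including lookup/reference tables for countries
# and provinces. Names follow an assumed naming standard.
DDL = """
CREATE TABLE Country (
    CountryId   INTEGER PRIMARY KEY,
    CountryName TEXT NOT NULL UNIQUE          -- lookup/reference table
);

CREATE TABLE Province (
    ProvinceId   INTEGER PRIMARY KEY,
    ProvinceName TEXT NOT NULL,
    CountryId    INTEGER NOT NULL REFERENCES Country (CountryId)
);

CREATE TABLE Customer (
    CustomerId   INTEGER PRIMARY KEY,
    FullName     TEXT NOT NULL
);

CREATE TABLE Address (
    AddressId    INTEGER PRIMARY KEY,
    CustomerId   INTEGER NOT NULL REFERENCES Customer (CustomerId),
    StreetLine   TEXT NOT NULL,
    City         TEXT NOT NULL,
    ProvinceId   INTEGER NOT NULL REFERENCES Province (ProvinceId)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
# List the tables the model created.
tables = {row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")}
```

In a real project the same model would typically be produced and maintained in a modelling tool rather than hand-written, but the artefact it generates is exactly this kind of DDL.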

Significant job

Data models have an important role to play in the enterprise space, and on projects. This is especially true when it comes to enterprise architecture. Enterprise architects tend to build one or more high-level logical data models that represent the data structures the organisation uses. These models are referred to as enterprise data models or information models.


These logical data models form a critical component of any enterprise architecture, as information forms the second layer and data the third of the enterprise architecture stack (business, information, data, application, technology). Each of the other layers is explicitly modelled, which makes it vital that the data layer is also modelled. The data layer can be modelled using tools such as IBM Rational, Microsoft Visio or ARIS.

Apart from anything else, enterprise data models give a project team a set of constraints to work within, while delivering important insights into the structure of its systems.

The deployment of a relational database for either transactional applications or business intelligence will always imply the application of a physical data model - it is a critical design artefact for any software development project.

There are a number of non-negotiable steps to follow when modelling data:

* Identify entity types
* Identify attributes
* Apply naming conventions
* Identify relationships
* Apply data model patterns
* Assign keys
* Normalise to reduce data redundancy
* Denormalise to improve performance
In the next Industry Insight in this series, I will look at each of these steps in greater detail.
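As a taste of the final two steps, here is a small sketch of normalisation and deliberate denormalisation, again using hypothetical customer and order tables in SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Normalised design: customer details live in one place, and orders
# reference them by key, so a change of city is a single-row update.
conn.executescript("""
CREATE TABLE Customer (
    CustomerId INTEGER PRIMARY KEY,
    FullName   TEXT NOT NULL,
    City       TEXT NOT NULL
);
CREATE TABLE CustomerOrder (
    OrderId    INTEGER PRIMARY KEY,
    CustomerId INTEGER NOT NULL REFERENCES Customer (CustomerId),
    Amount     REAL NOT NULL
);
""")
conn.execute("INSERT INTO Customer VALUES (1, 'A. Buyer', 'Johannesburg')")
conn.executemany("INSERT INTO CustomerOrder VALUES (?, 1, ?)",
                 [(1, 100.0), (2, 250.0)])

# Denormalised for read performance: a reporting table that repeats the
# customer's name and city on every order row, trading redundancy for
# fewer joins at query time.
conn.executescript("""
CREATE TABLE OrderReport AS
SELECT o.OrderId, c.FullName, c.City, o.Amount
FROM   CustomerOrder o JOIN Customer c ON c.CustomerId = o.CustomerId;
""")
rows = conn.execute("SELECT COUNT(*) FROM OrderReport").fetchone()[0]
```

The normalised tables protect write integrity; the denormalised reporting table is the kind of structure one would accept redundancy in, purely to speed up reads.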
