The role of journaling in high uptime

By Raul Garbini, Sales director at Edgetec.

Johannesburg, 08 Jan 2009

In the previous Industry Insight in this series, I identified the four components of high availability. In this Industry Insight, I look at data replication and remote journaling in greater detail.

Data replication engine

Once communication is established between systems, the next component needed is an engine that replicates or mirrors transactions between the production and the backup machines, and does so as close to real time as possible. Ideally, the data replication engine in the high availability (HA) solution uses the journaling function of the operating system to monitor for changes to data and move those changes to the backup environment.

All HA solutions harvest journal entries during data replication, but they differ in where they do so: on the production system or on the backup system. Solutions that harvest on the production system use their own proprietary process to collect the journal entries and send them to the backup system. Solutions that harvest on the backup system rely on a process called remote journaling, which takes care of transmitting journal entries from the production system to the backup system.

Before an ongoing replication process begins, the objects to be replicated must first be copied to the backup system. So, if the intention is to mirror ERP data to a backup system, a company needs to make a copy of all of the application's data file objects and restore them on the backup system to establish a baseline. If users intend to run applications on the backup system in the event of a system failure, or during maintenance, they will also need a current copy of all of the application's objects on the backup machine.

Remote journaling in data replication engine

Remote journaling transmits and writes - at very high speeds - an identical copy of a journal entry to a duplicate journal receiver on another connected system. When used as the engine for replicating data in an HA solution, remote journaling works efficiently, since this process occurs at the level of the operating system: beneath the machine interface.

In contrast, a replication engine that harvests journal entries on the production system must have its own process to transmit the journal entries, which typically happens less efficiently, because several processes need to occur outside of the operating system.

As changes are made to application data, journaling detects them on the source (production) system. As each journal entry is written, remote journaling automatically transmits a copy of it to an identical journal receiver on the target (backup) system.

Once the journal entry lands in the journal receiver on the target system, a process within the HA software harvests the journal entry, validates the data, and then applies the changes to the data on the target system, thus bringing it current with the source system.
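The flow described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not how any particular HA product or operating system implements it: the names `JournalEntry`, `System`, `change_record` and `harvest_and_apply` are hypothetical, remote journaling is reduced to appending the same entry to the target's receiver, and "validation" is reduced to a sequence-number check.

```python
from dataclasses import dataclass, field


@dataclass
class JournalEntry:
    sequence: int   # position of the entry in the journal
    key: str        # record that changed (hypothetical identifier)
    value: str      # new contents of that record


@dataclass
class System:
    data: dict = field(default_factory=dict)       # application data
    receiver: list = field(default_factory=list)   # journal receiver


def change_record(source, target, key, value):
    """Change data on the source system.

    Journaling records the change as a journal entry, and remote
    journaling (simulated here by a plain append) places an identical
    entry in the receiver on the target system.
    """
    source.data[key] = value
    entry = JournalEntry(len(source.receiver) + 1, key, value)
    source.receiver.append(entry)
    target.receiver.append(entry)   # stands in for remote journaling


def harvest_and_apply(target, last_applied=0):
    """HA software on the target: harvest, validate, then apply entries."""
    for entry in target.receiver:
        if entry.sequence <= last_applied:
            continue                                   # already applied
        if entry.sequence != last_applied + 1:
            raise ValueError("gap in journal sequence")  # assumed validation
        target.data[entry.key] = entry.value           # apply the change
        last_applied = entry.sequence
    return last_applied


production, backup = System(), System()
change_record(production, backup, "order-1", "open")
change_record(production, backup, "order-1", "shipped")
applied = harvest_and_apply(backup)
```

After the apply step, `backup.data` matches `production.data`, which is the sense in which the target is brought current with the source.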

A closer look

When journaling is enabled for an object, a user essentially initiates a process that “watches” the object. Journaling involves two objects: the journal and the journal receiver. When any change occurs to the object that the journal is watching, the journal efficiently records the details of that change in the journal receiver. Each change recorded is called a journal entry.

Journal entries accumulate in the journal receiver. Once the receiver holds a predetermined number of entries, it is changed: a new, empty journal receiver is associated with the journal in its place.

One of the main reasons the journal receiver is changed is to make groups of journaled data available to be saved offline for later restoration, if needed.
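The receiver change can be pictured with a short sketch. It assumes a simple entry-count threshold and uses a hypothetical `Journal` class; it is meant only to show the idea of detaching a full receiver so that it can be saved offline.

```python
class Journal:
    """Toy journal that swaps receivers after a fixed number of entries."""

    def __init__(self, threshold):
        self.threshold = threshold   # entries per receiver (assumed policy)
        self.current = []            # receiver currently attached
        self.detached = []           # full receivers, ready to save offline

    def write(self, entry):
        self.current.append(entry)
        if len(self.current) >= self.threshold:
            # Change the receiver: set the full one aside and
            # attach a new, empty receiver to the journal.
            self.detached.append(self.current)
            self.current = []


journal = Journal(threshold=3)
for n in range(7):
    journal.write(f"change {n}")
```

With a threshold of three and seven changes, two full receivers are detached and the seventh entry sits in the new, attached receiver.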

The HA replication process uses journaling in a different way: it sends journal entries to the backup system, where they are applied as quickly as possible to duplicate copies of the objects to keep them current with the production system.

Real-time mirroring of changes to objects by any kind of logical HA solution can only be done if the object can be journaled. Currently, this includes data files, integrated file system (IFS) objects, data areas and data queues.

However, an HA solution must also be able to keep other system-critical objects updated on the backup system, including program objects, spool files, user profiles and device configurations. Typically, these kinds of objects are replicated using an object monitor-and-copy process.

An HA solution must not only mirror objects that can be journaled, but must also be able to detect and replicate changes to non-journaled objects.
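A monitor-and-copy process for non-journaled objects might look roughly like the sketch below. The article does not say how changes are detected, so this assumes a periodic pass that compares a content hash of each object with the hash seen on the previous pass; the object names and the `monitor_and_copy` function are hypothetical.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a hash that changes whenever the object's content changes."""
    return hashlib.sha256(content).hexdigest()


def monitor_and_copy(source, target, seen):
    """Run one monitoring pass and return the names of copied objects.

    `source` and `target` map object names to their content;
    `seen` remembers the fingerprint recorded on the previous pass.
    """
    copied = []
    for name, content in source.items():
        digest = fingerprint(content)
        if seen.get(name) != digest:     # new or changed since last pass
            target[name] = content       # copy the whole object across
            seen[name] = digest
            copied.append(name)
    return copied


production_objects = {"PAYROLL.PGM": b"v1", "JSMITH.USRPRF": b"limits=low"}
backup_objects = {}
seen = {}

first_pass = monitor_and_copy(production_objects, backup_objects, seen)
production_objects["JSMITH.USRPRF"] = b"limits=high"
second_pass = monitor_and_copy(production_objects, backup_objects, seen)
```

The first pass copies both objects to establish the baseline, and the second copies only the user profile that changed. Unlike journal-based mirroring, this approach copies whole objects and only notices a change on the next pass.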

One final thought: a user will need to choose between synchronous and asynchronous remote journaling. Synchronous remote journaling keeps the backup more current, because the production machine waits for the target system to confirm receipt of the transaction before committing it to the database on the source. The disadvantage is that waiting for that confirmation slows down the production machine. Asynchronous remote journaling does not wait, so the production machine is not held up, but the backup may lag slightly behind.
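The trade-off can be illustrated with a toy model. It is a sketch under simplifying assumptions, not a description of the operating system's behaviour: the hypothetical `RemoteJournal` class treats "waiting for confirmation" as delivering the entry to the target before the source commit completes, and models asynchronous mode as leaving entries in flight until a later `catch_up` call.

```python
from collections import deque


class RemoteJournal:
    """Toy link between a source database and a target journal receiver."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.in_flight = deque()     # entries sent but not yet on the target
        self.target_receiver = []    # entries received by the backup
        self.source_db = []          # entries committed on production

    def _deliver(self):
        while self.in_flight:
            self.target_receiver.append(self.in_flight.popleft())

    def commit(self, entry):
        self.in_flight.append(entry)
        if self.synchronous:
            self._deliver()          # production waits for the target here
        self.source_db.append(entry)

    def lag(self):
        """Number of committed entries the target has not yet received."""
        return len(self.source_db) - len(self.target_receiver)

    def catch_up(self):
        self._deliver()              # asynchronous delivery completes later


sync_link = RemoteJournal(synchronous=True)
async_link = RemoteJournal(synchronous=False)
for entry in ("debit", "credit"):
    sync_link.commit(entry)
    async_link.commit(entry)

sync_lag = sync_link.lag()
async_lag = async_link.lag()
async_link.catch_up()
```

In this model the synchronous link never shows any lag, at the cost of doing the delivery inside every commit, while the asynchronous link commits immediately and is two entries behind until it catches up.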

* In the next Industry Insight in this series, I'll look in detail at system monitoring and role swapping.

* Raul Garbini is director of Edgetec.
