eNatis disaster, then recovery

By Siyabonga Africa, ITWeb junior journalist
Johannesburg, 26 May 2009

The electronic National Traffic Information System (eNatis) was offline for several hours yesterday, following a disaster recovery exercise.

The system's disaster recovery centre (DRC) was undergoing a routine test in the morning, during which no transactions could be processed.

“People could not perform any transactions, in all the testing stations, from 8am to 9.45am,” says Philip van der Merwe, Tasima divisional manager of corporate communications. “Yet after that, the system was fully operational.”

eNatis was developed and is being maintained by Tasima under a contract awarded by the Department of Transport.

Van der Merwe explains that eNatis's disaster recovery process is a controlled switchover, which entails closing all user sessions, ensuring all data has been written to the data server, and then switching off the data and application servers.

“This is followed by switching on the application and data servers at the remote disaster recovery site, and redirecting the network to point there. Needless to say, all of this takes time.”
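The sequence Van der Merwe describes can be sketched in a few lines of Python. This is only an illustrative model, not Tasima's actual tooling: the `Site` and `Switchover` classes, their fields and the step names are all hypothetical, and each step is reduced to a state change so the ordering is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical stand-in for a data centre or DRC site."""
    name: str
    servers_on: bool = False


@dataclass
class Switchover:
    """Hypothetical model of a controlled (cold) switchover."""
    primary: Site
    drc: Site
    open_sessions: int = 0
    pending_writes: int = 0
    network_target: str = ""
    log: list = field(default_factory=list)

    def run(self) -> list:
        # Step 1: close all user sessions.
        self.open_sessions = 0
        self.log.append("close user sessions")

        # Step 2: make sure all data has reached the data server.
        self.pending_writes = 0
        self.log.append("flush data to data server")

        # Step 3: switch off the data and application servers.
        self.primary.servers_on = False
        self.log.append(f"switch off servers at {self.primary.name}")

        # Step 4: switch on the servers at the remote site.
        self.drc.servers_on = True
        self.log.append(f"switch on servers at {self.drc.name}")

        # Step 5: redirect the network to the remote site.
        self.network_target = self.drc.name
        self.log.append(f"redirect network to {self.drc.name}")
        return self.log


if __name__ == "__main__":
    plan = Switchover(
        primary=Site("data centre", servers_on=True),
        drc=Site("DRC"),
        open_sessions=2000,
        pending_writes=5,
        network_target="data centre",
    )
    for step in plan.run():
        print(step)
```

Because every step waits for the previous one to finish, the system is unavailable for the whole sequence, which is why this kind of switchover takes time.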

Van der Merwe adds that the DRC houses three application servers with Itanium CPUs running HP-UX Unix and Oracle Application Server, and three database servers with Itanium CPUs running HP-UX Unix and the Oracle database management system.

“At some point in the future, we will be looking at a hot switchover system, which will allow for almost instantaneous switching between the data centre and the DRC. Since the DRC will always have to be housed at a remote site, this will necessitate the introduction of a high-speed network linking the two sites. Due to technical and cost restraints, this is not feasible at present.”

Tasima says the DRC is housed at a remote site and serves as full backup to the system's main data centre should the latter become unavailable for whatever reason.

“Yesterday, however, the switchover [to the remote site] from the data centre to the DRC took place from 8.15am to 9.45am. During this time, the system was not available. From this, we can take that in a worst-case scenario with a disaster affecting the data centre and more than 2 000 users accessing it, people will only be affected for 90 minutes, after which the system will once again be fully operational.”
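The 90-minute figure follows directly from the quoted times. As a quick check, the short Python sketch below computes the gap between two clock times written in the article's "8.15" style; the `outage_minutes` helper is a hypothetical name used here for illustration only.

```python
from datetime import datetime


def outage_minutes(start: str, end: str) -> int:
    """Minutes between two same-day clock times written as 'H.MM'."""
    fmt = "%H.%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


if __name__ == "__main__":
    # Switchover window quoted by Van der Merwe: 8.15am to 9.45am.
    print(outage_minutes("8.15", "9.45"))
```

Measured from the 8am start Van der Merwe gives for the testing stations, the same calculation comes to 105 minutes.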

The switchover time to a DRC can vary depending on the nature of the business, notes Warren Blackbeard, sales executive at ContinuitySA, which is not involved with eNatis. Blackbeard explains that companies normally decide which processes are the most important to keep running when disaster strikes, and that decision shapes their recovery plans.

Van der Merwe says the DRC is geared to meet increased demand as estimated for the next five years, with spare capacity of around 20% to 30%. He adds this will ensure the DRC is at all times capable of handling a complete system switchover.
