The life and times of data

Not content to leave data where it lands, storage vendors are now talking about lifecycle data management - caring for data from swaddling clothes to ultimate demise.
By Jason Norwood-Young, Contributor
Johannesburg, 23 Oct 2003

Continually increasing storage capacity is a necessity. Not to put too fine a point on it, any business that doesn't need more storage is dead, because that's the only time it will stop producing data.

This is why storage vendors have such large grins - business cannot survive without their gigabytes of disk platters, kilometres of tape and an ever-mounting pile of CD-ROMs.

"I think there are a number of drivers for storage growth - both business drivers and statutory drivers," says Dave Reddy, MD of Veritas SA. "The King report is demanding better record-keeping. The Enron disaster showed how paperwork can be lost and destroyed. Businesses are also starting to use more customer relationship management and business intelligence software - the more historical data that you have, the more trends you can see."

There is so much data out there, on so many storage devices, that most vendors have changed their focus from providing storage hardware to developing software and consulting services to manage it all.

Take EMC, for example. This steadfast supplier of high-end storage solutions spent 70% of its $1 billion R&D budget not on creating faster backplanes and bigger drive capacities, but on software.

"Software is the strategic direction of the company," says EMC country manager Frank Touwen, whose company recently bought software house Legato for $1.7 billion. It also signed up BMC as a partner and bought BMC's storage management software, and even patched up relations with Veritas earlier this year after an 18-month fall-out sparked by EMC's attack on Veritas' storage software revenue.

The press has speculated that EMC's aggressive foray into the softer side of storage is a nefarious ploy to buy a bigger customer base, and sell more hardware to those running EMC's management software. EMC counters that software and services are a revenue stream all on their own, and hopes to get 30% of its income from software and services in the short term, and 50% in the long term.

Boxes look the same

This drive to storage software by EMC and many of its competitors is motivated primarily by the rapid commoditisation of storage hardware.

Case study: Kumba Resources

Problem: Consolidating storage resources, while migrating from a Novell to a Microsoft environment.
Solution: Network-attached storage (NAS) for the smaller sites, and a storage area network with NAS components at its Grootegeluk coal mine in Ellisras.
Solution providers: Datacentrix, HP, Microsoft
Cost: R1.5 million
Kumba selected HP StorageWorks NAS B3000 as a central repository for all production.
"At this stage the smaller sites, including our Thabazimbi and Rosh Pinah mines, have been rolled out, with the larger sites such as Grootegeluk, Empangeni and the Kumba head office in Pretoria undergoing implementations," says Chris Smith, senior technical architect at Kumba.
An HP SuperDLT tape library has also been installed at the Kumba head office, with the company improving backup times from a full night to four hours.

"There is still a significant amount of differentiation on hardware," says Fanie van Rensburg, MD of Shoden, Hitachi's local distributor and reseller. "Hardware is fairly standardised but there still is performance and reliability differentiation between the competitors.

"But storage is becoming more and more of a commodity. The question soon will not be about hardware differentiation, but about what value you can bring to an organisation. Advantage will be created by understanding your customers' requirements and reacting with agility to those requirements. We will need to bring value-add to an organisation when supplying commodity storage."

The ubiquitous open-standards demands from customers are generally to blame for turning enterprise storage into a commodity. Today, the management consoles, fibre channel switches and storage subsystems aren't the vendor-locked proprietary beasts of yesteryear, which means heterogeneous systems - which have been a reality for some time - are now practical.

Fortunately for storage vendors - and unfortunately for customers - storage has been slow to standardise (the famous case of the delayed fibre channel switch standards put storage area networks back a good year), but the process is picking up pace.

"I don't think standardisation will happen very quickly, but it will happen eventually," says Van Rensburg. "Storage is viewed by vendors more and more as a commodity. That's why some of the vendors are trying to lock in customers on their management software. At the end of the day, that cannot be healthy for the industry."

Singing in tune

While the industry is slow to agree on standards, it is quick to pick up the Next Big Thing. The vendors are singing a single tune in chorus: that of lifecycle data management, a.k.a. information lifecycle management.

"In every environment, you don't have information residing in the most cost-effective place. Anything from 30% to 70% of information is in the wrong place, and could be moved to a more cost-effective platform," explains Tim Knowles, CEO of Stortech.

Stortech's Leon Leibach continues: "The first step is to identify the value of the data as a business driver, so that we can assist customers to find the most cost-effective way to store that data. For example, e-mail can sit on a slower medium [such as ATA drives], freeing up high-end storage for business-critical applications. Information lifecycle management is all about classifying what is important to the customer, and what needs high availability."

By categorising data, one can "pigeonhole" it on the cheapest medium that still delivers the access times that data needs. Over time, the data moves down to less and less expensive storage media, until, finally, it is destroyed.
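The pigeonholing rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not any vendor's product logic: the `Tier` class, the `cheapest_tier` function, the tier names and every cost and access-time figure are hypothetical placeholders chosen to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    cost_per_gb: float     # illustrative figure, not a real price
    access_seconds: float  # illustrative worst-case time to reach the data


# Hypothetical tiers, ordered from most to least expensive.
TIERS = [
    Tier("high-end disk", 100.0, 0.01),
    Tier("ATA disk", 30.0, 0.05),
    Tier("tape library", 5.0, 120.0),
]


def cheapest_tier(max_access_seconds, tiers=TIERS):
    """Return the lowest-cost tier that still meets the access-time need."""
    candidates = [t for t in tiers if t.access_seconds <= max_access_seconds]
    if not candidates:
        raise ValueError("no tier meets the required access time")
    return min(candidates, key=lambda t: t.cost_per_gb)
```

Under these assumed figures, data that must be reachable within a second lands on the ATA tier, while data that can wait ten minutes drops to tape. As a data set's access requirement relaxes with age, the same rule moves it down the hierarchy.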

The value of data

According to a paper by Fred Moore, president of Horison Information Strategies, data can be characterised as mission-critical, vital, sensitive or non-critical.
Mission-critical: Critical data is used in the key business processes and can account for up to 15% of stored online data. Losing access to this data means loss of revenue and the survival of the business is at risk. This data is best suited for disk mirroring as instantaneous recovery is mandatory. Critical data is normally company secret.
Vital: Vital data is used in normal business processes, but doesn't mandate instantaneous recovery in order for the business to remain in operation. Vital data is normally backed up using automated tape libraries and is often company secret.
Sensitive: Sensitive data is used in normal business operations, and alternative sources exist for accessing or reconstructing the data in case of data loss.
Non-critical: Non-critical data represents the largest category of data, has low security requirements and duplicate copies often exist. Lost, corrupted or damaged data can be reconstructed with minimal effort and cost. E-mail archives often fit this profile.
Source: Horison Information Strategies

The big-iron boys would call this cycle hierarchical storage management. But there is a difference between the architecture of yore and today's information lifecycle management, according to EMC's Touwen. "In hierarchical storage management, information cascades down when it gets older. Lifecycle information management can flow both ways."

Touwen gives the example of a record of an x-ray in a hospital: after a few weeks the record gets dropped from the high-speed SCSI drives to a slower storage medium, but in a year's time the patient comes back in for surgery and that x-ray becomes important again. So it is moved back into the SCSI library for fast access.
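The x-ray example can be sketched as a record that is demoted when it sits idle and promoted again when someone reads it. This is a rough Python illustration under stated assumptions: the `Record` class, the `FAST` and `SLOW` labels and the four-week idle threshold are all hypothetical, and real lifecycle tools would apply far richer policies.

```python
from datetime import date, timedelta

FAST, SLOW = "fast", "slow"        # hypothetical tier labels
DEMOTE_AFTER = timedelta(weeks=4)  # assumed idle threshold, for illustration


class Record:
    def __init__(self, name, created):
        self.name = name
        self.tier = FAST
        self.last_access = created

    def access(self, today):
        """Reading a record promotes it back to fast storage."""
        self.last_access = today
        self.tier = FAST

    def sweep(self, today):
        """Periodic check: demote a record idle longer than the threshold."""
        if today - self.last_access > DEMOTE_AFTER:
            self.tier = SLOW


xray = Record("x-ray", created=date(2003, 1, 6))
xray.sweep(date(2003, 3, 3))    # idle for eight weeks: moved to slow storage
xray.access(date(2004, 1, 12))  # patient returns: moved back to fast storage
```

The difference from a purely hierarchical scheme is the `access` method: in a one-way cascade only `sweep` would exist, and data would never climb back up.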

Behind this call for a lifecycle management system are the harsh realities of doing business in a not-too-honest or safe environment. Shareholders aren't the only ones cursing Enron - CIOs have a right to rage at its fraudulent schemes and pathetic record-keeping abilities. Enron and its ilk have created a great deal of pressure from bourses around the world for better record-keeping. Government is also demanding better records and longer data lifecycles, not to mention the Reserve Bank and various international business standards bodies.

With IT budgets stagnant, the continuing increase of long-term data demands a rethink of how that data is stored in a cost-effective way. But information lifecycle management can be a lot of administration in itself, thereby raising the cost of data storage, rather than lowering it.

"You need to automate it so that you don't require a lot of manual intervention," says Stortech's Knowles. "Those tools are improving, but not at the rate we'd like them to be improving. Yet productivity tools that assist in being able to do information lifecycle management cost-effectively are available."

Getting IT together

Information lifecycle management is seen as a natural extension of storage consolidation, which has staked its spot in the IT landscape as a serious trend. Most enterprises in the country are reported to be somewhere along the consolidation drive, and real cost savings are being reported.

"Consolidation of storage is one of the main drivers in the storage industry at the moment," agrees Vic Booysen, product manager for enterprise storage at Persetel Q Vector. "Direct-attached storage (DAS) holds lots of potential business 'stoppers' for users and a strong movement towards storage area network (SAN) and network-attached storage (NAS) solution implementation can be seen. The compound annual growth rate in DAS is predicted to slow down to an estimated 5% per annum versus the almost 75% in NAS and 140% in SAN. With the implementation of SAN and NAS solutions, the implementation of previously neglected disaster recovery and business continuity solutions is also simplified. Backups during production periods can also now take place, whereby the production times can be stretched to allow for 'forever' running applications."

Disaster recovery is a key concept in today's storage environment. And companies aren't happy to just replicate two servers next to each other anymore - the World Trade Centre disaster put paid to that. EMC's Touwen tells of one of the company's customers in the World Trade Centre that mirrored its data off-site... in the second tower. Today, enterprises are looking at replicating not only across the country, but internationally. There's even a project under way to set up a disaster recovery storage system on the moon!

Continue, no matter what

"We don't have a choice in providing guarantees for business continuance," says Shoden's Van Rensburg. "It's imperative we get to the point where we guarantee the business would survive any possible requirement. The Reserve Bank requirements are getting more and more strict and prescriptive in this regard. Government is looking at business continuance, and is investigating the possibility of a huge shared infrastructure to do so."

Case study: Department of Trade and Industry

Problem: The need to store information such as company registrations and trade and industry data over a prolonged period of time, plus a huge e-mail volume.
Solution: Tape library, SAN and storage management software.
Solution providers: Datacentrix, HP, ADIC
Cost: R4.8 million
The ADIC (Advanced Digital Information Corporation) Scalar 1000 tape libraries and HP SAN switching equipment and data management software have been installed at two separate sites in Pretoria - the Civic Centre and the Company and Intellectual Property Registration Office. The major benefit of this type of consolidation of the DTI's data repository is that it makes it easier for the department to manage data. Previously the department used different servers and this made the administration process more difficult. "The ADIC and HP equipment was selected for its reliability and proven value for money," says Flip Swart, DTI system support.

Van Rensburg cautions those investing in lifecycle data management that disaster recovery is also important for older data sitting on less expensive drives.

"Lifecycle data management brings with it some hidden flaws. You have to archive onto low-cost devices, but it has to be secure and reliable. We've seen some cases where customers have used low cost devices, but omitted to ensure they are secure and reliable. If you're keeping data for five years, particularly on disk, a lot can go wrong. Low cost may mean there's not sufficient redundancy. Redundancy is still the most important factor. Just going for low cost is inherently flawed.

"If I can't store it safely on low cost storage, I'd rather put it on tape. Why archive something you will not be able to read?"

Van Rensburg's trust in tape is offset by his competitor's opinion: "Tape is the worst media you've got!" objects EMC's Touwen. "When it comes to restoring it, it is very slow and there's often degradation on the tape."

The adage "In tape we trust" remains a hot topic in the storage industry, with opinions typically being diametrically opposed. The low cost of hard drives means the drive is a possible backup medium, but the line between online drives and offline tape is still quite clearly visible. "I don't believe that people see hard disk drives as an acceptable form of backup," says Knowles. "It is a mechanical device - of course there are many more things that can go wrong as opposed to tape where it's just the medium."

For the time being, tape remains the backup medium of choice, while low cost drives are finding a new life as the stopgap between high-end storage and tape.

Another old technology is also seeing a revival: WORM (write-once-read-many) devices, such as CDs, are becoming popular as they are seen as a legal snapshot of the data, unchanged since the point of copy.

Backbiting, in-fighting

While all these technologies are emerging from the catalyst of lifecycle data management, there is a warning knell: lifecycle data management isn't quite ready yet.

Case study: Hannover-Reinsurance Africa Limited

Problem: An ageing backup system was unable to perform scheduling tasks, offered no management of backup tapes and was too slow and cumbersome. A major hard drive crash motivated management to install a new backup system.
Solution: Backup management software (Veritas Backup Exec) to manage the entire archival process.
Solution providers: Veritas, ADIC, StorTech, Qlogic
Cost: Unknown
"When we had the hard drive crash it was discovered that instead of the system performing full backups, it was only doing incremental backups, causing major recovery problems. This was the turning point and I was given the green light to find a solution that would satisfy our needs for at least the next five years," says Hannover-Reinsurance IT manager Bobby Sarlie.
Performance improvements since the implementation of the Veritas Backup Exec system have been dramatic. The combination of the software, tape library and management has relieved operational staff of their backup responsibilities; they now rely solely on the Veritas software to manage the entire backup process.

While there are many tools available for managing data through the various systems, and even for autonomously moving data to different devices as needed, there is still a quiet war raging to determine who will own the management for this emerging architecture. The storage hardware vendors believe they're the best choice, the switching vendors are also in the running, and third-party storage software providers make a good vendor-agnostic argument.

"A lot of pieces - but not all - are there for lifecycle storage management. Every month there are more announcements," says Touwen.

"There are tools for lifecycle data management, and it's becoming more important that they work across the spectrum. If you've deployed devices from different vendors across the hierarchy it's even more difficult - you end up with three or four sets of management tools that don't mesh," says Van Rensburg.

He continues: "The ideal would be one set of storage management tools, with all your storage managed under that umbrella. That's where everyone wants to get to. Standards would be ideal, but in the practical world it's not happening. Because of software providers like Veritas, you may find hardware vendors more willing to work together."

Case study: Absa Bank Western Cape

Problem: High cost of storing documents and slow client query turnaround.
Solution: Scan-on-demand document retrieval and delivery system.
Solution provider: Metrofile
Cost: Unknown
Absa Bank has shortened the client query turnaround time at its Western Cape group administration division with the implementation of a scan-on-demand document retrieval and delivery system from document and records management specialist Metrofile.
The service has replaced Absa's in-house storage system, which filed and stored documents in premises next to the division's Cape Town Foreshore offices, thereby eliminating the cost of rented space and time spent physically searching for, photocopying and faxing documents to branches.
Scan-on-demand has not only reduced the cost of rental space but improved efficiencies and enhanced productivity by simplifying Absa's document retrieval process, says Gail Henry, manager for client service group administration, Absa Western Cape.

Says Knowles: "I think what you're going to find is standards. What I see is a meeting of the minds - an umbrella management software solution with software vendors meeting hardware manufacturers who will apply more logic and tools to make their individual solutions more manageable and transparent.

"I think the software vendors can take a lot of complexity out of the architecture. The cost of storage in terms of hardware and software management is huge; 70% of storage total cost of ownership is ongoing management."

And until the hardware and software vendors agree to put away their differences and focus on the end goal, real storage cost reduction through lifecycle data management will be about as realistic as replicating data on the moon.
