
Preparing for a 'big data' future

By Kirsten Doyle, ITWeb contributor.
Johannesburg, 24 May 2016

Although big data has some success stories - specifically in human genome research, the restaurant industry and the healthcare sector - there are few examples of companies that have successfully transitioned big data applications into their mainstream IT operations processes in order to make them a core part of their business.

Mark Lewis, CEO of Formation Data Systems, says in an attempt to harness the power of big data, IT executives have had to use the limited technological options available to them today. "Many have had to roll out analytic programs using both legacy infrastructure and non-traditional direct attached storage (DAS) architectures.

"Traditional infrastructure is unable to support the explosion of data present in these massively parallel compute environments, especially with the adoption of Hadoop ecosystems and NOSQL distributed databases. Conversely, while DAS provides a simple storage infrastructure, these deployments have seen limited success because they lack modern data management tools."

He says without these tools, there is no way to apply data governance and protection policies to big data initiatives, making security, compliance and data protection extremely challenging at this scale.

According to Lewis, to properly prepare for the upcoming 'big data' future, including building, supporting and deploying mainstream big data applications, businesses should first perform a reality check.

"This is primarily to ensure that their infrastructure can support their overall business strategy, specifically as it relates to enabling any big data initiatives. Many businesses discover while planning their application modernisation projects that infrastructure built for legacy client/server apps can't support the scale and speed presented by distributed, data-centric applications."

Software-defined architectures

He says this conclusion has led many early adopters to deploy modern, software-defined architectures to address these needs. "Big data is no different, because many of these applications are built along the same architectural principles, so deploying software-defined storage to support modern applications offers a very solid foundational component for big data as well."

Businesses find it difficult to unlock the potential of big data for four main reasons, says Lewis. "They are unable to build a DevOps model. They are unable to build reliable applications. They are unable to manage their data. And their data is trapped - updates are difficult and expensive, and putting data directly onto storage within a database or HDFS sounds great until that data is needed for another purpose or application."

To overcome these obstacles, he says, enterprises need to build a data-centric model that provides an agile, elastic infrastructure, allowing resources to be provisioned on demand and returned to the pool when the work completes.
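
As a rough illustration of that pool model, the sketch below (plain Python, with hypothetical names; it is not based on any vendor's API) shows capacity being carved out on demand and handed straight back to the pool when the job finishes.

    class StoragePool:
        def __init__(self, capacity_gb):
            self.free_gb = capacity_gb   # unallocated capacity in the shared pool
            self.volumes = {}            # name -> size of each live allocation

        def provision(self, name, size_gb):
            # Carve an elastic volume out of the pool on demand.
            if size_gb > self.free_gb:
                raise RuntimeError("pool exhausted")
            self.free_gb -= size_gb
            self.volumes[name] = size_gb
            return name

        def release(self, name):
            # Return the capacity to the pool when the work completes.
            self.free_gb += self.volumes.pop(name)

    pool = StoragePool(capacity_gb=10_000)
    vol = pool.provision("analytics-scratch", size_gb=500)
    # ... run the big data job against vol ...
    pool.release(vol)  # capacity is immediately reusable by the next workload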

"Some of the more recent distributed software-defined storage solutions utilise this data-centric model, but be aware that not-all software-defined storage solutions are created equal. The key is having a virtual data layer between all data consumers (databases, applications, data services) and the data providers, (flash, disk and suchlike). This virtual data layer lets you provision, manage, replicate, control performance and even share data across applications in a universal way that is transparent to the underlying hardware as well as the applications."

Seamless operation

With this, says Lewis, moving data between applications and adding new databases and other data services becomes a seamless operation that can be performed in an automated fashion, without impacting production operations or requiring an expensive professional services engagement.
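
One way to picture that kind of seamless, automated movement is a copy-on-write snapshot: the new application attaches to a clone of the data, so production is never touched. The self-contained sketch below uses hypothetical names, with a deep copy standing in for the copy-on-write metadata a real system would use.

    import copy

    class Volume:
        def __init__(self, data=None):
            self.data = data if data is not None else {}

        def snapshot(self):
            # A real system would share blocks copy-on-write rather than
            # deep-copying; the effect seen by the consumer is the same.
            return Volume(copy.deepcopy(self.data))

    prod = Volume({"orders": [1, 2, 3]})
    analytics = prod.snapshot()               # new database attaches to the clone
    analytics.data["orders"].append(4)        # mutate the clone freely
    assert prod.data["orders"] == [1, 2, 3]   # production data is unaffected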

"The overall result is a data-centric software-defined architecture that is scalable and economical as enterprise data continues to grow. It will save businesses money, while making sure they are not foregoing security and data governance. Especially in the case of big data, addressing data management is not necessarily the lowest cost option, but it will wind up costing businesses more in the long run if they don't adopt this technology now."

With this intelligent, software-defined infrastructure in place, companies will be able to avoid the high costs of legacy storage infrastructure and instead use commodity storage hardware, he explains. "In addition, they will be able to secure a unified, transparent data management system for controlling performance and for provisioning, managing, replicating and sharing data, as well as scale the system by adding new databases and other data services as their big data applications grow."
