
Traditional data structures fail to scale

By Admire Moyo, ITWeb's news editor.
Johannesburg, 25 Jun 2015
Though data storage has grown incrementally cheaper, humans are also generating more of it, says SumAll.

As data continues to proliferate, traditional data structures like SQL will not scale.

That's according to Korey Lee, a co-founder of data analytics platform SumAll, who notes the velocity at which humans are creating data is unprecedented and continues to grow exponentially.

"Though data storage has grown incrementally cheaper, we are also generating more and more of it," he says.

Market analyst firm IDC says by 2020, the data we create and copy annually will reach 44 zettabytes, or 44 trillion gigabytes.

"Never before have we had the capacity, technology and computing power to store, analyse and perhaps, most importantly, influence decisions with big data. I believe the importance of data will only continue to grow in the years to come," says Lee.

Data age

Shane Moodley, Entelect team lead, data solutions, says throughout history, there have been several eras of quantum leap innovations made by man.

"From the stone age, bronze and iron ages, all have made their mark in our history and made their own contributions to our future. Today, with the advancement of technology and the emergence of the information highway, we are now living in what is called the data age."

According to Moodley, the reasoning for this is that as each millionth of a second passes, almost everything we see, hear and touch generates data. The use of PCs, mobile handsets, tablets, GPS devices, servers and sensors attached to vehicles, buildings and satellites leads to huge amounts of data being stored across multiple databases and data stores worldwide.

The speed at which this data is generated is increasing rapidly and so are the volume and array of different structured and unstructured data types - commonly referred to as big data, he explains.

"In theory, big data means having an abundance of data at your disposal. However, in practice, this data is useless if businesses cannot apply analytics to benefit and gain insight from it. Businesses must find a 'mechanism' that can perform collaborative filtering to 'net' vital information and trends and to answer key questions."

SQL vs NoSQL

With traditional data structures not scaling, Lee points out there are several NoSQL database solutions on the market now.

A NoSQL, often interpreted as Not only SQL, database provides a mechanism for storage and retrieval of data that is modelled in means other than the tabular relations used in relational databases.
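As a rough illustration of that difference, the same information can be held either as flat, relational-style rows linked by a key, or as a single nested document of the kind many NoSQL stores use. The sketch below uses plain Python data structures; the customer and order data, field names and helper functions are all hypothetical and are not drawn from any particular database product.

```python
# Hypothetical data: names and fields are illustrative only.

# Relational style: two flat "tables", linked by customer_id.
customers = [{"customer_id": 1, "name": "Thandi"}]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 250},
    {"order_id": 11, "customer_id": 1, "total": 90},
]


def orders_for_relational(customer_id):
    """Join-like lookup: scan the orders table for matching rows."""
    return [row for row in orders if row["customer_id"] == customer_id]


# Document style: the orders are nested inside the customer record,
# so one lookup returns everything that belongs together.
customer_doc = {
    "customer_id": 1,
    "name": "Thandi",
    "orders": [
        {"order_id": 10, "total": 250},
        {"order_id": 11, "total": 90},
    ],
}


def orders_for_document(doc):
    """No join needed: the related data travels with the document."""
    return doc["orders"]
```

Both functions return the same two orders; the difference lies in whether the related records are assembled at query time or stored together in the first place.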

Alex Kodat, senior product architect at Rocket Software, says by making the physical data explicitly hierarchical, NoSQL ensures data usually retrieved at the same time will all be found close together.

In old-school, non-relational mainframe databases, this meant the data was often on the same physical block and the difference between doing five I/Os to display a screen versus 20 was critical, Kodat explains.

In the newer NoSQL databases, the difference in I/O counts is less significant than the fact that all the data associated with a request can be found on a single machine in a highly parallel, multi-machine cluster, he adds.
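One common way to achieve this kind of locality is to route every record to a machine according to a hash of a shared partition key, so all items filed under that key end up on the same node. The following is a minimal sketch of that idea, not a description of how any specific NoSQL product works; the node names, the `Cluster` class and its methods are assumptions made for illustration.

```python
import hashlib

# Hypothetical three-machine cluster.
NODES = ["node-a", "node-b", "node-c"]


def node_for(partition_key: str) -> str:
    """Pick a node deterministically from a hash of the partition key."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


class Cluster:
    """Toy in-memory cluster: one dictionary per node."""

    def __init__(self):
        self.storage = {name: {} for name in NODES}

    def put(self, partition_key, item_key, value):
        # Every item sharing a partition key is written to the same node.
        node = node_for(partition_key)
        self.storage[node].setdefault(partition_key, {})[item_key] = value

    def get_all(self, partition_key):
        # A request for the key touches exactly one node.
        node = node_for(partition_key)
        return node, dict(self.storage[node].get(partition_key, {}))
```

For example, after `cluster.put("customer:1", "profile", ...)` and `cluster.put("customer:1", "order:10", ...)`, a single call to `cluster.get_all("customer:1")` returns both items from one node without consulting the others.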

Before digging

Faced with the ever-increasing volumes of data, Lee suggests before digging into data, an organisation needs to understand what its goals and objectives are.

"This will inform how to intelligently sift through and analyse the underlying data, or perhaps direct the organisation on how to gather the necessary data to achieve their goals."

He notes data storage, infrastructure, data cleaning/gathering, and privacy are the biggest challenges organisations are facing.

Thus, he urges companies to monitor costs and be smart about planning and leveraging storage. He also believes data migrations can be tedious, painful and political, and can cost the organisation some serious engineering hours.

"So if you get to make this choice early on, choose wisely but also be flexible to adapt and learn as this technology [data migration] is constantly evolving."
