NetApp predicts data will become 'self-aware' in 2018

By Lauren Kate Rawlins, ITWeb digital and innovation contributor.

Johannesburg, 05 Dec 2017

Morne Bekker, NetApp South Africa country manager.

Global data management and cloud storage solutions provider NetApp has made five predictions for chief technology officers (CTOs) to take note of going into next year, about the changing way data will 'act', move and be stored.

Morne Bekker, NetApp South Africa country manager and district manager for the SADC region, addressed a group of media at an event in Rosebank recently.

He said the company celebrated its 25th year in business this year, has gone through a massive transformation and has adjusted its strategy around cloud. As a result, its share price has risen from $30-$32 per share a year ago to just over $52 now.

The predictions were drawn from NetApp's own experience and from trends it has noticed developing.

Data becomes self-aware

The first prediction is that data will start to define itself. Bekker says that at the moment there are processes that act on data and determine how it's moved, managed and protected.

But, he says: "As data becomes self-aware and even more diverse than it is today, the metadata will make it possible for the data to proactively transport, categorise, analyse and protect itself. The flow between data, applications and storage elements will be mapped in real time as the data delivers the exact information a user needs at the exact time they need it.

"This also introduces the ability for data to self-govern. The data itself will determine who has the right to access, share and use it, which could have wider implications for external data protection, privacy, governance and sovereignty."

He gave a real-world example of how this would work by saying that if a person is in a car accident, there may be a number of different groups that want access to data from the car - such as an insurance company to determine liability or a motoring company to improve systems.

If the data is self-aware, then it can be tagged so that it controls who sees which parts of it and when, without additional time-consuming and potentially error-prone human intervention to subdivide, approve and disseminate the data.
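The tagging idea can be illustrated with a minimal Python sketch. It is not NetApp's design: the `TaggedRecord` class, the role names and the sample values are all hypothetical, chosen only to mirror the car accident example, where the record itself carries the metadata that decides which party may read which field.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedRecord:
    """A data record that carries its own access policy as metadata."""
    values: dict
    # Hypothetical policy: field name -> set of roles allowed to read it.
    policy: dict = field(default_factory=dict)

    def view_for(self, role: str) -> dict:
        """Return only the fields whose tags allow this role to read them."""
        return {
            name: value
            for name, value in self.values.items()
            if role in self.policy.get(name, set())
        }


# Illustrative values only; none of these come from a real vehicle.
crash_record = TaggedRecord(
    values={
        "speed_kmh": 62,
        "brake_applied": True,
        "sensor_firmware": "v2.1",
        "driver_id": "D-1029",
    },
    policy={
        "speed_kmh": {"insurer", "manufacturer"},
        "brake_applied": {"insurer", "manufacturer"},
        "sensor_firmware": {"manufacturer"},
        "driver_id": {"insurer"},
    },
)

insurer_view = crash_record.view_for("insurer")
manufacturer_view = crash_record.view_for("manufacturer")
```

In this sketch the insurer sees the driver identity but not the firmware version, the manufacturer sees the reverse, and a role with no tags sees nothing, all without a person having to split the record by hand.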

Virtual machines become "rideshare" machines

Bekker says it will soon be faster, cheaper and more convenient to manage increasingly distributed data using virtual machines, provisioned on webscale (server-less computing) infrastructure, than it will be on real machines.

"This can be thought of in terms of buying a car versus leasing one or using a rideshare service like Uber. If you are someone that hauls heavy loads every day, it would make sense for you to buy a truck. However, someone else may only need a certain kind of vehicle for a set period of time, making it more practical to lease. And then, there are those who only need a vehicle to get them from point A to point B, one time only: the type of vehicle doesn't matter, just speed and convenience, so a rideshare service is the best option," says Bekker.

"This same thinking applies in the context of virtual versus physical machine instances. Custom hardware can be expensive, but for consistent, intensive workloads, it might make more sense to invest in the physical infrastructure. A virtual machine instance in the cloud supporting variable workloads would be like leasing: users can access the virtual machine without owning it or needing to know any details about it. And, at the end of the 'lease,' it's gone."

Data will grow faster than the ability to transport it... and that's OK!

The third prediction touched on how fast data is increasing in size.

"It's no secret that data has become incredibly dynamic and is being generated at an unprecedented rate that will greatly exceed the ability to transport it. However, instead of moving the data, the applications and resources needed to process it will be moved to the data and that has implications for new architectures like edge, core, and cloud.

"In the future, the amount of data ingested in the core will always be less than the amount generated at the edge, but this won't happen by accident. It must be enabled very deliberately to ensure that the right data is being retained for later decision making," says Bekker.

He says an example would be the sensors on autonomous cars that will generate so much data that there's no network fast enough between the car and data centres to move it.

"Historically, devices at the edge haven't created a lot of data, but now with sensors in everything from cars to thermostats to wearables, edge data is growing so fast it will exceed the capacity of the network connections to the core. Autonomous cars and other edge devices require real-time analysis at the edge in order to make critical in-the-moment decisions. As a result, we will move the applications to the data."
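One way to picture "moving the application to the data" is an edge device that analyses its raw readings locally and forwards only a compact summary to the core. The Python sketch below assumes a hypothetical `summarise_at_edge` function, a made-up alert threshold and invented sample readings; it illustrates the principle only and is not a description of any NetApp product.

```python
from statistics import mean


def summarise_at_edge(readings, alert_threshold):
    """Reduce raw sensor readings to a small summary plus any alert values."""
    # Only readings above the (hypothetical) threshold are kept verbatim.
    alerts = [r for r in readings if r > alert_threshold]
    return {
        "count": len(readings),
        "mean": mean(readings),
        "max": max(readings),
        "alerts": alerts,
    }


# Invented readings standing in for a stream generated at the edge.
raw_readings = [20.0, 21.0, 19.0, 95.0, 20.0]

# The summary, not the raw stream, is what would travel to the core.
summary = summarise_at_edge(raw_readings, alert_threshold=90.0)
```

Here five raw readings are reduced to four summary fields, with the single out-of-range value retained for later decision-making. The core deliberately ingests less than the edge generates.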

Evolving from 'Big Data' to 'Huge Data' will demand new solid state-driven architectures

The volume of data generated in future will require new technology to handle it.

"As the demand to analyse enormous sets of data ever more rapidly increases, we need to move the data closer to the compute resource. Persistent memory is what will allow ultra-low latency computing without data loss; and these latency demands will finally force software architectures to change and create new data driven opportunities for businesses," says Bekker.

"Flash technology has been a hot topic in the industry; however, the software being run on it didn't really change, it just got faster. This is being driven by the evolution of IT's role in an organisation. In the past, IT's primary function would have been to automate and optimise processes like ordering, billing, accounts receivable and others. Today, IT is integral to enriching customer relationships by offering always-on services, mobile apps and rich web experiences."

He says the next step will be to monetise the data collected through various sensors and devices to create new business opportunities, and it is this step that will require new application architectures supported by technology like persistent memory.

Emergence of decentralised immutable mechanisms for managing data

The last prediction incorporates one of the hottest buzzwords of the year: blockchain.

Bekker says mechanisms to manage data in a trustworthy, immutable and truly distributed way will emerge and have a profound impact on the data centre. He says blockchain is a prime example of this.

"Decentralised mechanisms like blockchain challenge the traditional sense of data protection and management. Because there is no central point of control, such as a centralised server, it is impossible to change or delete information contained on a blockchain and all transactions are irreversible."
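The reason a change to recorded data is so hard to hide can be sketched with a simple hash chain in Python. This is a single-process simplification, not a real blockchain: the function names and the sample transactions are hypothetical, and the distributed consensus across many nodes that a real blockchain relies on is left out. It shows only that each block stores the hash of its predecessor, so altering an earlier entry breaks verification.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Add a block that is linked to the current end of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "data": data,
        "prev_hash": prev_hash,
        "hash": block_hash(index, data, prev_hash),
    })


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    expected_prev = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != expected_prev:
            return False
        recomputed = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["hash"] != recomputed:
            return False
        expected_prev = block["hash"]
    return True


ledger = []
append_block(ledger, "A pays B 5")
append_block(ledger, "B pays C 2")
valid_before = is_valid(ledger)  # True: chain is untouched

ledger[0]["data"] = "A pays B 500"  # tamper with an earlier entry
valid_after = is_valid(ledger)  # False: the stored hash no longer matches
```

In a decentralised setting, many participants would each hold a copy and run a check like `is_valid`, which is why no single party can quietly rewrite history.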

He says it can be likened to a biological system: "You have a host of small organisms and they each know what they're supposed to do without having to communicate with anything else or be told what to do. Then you throw in a bunch of nutrients: in this case, data. The nutrients know what to do and it all starts operating in a cooperative manner, without any central control. Like a coral reef."

Bekker says current data centres and applications operate like commercially managed farms, with a central point of control (the farmer) managing the surrounding environment.

"The decentralised immutable mechanisms for managing data will offer microservices that the data can use to perform necessary functions. The microservices and data will work cooperatively, without overall centrally managed control."