Manage your data in the cloud

One tool to protect them all. Your data, that is, whether it's on-premises or in the cloud.

Johannesburg, 28 May 2019
Lee Jenkins, Head of Technology, ETS Innovations.

So, your data is in the cloud. All well and good. Instead of choosing a hybrid model, you've gone all in. Again, all well and good. But the one absolute truth is that cloud storage is expensive, and its costs can easily spiral out of control. So, how do you ensure you don't have multiple copies of your data stored in the cloud? And how do you give developers and other authorised parties selective access to copies of your database, while remaining compliant with applicable legislation?

Lee Jenkins, Head of Technology at ETS Innovations, says: "When your data is stored on-premises, you can give developers a self-service snapshot of the database instead of a full copy of a file that is potentially massive in size. You can permit them a full view of the database, but only authorise them to make changes on a smaller version of the file. You can even mask out personal information, if required."
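
The masking Jenkins mentions can be pictured with a minimal Python sketch. It is an illustration only, not the method of any particular tool: the field names in PERSONAL_FIELDS and the function mask_record are hypothetical, and a real database would define its own list of personal data.

```python
# Hypothetical list of columns that hold personal information.
PERSONAL_FIELDS = {"name", "email", "id_number"}


def mask_record(record, personal_fields=PERSONAL_FIELDS, placeholder="***"):
    """Return a copy of one record with its personal fields blanked out."""
    masked = {}
    for key, value in record.items():
        # Personal fields are replaced; everything else is copied unchanged.
        masked[key] = placeholder if key in personal_fields else value
    return masked


def mask_table(rows):
    """Mask every record, leaving the original rows untouched."""
    return [mask_record(row) for row in rows]
```

A developer working on the masked rows still sees realistic non-personal values, such as balances or order totals, while names and contact details never leave the source database.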

This is nothing new if your database is on-premises, where it is easy to manage with established tools. It is when your database is in the cloud that things become interesting. Jenkins says: "There is currently no way of managing this in the cloud. You can't create an abridged or edited copy of your database. Or monitor what happens to it.

"Not only is it expensive to share a full copy of your database, you might want to protect some of the data. You need to be able to blank out the important information so that the data can be used for testing without contravening data privacy regulations like GDPR and POPI. The question we're faced with is how do you extend this capability into the cloud?"

This is where data copy management comes into play. It's very easy for organisations to end up with multiple copies of their data in various locations, but legislation like GDPR and POPI requires businesses to know which data is held where, to track it, and to know who has accessed it. Individuals have the right to ask which personal information is being held by a business, and to ask for it to be deleted. This can be difficult, if not impossible, to control if there are assorted versions of your database spread across the organisation.

One solution is to provide a copy, either full or abridged, and ensure it's disposed of once the project is completed. But you need to administer how that happens, so the requested data is supplied with certain criteria and an expiry date attached. As the data is generally accessed via a network, some network rules are also applied.
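
As a rough illustration of the approach described above, the Python sketch below keeps a register of issued copies, each with an owner, an expiry date and a permitted network range. Every name in it (CopyGrant, CopyRegistry, the example network) is a hypothetical stand-in rather than the interface of a real product, and deleting a register entry only stands in for destroying the actual copy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from ipaddress import ip_address, ip_network


@dataclass
class CopyGrant:
    copy_id: str
    owner: str              # the person the copy is assigned to
    masked: bool            # whether personal data was blanked out
    expires_at: datetime    # the copy must be destroyed after this
    allowed_network: str    # network rule, as a CIDR range


class CopyRegistry:
    """Hypothetical register of every database copy handed out."""

    def __init__(self):
        self._grants = {}

    def issue(self, copy_id, owner, days_valid, allowed_network, now, masked=True):
        grant = CopyGrant(copy_id, owner, masked,
                          now + timedelta(days=days_valid), allowed_network)
        self._grants[copy_id] = grant
        return grant

    def may_access(self, copy_id, client_ip, now):
        grant = self._grants.get(copy_id)
        if grant is None or now >= grant.expires_at:
            return False
        return ip_address(client_ip) in ip_network(grant.allowed_network)

    def purge_expired(self, now):
        expired = [cid for cid, g in self._grants.items() if now >= g.expires_at]
        for cid in expired:
            # A real tool would also destroy the copy itself at this point.
            del self._grants[cid]
        return expired

    def active_copies(self):
        return sorted(self._grants)
```

Because every copy passes through the register, the organisation can answer at any time how many copies exist, who holds them and when each one is due to be destroyed.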

Jenkins explains why all of this matters: "When you're in the cloud, storage costs can quickly escalate out of control. This means that you can't keep making copies of databases; you need to be able to turn copies off and destroy what's not being used. Not only will this help control costs, but it also means that you know exactly how many copies are out there."

He adds: "Growing adoption of agile DevOps means that changes to data happen on an ongoing basis; we're going to get to a stage where there's no actual release cycle and updates are permanently being released."

While the functionality, as mentioned previously, is available on-premises, some companies still choose to manage copies of their database manually; others have automated it. The next step is to enable this in the cloud. However, cautions Jenkins, companies should never provide automated access to their full database. "Tools must be developed that will request a masked copy of the data that's assigned to a specific person with an expiry date, and charge them for it."

At some point, most organisations will move into the cloud, as it's becoming too expensive to keep acquiring on-premises infrastructure. Then this will become a real challenge. Organisations might not even know how many copies of their database are out there, or be able to track them, let alone know what can be deleted. This is why the provision of copies needs to be automated, with each copy given an expiry date and deleted once that date has passed.

Jenkins concludes: "Organisations need to control their data; they need to be accountable to auditors and comply with governance and other criteria. It all comes back to security; you can't just have people requesting access to your database and hand it over anymore."