The dawn of data science

For the great explorers of the past, navigation without a compass would have been unthinkable. Today, not only are we able to determine in which direction we're headed, but we have a wealth of other useful information at our fingertips. From location co-ordinates, to weather conditions, terrain, traffic congestion, crime hotspots and toll roads, we're able to plan our journeys down to the smallest details. But, unless you have the necessary ways and means of managing and interpreting this information overload, you will still be lost.

The same can be said for businesses, says Armandè Kruger, regional sales director at PBT Group. "The business that has the correct information first and has the best ability to derive value from it and act on it will be the one that outperforms its rivals." Kruger believes business owners with money to invest this year should be spending that extra cash on big data analytics.

A company's data is one of its most important assets, and yet data management remains a headache for most, says Jason Barr, divisional manager of storage and availability at XON. With the exponential growth of data, IT departments often struggle to come up with an effective data management strategy, especially one that is secure, reliable and cost-effective. One of the key steps, notes Barr, is to avoid operating in silos because this hinders a business' ability to create a holistic approach to management. Data management should ideally span from inception to expiration. The goal must be to close the gap between the business and IT systems, he continues.

According to Francois Swanepoel, CTO of software engineering firm Stone Three, effective data management has become a science, which requires the organisation to ask the right questions in order to reap the desired rewards. He describes data science as various techniques and procedures used to analyse, interpret and visualise large amounts of information. While this may sound quite complex, he stresses that it all boils down to using the right methods to extract useful insights from your data. "There's no art in good data science, just inquisitiveness, the determination to solve a problem, a solid understanding of scientific principles, and the commercial astuteness to turn the insights gained from patterns and probabilities into tangible business benefits." Swanepoel believes the 'trick' lies in getting the data to talk, to give up its secrets and reveal something your competitors don't know. It can also inspire you to develop a new product or service that you would never otherwise have considered, he adds.

Risk versus reward

With the importance of data well established, organisations now need to consider where to store all of this valuable information. For Barr, when it comes to hosting, businesses need to approach the issue on a case-by-case basis. Non-critical data can be hosted offshore, while important applications and data, which require reliable, real-time processing, would perform better in a locally hosted environment.

"The question is one of risk versus reward, a choice that each and every organisation will have to weigh up and account for," says Barr. When making this decision, organisations should consider everything from the skill level of the business' employees to the unique needs of the organisation and the budget at their disposal. For many businesses, the cloud is viewed as a more affordable data management strategy, especially when compared with the cost and manpower required to host large stores of data in their own environment. But Barr acknowledges that having information dispersed in many different places can make it difficult to have a single, comprehensive view of that data.

Compliance should also be a critical consideration in any data-hosting discussion, notes Barr. The Protection of Personal Information Act (POPI) is a data management game-changer. As legislation around the location of a company's data becomes more stringent, organisations will have to consider, for example, where different types of information may be stored geographically.

While going local is accepted as the best option for long-term security, both Barr and Warren Olivier, regional manager for southern Africa at Veeam, cite the Eskom electricity crisis as a growing concern and a threat to reliability. According to Olivier, load shedding on 6 February took down an Internet Solutions data centre in Gauteng. When the ISP's generators failed to kick in, its services were affected and clients were instructed to shut down their equipment. While Olivier accepts that cost is commonly a determining factor for many businesses, particularly start-ups, he says companies really need to be thinking about reliability. "Reliability is the most important factor for data management, as the point is to make sure data is accessible and recoverable," Olivier notes, with reference to problems presented by the power crisis.

In 2015, the data management discussion will focus on increasing data volumes, complexity, compliance, security and accessibility. But, even in its most refined form, data still needs to be analysed, interpreted and acted on, which can be achieved through a suitable hosting strategy and sound data science.

A question of security

Conversations around data security often conjure up images of sneaky hackers using their computer savvy to access confidential governmental, financial or business information. But data security entails so much more than protecting sensitive information from the prying eyes of criminals or one's competitors.

"Data security is not about just stopping hackers. If the data is lost, then there's nothing for them to penetrate anyway," says Chris Ogden, MD of RubiBlue. "Data security is a holistic approach, one that needs to be embraced from all angles." Ogden is calling on organisations to shift their focus away from the hardware and software layers that have been put in place to guard against security breaches.

He believes data security encompasses backing up the information, ensuring accessibility when the data is required, safeguarding against files being corrupted, data recovery and complying with regulations. Most often, organisations cut corners in an attempt to cut costs, but Ogden says failing to implement a sufficiently comprehensive security strategy can actually be more costly in the long run. He cites the recent hacking of Sony Pictures Entertainment, which saw the public release of confidential company data, as an example of the kind of embarrassing reputational damage that a business can experience should it not have adequate procedures in place to secure its data.

When you consider how valuable data is to your organisation and the revenue loss that can occur as a result of downtime, selecting a suitable hosting partner should not be taken lightly, because you are essentially handing over potential profits to an outside party. You are asking this outside party to secure this asset and make it available when you require it. Ogden challenges the assumption that it's more plausible to infiltrate a system hosted on-premises than it is to access one hosted in the cloud, highlighting that unless the business is running in a completely offline environment, with no access points in or out of the network (a very rare real-world situation), both scenarios are susceptible to unauthorised access and can fall victim to the same data security risks.

"There are many data security vulnerabilities," notes Ogden. "We as IT solutions providers try to ensure we cover the basics and ensure that if something does happen, we can easily recover with minimal risk to consumers, or to our internal data."

The art of analytics

During the 2014 Football World Cup, the German team employed a secret weapon - big data. With input from team management, university students and software corporation SAP, they were able to track and analyse the movements and performance of each player. Described as the team's '12th man on the pitch', the match insights proved valuable in their much-hyped defeat of host nation Brazil and their ultimate success at the event.

For Waleed Zohdy, technical director at TA Telecom, not only can these data-driven solutions and strategies revolutionise things like sports, they can also be utilised to transform entire industries like transport, telecoms or healthcare. There's so much potential for this kind of innovation to be used by normal businesses to provide a more comprehensive view of the organisation and its customers. "The more data we collect and analyse, the more patterns we will discover and the more solutions we will find," Zohdy says. "The more you dig into data, the more valuable and clear the insights you will be able to retrieve." This involves expanding on the traditional notion of only analysing clean or structured data and looking to unstructured or semantic data, which is commonly derived from seemingly unconventional data sources.

Zohdy advises that organisations collect and invest in all types of data - financial, psychological, social and geographical. If this information is relevant to their products, customers and markets, it can become a resource to develop better business strategies and ideally result in revenue growth. This makes forecasting, planning for data growth and diversifying data sources essential. But raw data sources are awash with inconsistencies and gaps, and need to be refined into meaningful business information. This kind of useful information is curated as a by-product of data analytics. Analytical models should be developed to provide a 360-degree view of the information at the point of data collection, he says, as this will enable the organisation to quickly and seamlessly capitalise on data-driven insights.

When things go wrong

* Unplanned IT systems downtime could be costing South African businesses as much as 2% of their annual profit.
* Globally, enterprises have unplanned downtime an average of 13 times a year, for up to 51 hours.
* Depending on whether the downtime affects mission-critical apps or not, that adds up over a year to an average cost of between $1.4 million and $2.2 million in lost revenue, decreased productivity and missed opportunities.
* Currently, global businesses only test 5.26% of their backups per quarter, resulting in a lack of verification, and one in every six recoveries failing.

Courtesy of Warren Olivier, regional manager for southern Africa at Veeam
