Taming big data analytics

Gary Alleman, Master Data Management.

Every day, businesses create floods of data, drowning the enterprise in vast, diverse sets of information from multiple sources. Big data is commonly described by the 'three Vs': the volume of information, the velocity at which it is created and collected, and the variety of the data points. But while big data offers big insights, it also poses major challenges. One is finding the time and resources needed to collect and analyse it. Another is that many organisations find the sheer volume of data too overwhelming to leverage properly. As a result, many aren’t making the best use of this valuable resource. This is where big data analytics come in: tools that help businesses control this data and use it to gain actionable insights, identify new opportunities and better serve customers, which means more efficient operations, a fatter bottom line, and happy customers.

Design a data strategy

TechSoft International’s Tony Nkuna offers some tips for building a successful data strategy.

  • An effective data strategy relies on quality – analytics will never be tamed if the data is flawed or your quantities are uncontrollable.
  • Data capture is critical – business data must be collected and funnelled.
  • Understand the data – big data is merely a raw material that can be transformed. It can be modelled using analytics or through data science.
  • Anticipate – understand what you want from the data to manipulate the insights you need.
  • Use the data to help you decide – your data strategy must dovetail with your desire for real-time decision-making.
  • Act fast – use automation techniques appropriate for teams to get the data needed to fuel your big data analytics.
  • Monitor everything – the data process is a continual one, so regularly monitor and refine your strategy to ensure big data analytics is being used effectively.

Tony Nkuna, senior consultant and integration specialist, TechSoft International, says the benefits of harnessing big data analytics are vast. This means increasing competitive advantage and helping create a holistic view of the business through a data-rich, consistent, and comprehensive view. “It’s the process by which you access the information needed to base decisions on accurate, timely data instead of gut instincts.”

Another benefit is faster time to action, as big data analytics pave the way to anticipating situations and opportunities, helping business owners ask relevant and timely questions, and getting the answers they need to take decisive action. Analytics can also help businesses discover unseen or hidden trends and patterns in large, complex data sets, which helps them identify strategic opportunities and risks.

Big data is also the primary source and starting point for analytics, AI and machine learning (ML) solutions, adds Henry Adams, country manager at InterSystems SA. Without the data, they’re simply tools that work in a vacuum. Competitive businesses know that to set themselves apart in a market where competition is fierce, and the pace of innovation fast and unrelenting, they need to tame the analytics beast. In effect, the adoption of AI, ML and big data analytics is the price of admission to the main event, which is the future of business – but this all starts with data.

However, despite the obvious benefits, many businesses don’t have a clue where to start on a big data analytics journey. Leveraging data goes further than implementing tools. They need to understand what data is important to measure, and how they can use that information to drive improved decision-making.

Any data strategy needs to support outcomes determined by the business strategy, so before you even start with a data strategy, make sure there’s a clear business strategy in place, says Jacques du Preez, CEO, Intellinexus. “Some organisations fall into the trap of becoming data hoarders and collect any and all data, but if you have 20 petabytes of data, of which 19 petabytes is irrelevant to your business outcomes, it creates inefficiencies. An effective data strategy should focus on collecting only the data that is essential and informative to what the business is trying to achieve. Organisations need to ask: what data do we need to define our operating model, understand our customer base, and produce insights about the business? What types of data do we need, and how often, and in what format? The response to each of these questions needs to be guided by the business strategy.”

Get your house in order

“Sort out your data first,” agrees Adams. Data quality remains the biggest challenge facing businesses looking to harness the power of analytics, AI and ML solutions. Without fast and easy access to the right kind of data, AI and analytics deployments will fail to deliver the transformation these technologies promise, and big data projects will fail to take off. Next, factor in data quantity. Effective analytics requires masses of data, but volume isn't a problem for most businesses: according to reports, the world has hit 59 zettabytes of data and is on course to hit 150 zettabytes in five years. The difficulty is in handling that volume, using and integrating it to provide the results an organisation needs. So check what you need before you throw it all into the mix. Thirdly, agility is essential. Data is amorphous, and it grows and expands as your business changes. In retail, for example, data volumes can be seasonal or influenced by promotions, so ensuring agility will help organisations handle spikes caused by real-time market changes or surges in transactions that generate increased streaming levels. You might say data is like water. Too much, and you drown; too little, and you thirst. When it’s old, stagnant or dirty, it can harm you. It’s the fresh, clean, healthy data that you need to thrive.
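To put the quoted growth figures in perspective, the short calculation below works out the compound annual growth rate implied by moving from 59 to 150 zettabytes in five years. It assumes steady year-on-year growth, which the reports cited above may not claim; the variable names are illustrative only.

```python
# Figures quoted in the article: 59 ZB now, 150 ZB in five years.
start_zb = 59.0
end_zb = 150.0
years = 5

# Compound annual growth rate implied by those two figures,
# assuming (as a simplification) the same growth rate every year.
annual_growth = (end_zb / start_zb) ** (1 / years) - 1

print(f"Implied annual growth: {annual_growth:.1%}")
```

On these assumptions, global data volume would be growing by roughly a fifth every year, which is why handling the volume, rather than finding it, is the harder problem.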

So what are the pain points to avoid? Chaos, crisis and confusion, according to Gary Alleman, MD, Master Data Management.

“The volume and velocity of change, not just of data, but of supporting technologies, can be overwhelming. Companies must ensure that data is accessible; this means ensuring that business users and data scientists label the data based on its use and trustworthiness. It’s crucial for companies to understand that big data should not be accessible only to a select few, such as the business intelligence team or data scientists. Every decision-maker should be able to find the data they need to do their jobs.”
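As a rough illustration of what labelling data by use and trustworthiness could look like in practice, here is a minimal sketch of a data catalogue that any decision-maker could search. Everything in it is hypothetical: the `Trust` levels, the `DatasetLabel` fields, the `find_datasets` helper and the sample data set names are assumptions made for the example, not a description of any vendor's product.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import FrozenSet, List


class Trust(IntEnum):
    """Hypothetical trust levels; higher means more trustworthy."""
    UNVERIFIED = 1
    REVIEWED = 2
    CERTIFIED = 3


@dataclass(frozen=True)
class DatasetLabel:
    """A catalogue entry: what the data set is for and how far to trust it."""
    name: str
    uses: FrozenSet[str]
    trust: Trust


def find_datasets(catalogue: List[DatasetLabel], use: str,
                  min_trust: Trust = Trust.REVIEWED) -> List[str]:
    """Return names of data sets labelled for `use` at or above `min_trust`."""
    return [d.name for d in catalogue
            if use in d.uses and d.trust >= min_trust]


# Illustrative catalogue entries (invented for this sketch).
catalogue = [
    DatasetLabel("sales_orders", frozenset({"finance", "marketing"}), Trust.CERTIFIED),
    DatasetLabel("web_clickstream", frozenset({"marketing"}), Trust.UNVERIFIED),
    DatasetLabel("delivery_notes", frozenset({"logistics"}), Trust.REVIEWED),
]

print(find_datasets(catalogue, "marketing"))
```

With the default threshold, a marketing user would see only the certified sales data; lowering `min_trust` would also surface the unverified clickstream data, with its label making the risk visible.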

The incorrect application of technology is another common obstacle to effective big data strategies, adds Du Preez. It doesn’t help to invest in the latest technology if that technology doesn’t support and enable the business strategy. A classic example of tech being used where it shouldn’t is when organisations use business planning and consolidation (BPC) software as a data entry tool instead of purely as a planning application. Trying to integrate that data into processes can be very error-prone. And since BPC tends to have the option for manual data input, there’s less validation of data and the quality can drop. Once that happens, it can undermine the overall data strategy, as it’s impossible to make good decisions with poor-quality data. It’s also vital that there’s a single copy of data. Many solutions replicate data, which can create syncing issues and make it nearly impossible to find the single source of truth. Copying and moving large data sets also takes time, which means you are slower at generating insights to guide your decisions, and that undermines your agility.

Speed of development

Another fundamental challenge is complexity, says Alleman. “We talk about volume, variety and velocity when defining big data. The technology supporting big data projects is also changing rapidly – for example, moving away from traditional relational database environments towards unstructured and new cloud data sources such as Snowflake, a cloud data platform that offers Data Warehouse-as-a-Service (DWaaS) and a cloud data lake in a single cloud-based solution for big data management needs. Another data source is Databricks, an open and unified data analytics platform for data engineering, data science, machine learning, and analytics. Companies need to view their architecture in a holistic manner, as new data sources and technology will emerge and need to be integrated with existing environments. It’s important not to be tied into a particular technology stack or platform, as technology changes so rapidly.

InterSystems SA’s Henry Adams believes you need to take a pragmatic and systematic approach to big data.
  • If you try to conduct a master data management strategy across all of your data in one go, it will be like eating an elephant in one sitting.
  • Ensure you get business buy-in or executive sponsorship – all the cogs in your engine need to know the power that analytics will bring them, and work together to ensure that the data strategy is adhered to.
  • Good data discipline is essential for teams to stick to, so agree on the data policies and implement them.
  • Legacy systems will remain part of your pain. Some systems will never speak the same language, so work on weaving the data into a fabric that does.
  • Consolidation and standardisation should be key to your data strategy. Without them, any dreams of harnessing data to perform big data analytics won't make it out of the starting blocks. 

“A real challenge is the speed of development, and this is where governance comes into play on both business and technology levels. The elephant in the room, of course, is the Protection of Personal Information Act. We want to make data accessible and deliver a better customer experience, but we need to protect the data. Data governance and data protection policies need to balance these two competing requirements.”

When it comes to identifying the most critical data sources that need to be measured, Du Preez says relevance and applicability determine the critical data sources. “Any data source containing data that can be used to measure the status and progress of your business strategy is relevant. Don’t be a data hoarder – instead, build for purpose by seeking out data sources that are necessary and supportive of achieving business outcomes. This data could be from core systems, or could be enrichment data from external sources such as social media, weather systems, or financial markets.”

This all depends on the nature of your business, says Nkuna. “A lab in KwaZulu-Natal tracking the genomic sequences of the Covid-19 virus requires a very different data source than a logistics company hauling goods from the Durban harbour to Joburg. Ultimately, big data analytics means bringing together data from multiple disparate sources, structured and unstructured. The importance of the source is linked to the nature of the question you’re asking of your data. You can’t expect to map customer satisfaction from only analysing delivery notes.”

Know your business, says Adams. “Understand what you want from your analytics. Then write that into your data management policies. Data for the sake of data won't help your analytics journey. It will just leave your data scientists and analysts having to rework and rewrite data models every time they want to build out a new application or look for insights from your data.”
