Fishing for facts
Getting qualified answers from a vast, unstructured data lake demands a blended data approach to analytics.
As data proliferates in a multitude of formats, it is becoming increasingly difficult to harness that data in a way that is truly useful to business.
The beauty of the always-on, all-connected world is the massive volumes of data being generated all the time, with the potential to inform more accurate, forward-looking business decision-making.
These zettabytes of high-velocity, complex and variable data in circulation promise to hold game-changing insights, and every forward-looking company is looking for solutions to help it harness and monetise them. Forrester expects the global big data management solutions market to grow at a compound annual growth rate (CAGR) of 12.8% over the 2016 to 2021 period, while IDC forecasts a 50% increase in revenue from the sale of big data and business analytics software, hardware and services between 2015 and 2019.
Overall, Gartner puts the worldwide business intelligence (BI) and analytics market at $16.9 billion this year. Clearly, companies are taking the potential of their data seriously.
But, by its very nature, the big data that holds undiscovered trends, patterns and insights is ever-changing, and in order to act on it when it is relevant, the business must move quickly.
Mix it up
This means in many areas of the business, there is no time for the BI approach to analytics, whereby data is qualified and information verified before reports can be compiled and decisions made. Now, acting on insights from real-time data demands a different approach. This is why there is a resurgence of the blended data approach to verifying information.
Data blending brings together different versions of the truth from a multitude of sources in a variety of formats. These might include spreadsheets, social media feeds, news, external research, photos, video and audio. Where the data overlaps and corresponds, there is a high probability it is accurate and it can therefore be used to support decision-making, even if it has not undergone traditional research and quality control processes.
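To make the idea of overlap concrete, here is a minimal Python sketch of scoring claims by how many sources report them. The source names, the claim labels and the `corroborate` helper are all hypothetical, and the score is simply the share of sources that agree, which is a rough proxy for confidence rather than a true probability.

```python
from collections import defaultdict


def corroborate(claims_by_source, min_sources=2):
    """Score each claim by the share of sources that report it.

    Claims reported by fewer than `min_sources` sources are dropped.
    The score is a simple agreement ratio, not a calibrated probability.
    """
    supporters = defaultdict(set)
    for source, claims in claims_by_source.items():
        for claim in claims:
            supporters[claim].add(source)

    total_sources = len(claims_by_source)
    return {
        claim: len(sources) / total_sources
        for claim, sources in supporters.items()
        if len(sources) >= min_sources
    }


# Hypothetical inputs: three sources in different formats, already reduced
# to comparable claim labels (that reduction is assumed, not shown).
claims_by_source = {
    "spreadsheet": {"store_12_sales_up", "store_7_stock_low"},
    "social_media": {"store_12_sales_up", "new_competitor_nearby"},
    "news_feed": {"store_12_sales_up", "new_competitor_nearby"},
}

scores = corroborate(claims_by_source)
```

In this sketch, a claim that appears in only one source is left out of the result, while a claim that all three sources agree on gets the highest score.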
Data blending is becoming more important for businesses needing to take a cross-informed approach to strategic decision-making. They are looking to what amounts to mashups of data, blended for a distinct purpose, in order to quickly come up with answers that are highly probable.
In its simplest form, data blending might be an exercise undertaken by an individual who sees one piece of information, but cross-checks it against a variety of sources to determine whether that information is likely to be true. Another example would be a bank using more than one source - including its own records and a confirmation call to a customer - to verify that a particular transaction has taken place. This blended approach is not necessarily a new one, but in the past, there were limited sources of additional data available; now, the Internet, social media and Internet of things (IoT) devices produce billions of inputs to qualify the answers.
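The bank example can be sketched in a few lines of Python. Everything here is illustrative: the `Evidence` record, the `is_verified` function, the source names and the rule that two agreeing sources are enough are assumptions made for the sketch, not a description of how any bank actually verifies transactions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str            # where the confirmation came from
    transaction_id: str
    amount_cents: int


def is_verified(transaction_id, amount_cents, evidence, required_sources=2):
    """Treat a transaction as verified when enough distinct sources agree."""
    agreeing_sources = {
        item.source
        for item in evidence
        if item.transaction_id == transaction_id
        and item.amount_cents == amount_cents
    }
    return len(agreeing_sources) >= required_sources


# Hypothetical evidence: the bank's own ledger and a confirmation call agree
# on transaction TX-1001; only the ledger knows about TX-1002.
evidence = [
    Evidence("bank_ledger", "TX-1001", 250_000),
    Evidence("customer_call", "TX-1001", 250_000),
    Evidence("bank_ledger", "TX-1002", 99_900),
]

tx_1001_ok = is_verified("TX-1001", 250_000, evidence)
tx_1002_ok = is_verified("TX-1002", 99_900, evidence)
```

Using a set of source names means repeated confirmations from the same source count only once, which matches the point that the value lies in independent sources agreeing.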
In a big data environment, cross-checking and comparing data with information from a multitude of sources becomes more complex. It must be driven by data scientists who are equipped to ask informed 'what if' questions and create analytics models to explore all potential scenarios. Big data analytics looks at undetermined patterns, and the 'what if' analysis is qualified by blended big data. Therefore, even if the data has not gone through traditional quality controls, a potential truth now has any number of pieces of information to support it, which gives decision-makers a measure of confidence in its accuracy.
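One simple way to see why more supporting pieces of information raise confidence is to combine per-source reliabilities. The sketch below assumes each source has a known, independent chance of being right, which is a strong simplification, and computes the chance that at least one of the agreeing sources is right. The function name and the reliability figures are hypothetical.

```python
import math


def chance_at_least_one_right(reliabilities):
    """Probability that at least one agreeing source is correct.

    Assumes each value is an independent probability between 0 and 1.
    Real sources are rarely independent, so treat the result as an
    optimistic upper bound rather than a measured accuracy.
    """
    for r in reliabilities:
        if not 0.0 <= r <= 1.0:
            raise ValueError("each reliability must be between 0 and 1")
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


# Hypothetical reliabilities for one, two and three agreeing sources.
one_source = chance_at_least_one_right([0.6])
two_sources = chance_at_least_one_right([0.6, 0.7])
three_sources = chance_at_least_one_right([0.6, 0.7, 0.5])
```

Each additional agreeing source increases the combined figure, which is the intuition behind trusting blended data that has not gone through traditional quality controls. How much it should increase in practice depends on how independent the sources really are.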
This blended data approach is again coming to the fore, thanks to advances in big data analytics technology, unprecedented levels of processing power, the scalability of enterprise systems and the information explosion, which gives businesses access to billions of data sources to better support their fact-finding.
Companies that are not yet doing so should consider this approach if they fall within the ambit of big data analytics.
For qualified answers quickly, companies need to blend their data and match the probabilities of different references coming into their research. The traditional approach, whereby data must be qualified and information verified, incurs additional overheads and quite simply wastes time.
With blending, which does not necessarily have these overheads and time constraints, the complementary pieces of data can be cross-checked on the fly. This approach is faster and allows for agile and cross-informed decision-making with a high probability of accuracy.
Data blending is not a silver bullet for all analytics-based decision-making and reporting, but it does present a viable way for business to seek quick insights to support agile responses in a rapidly changing environment.
Mervyn Mooi is a director of Knowledge Integration Dynamics (KID), and also a key resource within the company's information management, data warehousing and business intelligence teams. He has been in the IT industry for 36 years, beginning his career as an operator at the CICS bureau in Johannesburg in the early 1980s. Thereafter, he was appointed as a programmer at state-owned oil exploration and production company SOEKOR. In 1986, Mooi joined Anglo American's head office IT department where he remained for almost 12 years. Here he progressed to become a senior programmer, analyst, database administrator and technical support specialist. After completing his degree in informatics, he then left to join Software Futures, where he worked as a senior consultant for 18 months in the data warehousing and business intelligence arena. Mooi joined KID in 1999 as a data warehouse and business intelligence specialist. Mooi's experience in ICT disciplines includes operations, business and systems analysis, application development, database administration, data governance/management, data architecture/modelling, production application and systems software support, data warehousing and business intelligence. He now focuses on enterprise information management, information governance and cloud solutions.