Data quality framework: Necessity or discretionary practice?
Companies that have a functional data quality framework are more likely to be at an advanced maturity level with regard to analytical capabilities.
Three decades ago, organisations could probably get away with not having an abundance of clean, reliable and accurate data, and driving the business purely from an operational perspective.
The adoption of business intelligence and analytics, however, has revolutionised ways of working in the past few decades. Organisations now need to be more tactical and strategic, especially in saturated markets, where customer centricity, satisfaction and experience are the core drivers for customer acquisition and customer retention today.
Let’s have a look at a practical example: Telecommunications service providers can no longer assume that customers will want to recharge as soon as their airtime is depleted after making a call. The caller may be more of a data user, with a smartphone, and currently in a location that has free WiFi. The receiver may also have a smartphone, but have no data and not be in a free WiFi zone. So, how does the service provider collect and combine this information in order to offer the receiver a data package that will enable the caller and receiver to continue conversing?
If you are thinking along the lines of data engineering pipelines, streaming analytics, analytical models and actionable intelligence, you are on the right track! However, apart from the heavy lifting done by data engineering and data science practices, there is a fundamental component that is often overlooked: data quality.
In the above-mentioned use case, the service provider makes use of a combination of internal (subscriber behaviour “data user” and location) and external (handset data “smartphone” and WiFi zone) data sources. If any of the data points needed to decide on the next best offer are incorrect, or the data is not readily available, the service provider might not be able to capitalise on the available opportunity.
A data quality framework is an industry-developed best practice guide for monitoring, maintaining and maturing the quality of the data a business has. It spans the entire data life cycle, from data creation and acquisition, through transformation and preservation, to data access and use. The aim is to continuously improve the quality of data by measuring, monitoring and controlling or resolving identified data issues.
A data quality framework comprises many data quality dimensions, namely (to name a few):
- Accuracy – is the data correctly representing what transpired?
- Currency – are the values up to date?
- Completeness – are all the required fields populated? Are any data records missing?
- Consistency – are values across datasets consistent?
- Conformity – do the values conform to the expected format?
- Privacy – are the necessary controls implemented for access management and usage monitoring?
- Reasonableness – is the data sensible within an operational context?
- Referential integrity – are the relationships between objects/entities intact?
- Uniqueness – are there any duplicated records?
- Timeliness – does the age of data meet the user requirement in terms of availability and accessibility?
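Several of these dimensions translate fairly directly into automated checks. The following is a minimal sketch in Python, not taken from any particular data quality tool. The subscriber records, the field names (`msisdn`, `handset_type`, `last_recharge`) and the number format are all hypothetical, chosen only to echo the telecommunications example. Each function returns the share of records that pass one check.

```python
import re
from collections import Counter

# Hypothetical subscriber records, used purely for illustration.
RECORDS = [
    {"msisdn": "27821234567", "handset_type": "smartphone", "last_recharge": "2024-01-15"},
    {"msisdn": "27821234567", "handset_type": "smartphone", "last_recharge": "2024-01-15"},
    {"msisdn": "2782-000", "handset_type": None, "last_recharge": "2024-02-01"},
    {"msisdn": "27839876543", "handset_type": "feature phone", "last_recharge": "2024-02-10"},
]

REQUIRED_FIELDS = ("msisdn", "handset_type", "last_recharge")
# Assumed format: country code 27 followed by nine digits.
MSISDN_PATTERN = re.compile(r"27\d{9}")


def completeness(records):
    """Share of records in which every required field is populated."""
    passed = sum(
        all(r.get(field) not in (None, "") for field in REQUIRED_FIELDS)
        for r in records
    )
    return passed / len(records)


def conformity(records):
    """Share of records whose msisdn matches the expected format."""
    passed = sum(
        bool(MSISDN_PATTERN.fullmatch(r.get("msisdn") or "")) for r in records
    )
    return passed / len(records)


def uniqueness(records):
    """Share of records whose msisdn appears exactly once in the dataset."""
    counts = Counter(r.get("msisdn") for r in records)
    passed = sum(counts[r.get("msisdn")] == 1 for r in records)
    return passed / len(records)


if __name__ == "__main__":
    for check in (completeness, conformity, uniqueness):
        print(check.__name__, check(RECORDS))
```

Dimensions such as accuracy or reasonableness usually need a trusted reference source or business rules agreed with data owners, so they are harder to reduce to a one-line check like these.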
Data quality dimensions roll up to a data quality index, which gives the organisation an enterprise view of its overall data quality. Robust data quality framework engines allow organisations to specify which data elements are critical and to define quality thresholds for them. When data falls short of a threshold, it can still be pushed into the “system” for further processing, with an alert raised to caution end-users or third-party tools about the issue that has been identified. Alternatively, the record can be rejected and stored in an error bucket, depending on the criticality of the data element that is not in line with the agreed standard of quality.
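As a rough illustration of the roll-up and threshold logic described here, the sketch below computes a weighted index from per-dimension scores and decides what to do with data that misses a threshold. The weights, the threshold value and the function names are assumptions made for this example; in practice they would be agreed between data owners and data consumers.

```python
# Hypothetical weights per dimension; real values are an organisational decision.
WEIGHTS = {"completeness": 0.5, "conformity": 0.3, "uniqueness": 0.2}


def quality_index(scores, weights=WEIGHTS):
    """Weighted average of per-dimension scores, each between 0 and 1."""
    total_weight = sum(weights.values())
    return sum(scores[dim] * w for dim, w in weights.items()) / total_weight


def route(score, threshold, critical):
    """Decide how to handle data given its score and the element's criticality.

    - At or above the threshold: accept as normal.
    - Below the threshold, critical element: reject into an error bucket.
    - Below the threshold, non-critical element: accept, but raise an alert.
    """
    if score >= threshold:
        return "accept"
    if critical:
        return "reject_to_error_bucket"
    return "accept_with_alert"


if __name__ == "__main__":
    scores = {"completeness": 0.75, "conformity": 0.75, "uniqueness": 0.5}
    index = quality_index(scores)
    print(round(index, 2), route(index, threshold=0.9, critical=False))
```

A single threshold per element keeps the sketch short; a real engine might use separate warning and rejection thresholds, or different thresholds per dimension.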
Implementing a successful data quality framework entails:
- Understanding the organisation’s data and its intended use.
- Identifying key attributes or data elements that are critical for the organisation.
- Understanding how the quality impacts analysis and decision-making for data consumers.
- Identifying data owners and data stewards.
- Collaboration between data owners and data consumers regarding the critical success factors.
- Defining service level agreements between IT, data owners and data consumers on resolution times for identified data issues, based on agreed categories and severity.
- Implementing a data quality management tool that can inspect the data at various points in the data life cycle, across departments or subject areas.
- Ensuring the data quality management tool can report (in the form of dashboards), alert and integrate with the organisation’s incident/problem management tool, to automatically raise incidents to the appropriate teams when a data issue is identified.
- Finally, a culture that cultivates collaboration and a consensual strategic vision throughout the organisation that data as an asset is pivotal to the company’s success.
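The service level agreement and incident-raising steps in the list above can be pictured as a simple lookup from severity to resolution time. The sketch below is illustrative only: the severity names, the hours and the `Incident` structure are hypothetical, and a real deployment would hand the incident to the organisation’s incident/problem management tool through that tool’s own interface, which is not shown here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical resolution times (in hours) per severity, as an SLA might define them.
SLA_HOURS = {"critical": 4, "major": 24, "minor": 72}


@dataclass
class Incident:
    data_element: str
    severity: str
    owner: str
    raised_at: datetime
    due_at: datetime


def raise_incident(data_element, severity, owner, raised_at):
    """Build an incident whose due time follows from the agreed severity."""
    if severity not in SLA_HOURS:
        raise ValueError(f"unknown severity: {severity!r}")
    due_at = raised_at + timedelta(hours=SLA_HOURS[severity])
    return Incident(data_element, severity, owner, raised_at, due_at)


if __name__ == "__main__":
    incident = raise_incident(
        "msisdn", "critical", "billing data owner", datetime(2024, 1, 1, 8, 0)
    )
    print(incident.due_at)
```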
The benefits of implementing a successful data quality framework include, but are not limited to: accurate data available for strategic and critical business decisions; reduction of operational risks; increased customer loyalty; and enhanced marketing intelligence that reduces marketing costs by targeting only those customers that need to be targeted. It also builds trust within the organisation, so that internal users don’t seek or build data stores in silos (outside of the framework).
Data-driven organisations thrive on using data and analytics to gain a competitive edge. The data used in decision-making or in analytical (descriptive, predictive, prescriptive, etc.) models must be of a quality appropriate to its intended use. Data quality has a significant impact on the analytics maturity level of an organisation. Organisations that have an established and functional data quality framework are more likely to be at an advanced maturity level with regard to analytical capabilities than organisations that do not. Data quality is one of the criteria measured in the TDWI Analytics Maturity Assessment Model.
Viewing a data quality framework as a discretionary practice could be counterproductive to your organisation if your strategy is leaning towards being a data-driven company.
Windsor Gumede is principal consultant at PBT Group. He is a self-motivated, results-driven principal BI consultant with 10 years’ experience in data and analytics. Gumede has worked on numerous data and analytics projects in Africa and the Middle East. Throughout his career, he has played different roles, from ETL/ELT development, to data modelling, front-end development, solution architecture and design, to pre-sales consulting. The majority of his experience comes from the telecommunications industry, but he is currently maturing his knowledge in the insurance space using big data technologies to help insurance clients comply with regulatory requirements.
Gumede is a strong believer in the core fundamentals of enterprise data management. “I see a huge gap in South Africa with technical resources that have skills in the big data engineering field but don’t have the proper grounding on enterprise data management principles. Skills on tools and technology without the literature is ineffectual.”