Forging ahead with governing data, despite COVID-19 national lockdown

By Sizwe Gwala, Data CoE: Data Governance Manager, Engineering Services; Group Services

Johannesburg, 25 Jun 2020

Data governance and data management are essential business practices that have so far had little success in corporate circles. The lack of buy-in can be attributed to factors such as low data literacy among business professionals in general. These practices have been, and continue to be, perceived as expensive undertakings run by technology teams, with little or no value to offer the business. Admittedly, there are few success stories to publicise that demonstrate their value. More effort is therefore still required from data governance professionals to dispel these negative perceptions and improve overall data literacy in companies.

Data governance is defined by the DAMA Association as: “The exercise of authority and control (planning, monitoring, and enforcement) over the management of data assets.” It refers to formalised structures intended to provide oversight over data deliverables. Data management, in turn, refers to the deliverables undertaken in translating raw data into usable information, the output being data that is accurate and trustworthy enough to address a problem statement. These practices have traditionally been seen as IT deliverables, which has created a siloed approach with little or no business involvement. However, deriving value from them depends largely on collaboration between business and technology teams, with clearly defined roles and responsibilities as set out in a company’s data governance policy.

The outbreak and continued intensification of the COVID-19 pandemic have called on companies to activate their respective business continuity management plans, mostly involving virtual collaboration, amid social distancing regulations. Although this was initially perceived as a temporary move, various scholars have since indicated that most companies will be operating in this manner for an extended period of time. Of note here is that success in collaborating virtually depends on three things: effective technical tools and equipment; high-quality data; and employee competence in operating remotely. The most important of the three is high-quality data, which is only achieved through effective data management practices. Data quality (DQ) management is defined by the DAMA Association as: “The planning, implementation, and control of activities that apply quality management techniques to data, in order to assure it is fit for consumption and meets the needs of data consumers.” It is concerned with improving the accuracy, validity, integrity, completeness, and trustworthiness of the data used to solve a problem statement.

When connecting remotely, employees are heavily dependent on the data at their disposal, which they accept as true and accurate, most often because there are limited avenues for verifying its quality. In most cases, however, data tends to have some quality issues, depending on how far a company has previously embraced data management practices. Given current data management adoption rates, it is possible that incorrect data is being used for business reporting and decision-making. This is now an opportune moment for companies to use the pandemic as a catalyst and forge ahead with remediating their data quality issues under a co-ordinated data management project. Such projects should be agile initiatives led by the central data office, with representation from business and technology. In being agile, companies will identify the areas with a pressing need for intervention and focus only on those data deliverables in a phased approach, with priority given to datasets that are common among business units.

Once focus areas have been prioritised, an ‘as-is’ DQ snapshot will be taken, and a report covering all DQ dimensions will be produced and submitted to the project team for ratification and delivery backlog sign-off. Once signed off, delivery will be executed in iterations, within two-week sprint cycles. Required stakeholders should form part of the team and ensure that they are readily available to carry out their obligations as required. DQ improvement will follow a two-tier approach: first, remediating existing data; and second, fixing newly received data at the point of capture, by circulating a set of standards to be observed by all employees when capturing data.
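
To make the ‘as-is’ DQ report described above more concrete, the following is a minimal Python sketch of how two of the dimensions, completeness and validity, might be measured per field. The field names, the validity rules and the sample records are all hypothetical illustrations and are not taken from any actual project; a real initiative would agree its rules with business stakeholders.

```python
import re
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[str]]

# Hypothetical validity rules, one per field (assumptions for illustration only).
RULES: Dict[str, Callable[[str], bool]] = {
    "surname": lambda v: bool(re.fullmatch(r"[A-Za-z' -]+", v)),
    "id_number": lambda v: bool(re.fullmatch(r"\d{13}", v)),
}


def dq_snapshot(records: List[Record]) -> Dict[str, Dict[str, float]]:
    """Return completeness and validity ratios for each field in RULES."""
    report: Dict[str, Dict[str, float]] = {}
    total = len(records)
    for field_name, rule in RULES.items():
        # Completeness: share of records where the field is populated.
        present = [r[field_name] for r in records if r.get(field_name) not in (None, "")]
        # Validity: share of populated values that pass the field's rule.
        valid = [v for v in present if rule(v)]
        report[field_name] = {
            "completeness": len(present) / total if total else 0.0,
            "validity": len(valid) / len(present) if present else 0.0,
        }
    return report


# Made-up sample records, not real personal data.
sample = [
    {"surname": "Dlamini", "id_number": "9001015009087"},
    {"surname": "Nk0si", "id_number": ""},
    {"surname": None, "id_number": "12345"},
    {"surname": "Van der Merwe", "id_number": "8505050000000"},
]
snapshot = dq_snapshot(sample)
```

A report like this, produced per dataset before and after each sprint, would give the project team a simple baseline against which to measure remediation progress.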

The team will then deploy a code-free, self-service DQ improvement service to be used as a portal for capturing and amending DQ rules. With this service, business stakeholders will be encouraged to log all DQ issues they experience while working remotely, which will then be picked up by the project team as focus areas. This channel can also be used by business data consumers to initiate any DQ improvement exercise. For example, a consultant can initiate the remediation of an incorrectly spelled surname, which will then be ratified by the project team.
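
Although the service itself is meant to be code-free for its users, the log-then-ratify workflow behind it can be sketched in a few lines of Python. This is only an illustrative data model under stated assumptions: the class names, fields and status values are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DQIssue:
    """A single data quality issue logged by a business data consumer."""
    dataset: str
    field_name: str
    current_value: str
    proposed_value: str
    logged_by: str
    status: str = "logged"  # hypothetical states: logged, ratified, rejected


@dataclass
class DQIssueLog:
    """In-memory stand-in for the portal's issue store."""
    issues: List[DQIssue] = field(default_factory=list)

    def log(self, issue: DQIssue) -> int:
        """Record an issue and return its identifier (its list position)."""
        self.issues.append(issue)
        return len(self.issues) - 1

    def ratify(self, issue_id: int, approved: bool) -> DQIssue:
        """Project-team review: approve or reject a logged issue exactly once."""
        issue = self.issues[issue_id]
        if issue.status != "logged":
            raise ValueError("issue has already been reviewed")
        issue.status = "ratified" if approved else "rejected"
        return issue

    def backlog(self) -> List[DQIssue]:
        """Issues still awaiting review by the project team."""
        return [i for i in self.issues if i.status == "logged"]


# A consultant logs a misspelled surname; the project team then ratifies the fix.
log = DQIssueLog()
issue_id = log.log(DQIssue("customers", "surname", "Dlamni", "Dlamini", "consultant"))
pending_before = len(log.backlog())
reviewed = log.ratify(issue_id, approved=True)
```

Keeping ratification as a separate step from logging reflects the point made above: any data consumer may raise an issue, but only the project team decides whether the change is applied.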

The company will furthermore use this opportunity to refresh its inventory of external strategic data partnerships. In this exercise, the company will first maximise all its internal data sourcing capabilities. In the event that additional data is still required, it will consult external data enrichment organisations, with preference given to direct database subscription services. The data categories to be sourced externally will be: name, surname and ID number; address information; financial information; academic qualifications; and criminal records. These external sources will then be connected directly to the master data repository, with business rules built therein and positioned as the most trusted data source.

Although the project will focus largely on DQ improvement, other data management initiatives will be carried out in the process. The project team will be responsible for capturing metadata that records the changes made to datasets. It will also create and maintain a business data glossary, providing explanations of new and existing data terms for use by data consumers. Finally, the team will roll out a mini data literacy programme in the form of a video, to be distributed to all employees through a channel preferred by the company. The literacy initiative will focus on contemporary issues and propose practical lessons.
