Date: 2022-12-15

If you were building a data model tomorrow, would you use pre-pandemic data? In some cases, those data are still applicable. But in many cases, COVID has changed so much about our lives and world that using pre-COVID data would be foolish.

Take, for example, a model whose goal is to identify warning signs for mental health concerns. Given that those warning signs have changed since the onset of COVID, a model trained on old, stale data is unlikely to be the most accurate one.

That's one real-life example of how important it is to build artificial intelligence (AI) models with relevant, timely data.

Implementing ethical AI practices and using accurate, thoroughly tested models helps to avoid costly mistakes and life-changing impacts.

In my first column on this topic, I spoke about AI bias. CIO magazine reveals that bias has become so inherent in AI models that companies are bringing in a new C-level executive − the chief ethics officer − who is tasked with navigating the ethical implications of AI. Salesforce, Airbnb and Fidelity are cited as corporates that have ethics officers in place, with more enterprises expected to follow suit.

Forrester research notes it's abundantly clear that if we are not extremely careful, AI will identify and exploit harmful biases in training data. AI-based discrimination − even if it's unintentional − can have dire regulatory, reputational and revenue impacts.

Forrester highlights that, knowing this, many companies, governments and NGOs have adopted 'fairness' as a core principle for ethical AI. But it cautions that espousing this principle is one thing, and turning it into a set of consistently practised, enterprise-wide policies and processes is quite another.

The Forrester report says the first issue plaguing companies is that there are over 20 different mathematical representations of fairness. Broadly speaking, AI fairness criteria fall into two camps:

Accuracy-based criteria that optimise equality of treatment. These criteria compare different measurements of a model's accuracy, such as the true positive or false positive rate.

Representation-based criteria that optimise equity of outcome. These metrics theoretically correct for historical inequities by ensuring equitable outcomes across groups.
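
To make the two camps concrete, here is a minimal sketch in Python. It is an illustration only, not Forrester's method: the function name `group_rates`, the groups "A" and "B" and the toy records are all hypothetical. For each group it computes the true positive and false positive rates (the kind of accuracy measurement the first camp compares) and the selection rate, meaning the share of people who receive a positive prediction (the kind of outcome the second camp compares).

```python
from collections import defaultdict


def group_rates(records):
    """Per-group rates from (group, actual, predicted) triples with 0/1 labels."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for group, actual, predicted in records:
        if actual == 1:
            key = "tp" if predicted == 1 else "fn"
        else:
            key = "fp" if predicted == 1 else "tn"
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]  # people whose actual label is 1
        negatives = c["fp"] + c["tn"]  # people whose actual label is 0
        total = positives + negatives
        rates[group] = {
            # Accuracy-based view: is the model equally right for each group?
            "true_positive_rate": c["tp"] / positives if positives else None,
            "false_positive_rate": c["fp"] / negatives if negatives else None,
            # Representation-based view: does each group get a similar share
            # of positive outcomes?
            "selection_rate": (c["tp"] + c["fp"]) / total,
        }
    return rates


# Hypothetical toy data: (group, actual outcome, model prediction)
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 1), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 0, 0), ("B", 0, 0),
]
rates = group_rates(records)
for group in sorted(rates):
    print(group, rates[group])
```

In this toy data, group A has a higher true positive rate and a higher selection rate than group B, so the model would raise questions under either camp. On real data the two views can disagree, which is why the choice of criterion matters.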

According to Forrester, the burning question is which criteria a company should employ for a given use case.

AI needs to be trained if it is to produce a model that inherently has good habits and processes, and that will deliver the desired outcome − in an ethical manner. Training AI involves inculcating the right habits to achieve the desired outcomes.

The following are recommendations that aim to avoid bad habits in AI models:

Training AI models relies on data sets, models and algorithms, so the first thing to examine is data bias and how to avoid it.

Eliminating data bias or removing as much of it as possible is crucial when building and incorporating AI methodologies. Humans play a critical role in training AI through setting data parameters and filtering them.

It is important to assess AI parameters to ensure that the technologists building the AI algorithm are not introducing unconscious bias into the data process.

People create AI models, feed them, train them and ultimately interpret the data − all these actions may be influenced by personal beliefs, familial upbringing, demographics or environmental factors.

To avoid this bad habit, involve other subject matter experts and people embedded in the processes from the beginning; they can provide feedback and alternative ways of thinking that help to eliminate unconscious bias.

Models are becoming increasingly complex with thousands, if not hundreds of thousands, of variables, many of which may introduce bias in non-obvious ways. This is why the practice of ethical AI is so important in implementing organisational data models.

The next important thing to do is to avoid using stale data. As I noted earlier, it is critical to use relevant data to create and train technology, so the interpretation of that data can generate the best possible outcome.

The data used must be timely to be relevant and useful. If data is old or irrelevant, the model will not achieve the desired outcome and may produce ineffective or useless results.
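
One simple way to act on this is to filter training records by collection date before any training happens. The sketch below assumes each record carries a hypothetical `collected_on` field. The cutoff of 11 March 2020, the day the WHO declared COVID-19 a pandemic, is only an illustrative choice, and the right cutoff depends on the use case.

```python
from datetime import date


def filter_recent(records, cutoff):
    """Keep only records collected on or after the cutoff date."""
    return [r for r in records if r["collected_on"] >= cutoff]


# Hypothetical records; the field name 'collected_on' is an assumption.
records = [
    {"id": 1, "collected_on": date(2019, 6, 1)},
    {"id": 2, "collected_on": date(2020, 2, 15)},
    {"id": 3, "collected_on": date(2021, 9, 30)},
    {"id": 4, "collected_on": date(2022, 11, 1)},
]

# Illustrative cutoff: the date the WHO declared COVID-19 a pandemic.
cutoff = date(2020, 3, 11)

recent = filter_recent(records, cutoff)
stale_share = 1 - len(recent) / len(records)

print([r["id"] for r in recent])           # ids of records kept for training
print(f"{stale_share:.0%} of records are older than the cutoff")
```

Reporting the share of records that were dropped is worth the extra line: if most of the data set turns out to be stale, that is a signal to collect more recent data instead of training on what little remains.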

Finally, use an adequate sample size: repetition is key. In other words, a large sample of data and results is needed. Humans can recognise the context of a decision and apply that context in new scenarios much more easily (or at least faster) than current algorithms, so an algorithm needs many more examples to learn the same pattern.

When AI is trained adequately on sufficient data, the resulting models are far more likely to yield good results.
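
A rough way to see why sample size matters is the standard error of an estimated proportion, sqrt(p(1 − p) / n), which shrinks as the number of examples n grows. The sketch below is a simplification: it assumes independent examples and an underlying rate of 0.5, and the sample sizes are arbitrary.

```python
import math


def standard_error(p, n):
    """Standard error of a proportion p estimated from n independent examples."""
    return math.sqrt(p * (1 - p) / n)


# Assumed underlying rate and arbitrary sample sizes, for illustration only.
p = 0.5
errors = {n: standard_error(p, n) for n in (100, 1_000, 10_000)}

for n, se in errors.items():
    print(f"n={n:>6}: standard error = {se:.4f}")
```

Multiplying the sample size by 100 cuts the uncertainty only by a factor of 10, which is one reason "adequate" often means far more data than intuition suggests.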

At the risk of repetition, I cannot emphasise enough that what is needed is to strive to build a model that incorporates the latest, most relevant data, which has been stripped of as much bias as possible. Only in this way will a company achieve outcomes that best suit and support its business needs.