TransUnion's director of analytics to share ML case study
Building a data science environment or team is hard, and organisations must be careful of biting off more than they can chew.
So says Frans Potgieter, director of analytics at TransUnion Credit Bureau, who will be presenting a case study on ‘Assessing the enablers to an effective machine learning (ML) model’ at ITWeb Artificial Intelligence 2019, to be held on 20 August at The Forum in Bryanston.
He advises businesses to start with proven, achievable approaches and build from there.
“Don’t let perfect be the enemy of good. ML algorithms can provide value even if they’re not the core solution. For example, they can be employed in interim steps in the model development process like variable reduction, or reject inference modelling.
“We had one customer who was trying to build distinct models for hundreds of portfolio segments, and retrain them every month. This customer was unable to get off the ground because the approach was unrealistic.”
ML algorithms can reduce the number of hours and human resources needed in the modelling process, but they require knowledgeable data scientists to employ them correctly, he notes. Poorly trained ML models, however, may fail to outperform traditional predictive techniques such as logistic regression.
Hyper-parameter tuning is the process of selecting a set of optimal hyper-parameters for a given learning algorithm. Potgieter says it is an important, nuanced and computationally intensive step. It isn’t easy to do, but it must be done properly to avoid underfitting, where a model is too simple (for example, it has too few parameters relative to the size of the sample) to capture the patterns in the data.
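As a rough illustration of what hyper-parameter tuning involves, the sketch below runs a simple grid search in plain Python. It is not taken from Potgieter's case study: the k-nearest-neighbours classifier, the synthetic data, the candidate values and every function name are assumptions chosen only to keep the example small. The hyper-parameter here is k, the number of neighbours consulted; a very large k tends to produce a model that is too simple (underfitting), while a very small k tends to chase noise.

```python
import random

def make_data(n, rng):
    """Synthetic 1-D data (an assumption for illustration): the label is 1
    when x > 0.5, and 15% of labels are flipped to act as noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.15:
            y = 1 - y
        data.append((x, y))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(y for _, y in nearest)
    return int(2 * votes > k)

def accuracy(train, held_out, k):
    """Share of held-out points whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in held_out)
    return hits / len(held_out)

def tune_k(train, validation, candidates):
    """Grid search: score every candidate k on the validation set
    and keep the one with the highest accuracy."""
    scores = {k: accuracy(train, validation, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores

rng = random.Random(0)            # fixed seed so runs are repeatable
train = make_data(200, rng)
validation = make_data(100, rng)  # held out from training, used only for tuning
candidates = [1, 5, 15, 45, 199]  # odd values avoid tied votes

best_k, scores = tune_k(train, validation, candidates)
print("validation accuracy per k:", scores)
print("selected k:", best_k)
```

The cost grows with the number of candidate values and the size of the data, which is one reason tuning is computationally intensive; with several hyper-parameters the grid multiplies. In practice a separate test set would be kept aside to check the chosen setting, since the validation score itself becomes optimistic once it has been used for selection.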
In his opinion, good data is more important than better algorithms, and he advises organisations to focus on getting the data foundation right before building models on top of it.
There are several enablers to an effective ML model, according to Potgieter: good data and a solid understanding of that data; an in-depth comprehension of the ML algorithms used, including how to tune hyper-parameters and how to document and implement the resulting solutions; a tested method of implementing solutions at scale; compliance and business review; and, finally, a data science environment and team.
Delegates attending Potgieter's talk will learn how to evaluate algorithms according to key dimensions before deciding to use a new ML algorithm.