It sounds like a simple equation. Banks need customers, and to understand those customers better, they need data. And to analyse that data and run their systems, they need the cloud. Cloud is now the ticket to the game in financial services, letting these companies run services they would find almost impossible to deliver from their own datacentres. Financial services businesses are now among the largest users of cloud services, but, as AWS CEO Matt Garman says, it wasn’t always this way.
He remembers that, when AWS was still a fledgling company, its executives travelled to New York to visit some of the banks, which had expressed interest in “what this whole cloud computing thing was about”.
“We sat down with them, and outlined our vision about how cloud computing could change how they run their IT and technology. They told us that it was unlikely that any of their production workloads were ever going to run in the cloud. They were very diligent, and gave us a whole list of reasons; there’s compliance [they said], there’s audit, there’s regulatory, there’s security, there’s encryption. They said they’d love to [move to the cloud] and it was compelling technology, but they were probably never going to run in the cloud.”
Choosing the right mix isn't about public vs private. It's about the right workload and the right environment to get the right outcome.
Richard Vester, iOCO
Garman says it would have been easy for AWS to turn away and concentrate on other customers, but it then spent the better part of the next decade working on the concerns of those banks in New York, many of which have since become customers.
But while most banks in the world now use cloud services, levels of adoption differ, particularly at older banks still running on-prem mainframes. These banks are also more likely to run their core banking systems in-house. An exception is TymeBank, which uses the cloud-native Mambu platform, running on AWS.
Most banks in South Africa seem to be following a hybrid approach, layering public cloud over existing on-prem systems. Other than Tyme, most core banking platforms remain on mainframes, while customer-facing workloads are being moved to AWS, Microsoft Azure and Google Cloud Platform (GCP). Absa and Standard Bank also use Salesforce, with Absa focusing on customer service and product sales, and Standard Bank using it to manage customer data. Oracle has a financial services footprint in Africa, and Standard Bank has partnered with Alibaba to support trade with China.
“Choosing the right mix isn't about public versus private,” says Richard Vester, chief executive of cloud at iOCO. “It's about the right workload and the right environment to get the right outcome.” That is easier said than done: bank architects are spoiled for choice about where to put their workloads, and will have to test each hyperscaler’s offering carefully to see what works best in their business.

Among all the country’s banks, Capitec stands out as a singular success story. It now has the largest number of customers, at around 24mn, or 60% of the country’s adult population, and in late August it became the continent’s most valuable bank by market cap at R424bn, passing FirstRand. From its origin as a microlender, it has grown its customer base into higher LSMs and rolled out more products. Anecdotally, it’s also a lot easier to take decisions at Capitec, particularly around technology, than it is at the behemoth banks.
If you use something like [Amazon] Block Express, you can pay half a million dollars for your storage a year.
Andrew Baker, Capitec
Capitec was the only bank invited to the keynote at AWS’ annual summit in Sandton in August. It has invested heavily in AWS, with a data lakehouse that processes over 1.5trn records per month, giving the bank around 27TB of analytical data. Collectively, its customers make around 15 000 card payments every minute. It also uses AWS’ machine learning service SageMaker and its data warehouse service Redshift.
Capitec’s CTO Andy Baker worked at AWS for a short stint in 2022 as director of engineering for EC2, after six years as Absa’s CTO and, before that, a decade at Barclays. He says some workloads don’t make financial sense in the cloud, such as high-I/O databases and legacy SQL databases. “If you use something like [Amazon] Block Express, you can pay half a million dollars for your storage a year,” he says.
Storage in the cloud can be expensive, but so is maintaining large datacentres. He says another bank, which he doesn’t name, had almost 20 000 square metres of datacentre space, an approach he has no interest in replicating.
For him, the priority is speed, and Capitec has grown at a remarkable pace in its 25 years of existence. Scaling quickly requires elasticity, and the cloud makes that possible. “There's absolutely no advantage in taking what you have on-premise and dropping it in the cloud, and actually it’s quite an expensive thing to do,” he says, before conceding, aware that it sounds like a contradiction, that this is exactly what Capitec did.
Capitec’s problem was that its production datacentre and its disaster recovery (DR) datacentre were 30 milliseconds apart, one in Joburg and the other in Cape Town. “So that means if we want to go to DR, and one product fails, I have to move everything to that datacentre, and that’s a lot of downtime. Our business imperative, to shunt everything into AWS, is that we wanted three datacentres right next to each other.” This is what it found in the Amazon region in Cape Town. He says the bank believes in N+1 redundancy, meaning every system has at least one backup component.
DR is not good enough in this day and age, he says. “You need to be able to lose a datacentre and still have resilience.”
The reason it chose AWS is that it’s in Cape Town, near Capitec’s physical datacentre, which “solves the physics problem”. He also prefers its redundancy model. “The way Microsoft works is if it fails in Joburg, it fails [over] to Cape Town. We wouldn’t tolerate that; that wouldn’t be an acceptable outcome for us. If Azure goes down in Isando, its DR is Cape Town. That’s not a good place [to be]. Thirty milliseconds is going to break apart a lot of services. We want three availability zones right next to each other, five milliseconds, maximum, apart.”
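As a rough illustration of that reasoning (using made-up instance counts and latencies, not Capitec’s actual topology), the Python sketch below shows the two checks in play: whether a workload spread across availability zones still has enough capacity after losing any one zone, the N+1 idea, and why synchronous hops multiply whatever latency sits between sites.

```python
# Illustrative only: hypothetical instance counts and latencies, not Capitec's real topology.

def survives_zone_loss(instances_per_zone: dict, required_capacity: int) -> bool:
    """N+1 check: can the remaining zones carry the load if any single zone fails?"""
    for lost_zone in instances_per_zone:
        remaining = sum(count for zone, count in instances_per_zone.items() if zone != lost_zone)
        if remaining < required_capacity:
            return False
    return True

def transactions_per_second(latency_ms: float, sync_hops_per_transaction: int) -> float:
    """Sequential transactions one worker can push through per second,
    if every transaction makes this many synchronous cross-site calls."""
    return 1000 / (latency_ms * sync_hops_per_transaction)

# Three zones close together, each carrying a third of a workload that needs 40 units.
spread = {"az-a": 20, "az-b": 20, "az-c": 20}
print(survives_zone_loss(spread, required_capacity=40))   # True: any two zones suffice

# A service making 10 synchronous hops per transaction:
print(transactions_per_second(latency_ms=30, sync_hops_per_transaction=10))  # ~3.3
print(transactions_per_second(latency_ms=5, sync_hops_per_transaction=10))   # 20.0
```

The exact figures don’t matter; the point is that capacity has to survive any single-zone failure, and that a chatty service making ten sequential hops per transaction feels a 30-millisecond gap six times more than a five-millisecond one.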
Putting its cloud data processing capabilities to good use, Capitec is tackling one of the biggest problems faced by banks – fraud. The South African Banking Risk Information Centre found that reported incidents of digital banking fraud doubled in 2024. That’s a big problem, says Baker, and a complicated one to solve. It’s also not something that can be solved using on-premise infrastructure because “you will not have the computational power and the tools and machinery to do inline interception of a payment”.
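To make the idea of inline interception concrete, here is a minimal Python sketch; the function names, watchlist and thresholds are hypothetical illustrations, not a description of Capitec’s actual system. A payment is scored before release, held or flagged if the score is too high, and only passed through once the checks are satisfied.

```python
from dataclasses import dataclass

# Hypothetical inline screening step; names, thresholds and rules are illustrative assumptions.

@dataclass
class Payment:
    account: str
    beneficiary: str
    amount: float

KNOWN_BAD_BENEFICIARIES = {"acc-999"}   # stand-in for a beneficiary watchlist

def risk_score(payment: Payment, history: list[Payment]) -> float:
    """Toy scoring model: new beneficiary + unusually large amount = higher risk."""
    seen_before = any(p.beneficiary == payment.beneficiary for p in history)
    avg_amount = sum(p.amount for p in history) / len(history) if history else 0.0
    score = 0.0
    if payment.beneficiary in KNOWN_BAD_BENEFICIARIES:
        score += 0.9
    if not seen_before:
        score += 0.3
    if avg_amount and payment.amount > 5 * avg_amount:
        score += 0.4
    return min(score, 1.0)

def screen(payment: Payment, history: list[Payment]) -> str:
    """Decide inline whether to release, warn the client, or hold for review."""
    score = risk_score(payment, history)
    if score >= 0.8:
        return "hold"      # block release until reviewed
    if score >= 0.4:
        return "warn"      # slow the payment down and warn the client first
    return "release"

history = [Payment("acc-1", "acc-2", 500.0), Payment("acc-1", "acc-2", 450.0)]
print(screen(Payment("acc-1", "acc-2", 480.0), history))     # release: familiar, typical amount
print(screen(Payment("acc-1", "acc-777", 4000.0), history))  # warn: new beneficiary, large amount
```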
In the cloud, Capitec checks every payment against the national beneficiary database and uses AI to detect suspicious transactions in real time. “Every payment from our 24mn clients for the last year has gone through that. We have a record of it, we know the references, we know everything,” he says. In practice, that scale gives Capitec the ability to do things it couldn’t on traditional infrastructure. Payments can be intercepted, slowed down if something looks risky, and only released once the models are satisfied. Clients get warned when they’re about to send money to a suspicious beneficiary, with the system flagging patterns linked to scams like invoice fraud.

But there’s no point in protecting payments if the platform itself isn’t resilient. Uptime is an important aspect of banking, especially when an app is the primary way customers interact with their money. Baker says that Capitec now does blue/green deployments during the day with zero downtime. This means it runs two identical production environments, which minimises risk.
If its telemetry shows any problems during an update, the system rolls back instantly to the previous version. In the past, he says, engineers would get up at 2am to push through changes and hope everything worked. The phones would start ringing at 6:30am and, as he puts it, “you’d hear it hadn’t gone as well as hoped”. Today, updates run in production without interruption. The impact has been dramatic: where Capitec once faced two or three major incidents in a week, entire months can now pass without disruption. The goal is zero downtime, a shift that has improved both engineering productivity and the client experience.
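For readers unfamiliar with the pattern, the toy Python sketch below (with a simulated health check standing in for real telemetry and deployment tooling) captures the blue/green idea: traffic is cut over to the freshly deployed environment and kept there only if it looks healthy, otherwise it rolls straight back.

```python
# Toy blue/green switch with rollback; the health check is simulated, not real telemetry.

class Router:
    def __init__(self):
        self.live = "blue"              # environment currently serving clients

    def switch_to(self, colour: str):
        self.live = colour

def healthy(environment: str) -> bool:
    """Stand-in for real telemetry: error rates, latency, failed logins, and so on."""
    return environment != "green-broken"

def deploy(router: Router, new_version_env: str) -> str:
    previous = router.live
    router.switch_to(new_version_env)   # cut traffic over to the newly deployed environment
    if not healthy(new_version_env):
        router.switch_to(previous)      # telemetry looks bad: instant rollback
        return f"rolled back to {previous}"
    return f"serving from {new_version_env}"

router = Router()
print(deploy(router, "green"))          # serving from green
print(deploy(router, "green-broken"))   # rolled back to green
```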
Hyperscalers will offer much better uptime than a business can achieve by itself, but what happens when there’s a hyperscaler outage? Would it not make sense to duplicate workloads in two clouds?
“You can’t really do it, practically,” he says, adding that there’s a product called Crossplane that attempts to provide a common control plane across all cloud providers.
Running the same workloads on two clouds would also weaken the bank’s security posture, since it would span two identity and access management providers, meaning you’d “start bifurcating your security controls. And, with an outage, you can get yourself in a mess.”

“I always look at net risk, not gross risk. If the risk is too high, replicating to Azure doesn’t make that risk lower. When, actually, was the last time you used that pathway [to the other cloud]? And then you find it’s full of bugs and you’ll duplicate all your payments. Great harm comes from great intentions. If we were in Microsoft, we wouldn’t replicate to AWS either.”
* Article first published on brainstorm.itweb.co.za