
Cloud problems have nowhere left to hide

Agentic AI is forcing companies to confront uncomfortable truths about their cloud and DevOps environments, exposing weaknesses but also helping to fix them.
By Kevin Naicker, Executive head of cloud solutions, DVT.
Johannesburg, 04 Jun 2026

Organisations can keep layering tools onto their environments, or they can confront the reality of how those environments operate. Those are the real options on the table right now.

Most large enterprises in South Africa have already crossed the cloud adoption threshold. The workloads are migrated. The platforms are in place. The pipelines are running. The conversation has moved from whether cloud works to why it still feels harder than it should.

Local research reflects this tension. BMIT’s SA Cloud Market Report 2025 points to a fundamental shift in focus among South African enterprises, from basic infrastructure to balancing modernisation with cost control. Inefficient resource consumption, unexpected cloud costs and security have emerged as boardroom-level concerns – exactly the kinds of pressures that autonomous systems struggle to tolerate.

Agentic AI arrives at a moment when those questions can no longer be avoided. And it is far less forgiving than previous waves of automation.

Agentic AI ends the era of quiet workarounds

For years, cloud operations have relied on a hidden safety net: people. When systems behave in unexpected ways, someone knows which alert to ignore, which process to bypass, or which undocumented dependency not to touch.

We often describe this as experience. In reality, it is compensation for environments that were never fully understood in the first place.


Agentic AI changes that dynamic. Give an autonomous system an objective and it will act on the information available to it. It will not apply judgement based on institutional memory. It will not interpret ambiguity generously. It will do exactly what your operating model permits, no more and no less.

That is why agentic AI does not strengthen weak systems. It removes the buffer that was hiding them. But it also creates the conditions for fixing them properly because once those weaknesses are visible, they can no longer be deferred.

What makes an AI agent different

The term “agentic AI” is used loosely, so it is worth being precise about what it means in the context of cloud and DevOps.

A traditional AI model takes a question, produces an answer, and waits for the next input. An AI agent is given a goal, and it figures out the steps to get there. It reasons through a sequence of actions, calls external tools and APIs, evaluates results, adjusts its approach and continues until the objective is met. Where a chatbot answers, an agent thinks and acts.
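To make the loop concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not any vendor's framework: the tools `check_disk` and `clean_tmp`, the `state` dictionary and the rule-based `choose_action` function are all hypothetical, and `choose_action` merely stands in for the reasoning a real model would do.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tools. In a real environment these would call monitoring
# or cloud provider APIs; here they only read and change a dictionary.
def check_disk(state: dict) -> str:
    return "disk_full" if state["disk_used_pct"] > 90 else "disk_ok"

def clean_tmp(state: dict) -> str:
    state["disk_used_pct"] -= 30
    return "cleaned"

TOOLS: Dict[str, Callable[[dict], str]] = {
    "check_disk": check_disk,
    "clean_tmp": clean_tmp,
}

def choose_action(last_observation: Optional[str]) -> Optional[str]:
    """Stand-in for the model's reasoning step (an assumption, not a real model)."""
    if last_observation in (None, "cleaned"):
        return "check_disk"      # look before acting, and re-check after acting
    if last_observation == "disk_full":
        return "clean_tmp"       # remediate
    return None                  # "disk_ok": objective met, stop

def run_agent(state: dict, max_steps: int = 10) -> List[Tuple[str, str]]:
    """Pick an action, call the tool, evaluate the result, repeat until done."""
    history: List[Tuple[str, str]] = []
    observation: Optional[str] = None
    for _ in range(max_steps):   # hard cap so the loop cannot run forever
        action = choose_action(observation)
        if action is None:
            break
        observation = TOOLS[action](state)
        history.append((action, observation))
    return history
```

Calling `run_agent({"disk_used_pct": 95})` checks the disk, cleans up, then checks again before stopping. The step cap is a deliberate design choice: the agent does what it is permitted to do, and the permission includes a limit.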

In cloud operations, this distinction is significant. When an infrastructure anomaly surfaces at two in the morning, you do not need a model that describes the problem. You need a system that can diagnose it, correlate it with recent deployment changes, identify the root cause, and either remediate automatically or present a clear action plan to the on-call engineer, all within minutes.
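The correlation step in that scenario can be sketched very simply. The example below assumes a hypothetical list of deployment records, each a dictionary with `service` and `deployed_at` fields, and a two-hour look-back window chosen purely for illustration. A real system would pull these records from its own deployment history and would weigh far more signals than timing alone.

```python
from datetime import datetime, timedelta
from typing import Dict, List

def suspect_deployments(
    anomaly_at: datetime,
    deployments: List[Dict],
    window: timedelta = timedelta(hours=2),  # illustrative assumption
) -> List[Dict]:
    """Return deployments made within `window` before the anomaly, newest first."""
    candidates = [
        d for d in deployments
        if timedelta(0) <= anomaly_at - d["deployed_at"] <= window
    ]
    return sorted(candidates, key=lambda d: d["deployed_at"], reverse=True)
```

The output is a shortlist, not a verdict. It is the kind of evidence an agent could attach to the action plan it presents to the on-call engineer.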

Each of the three major cloud providers has built agent frameworks to enable exactly this kind of capability. AWS offers Bedrock Agents and AgentCore for enterprise-grade orchestration. Microsoft has shipped the Azure AI Foundry Agent Service, tightly integrated with the broader Microsoft 365 ecosystem. Google has released Vertex AI Agent Builder alongside an open-source Agent Development Kit.

The tooling is maturing fast across all three platforms, converging on a common set of principles: goal-oriented execution, tool integration, memory and governance guardrails.

The real issue is not tooling

Every organisation I speak to can explain parts of its cloud environment extremely well. Finance understands cost patterns. Security understands exposure. Engineering understands delivery pipelines. Architecture understands intended design.

But place those explanations side by side, and they rarely describe the same system.

As long as humans are manually stitching these views together, the gaps remain manageable. Once agentic AI enters the picture, those gaps become fault lines. Agentic systems only work when the environment makes sense as a whole. Without that coherence, they expose inconsistencies long before they deliver capability.

This fragmentation is particularly pronounced in South African enterprises that have built out cloud environments incrementally, often across AWS, Azure and GCP simultaneously, with different teams owning different platforms and no single unified view of the estate.

We already have the knowledge. It just isn’t usable.

Most enterprises are not short on information. They are overwhelmed by it.

Runbooks, architecture decisions, post-incident reviews, security policies and compliance frameworks already describe how environments should behave. What they do not do is describe that behaviour in a way that is consistent, structured and usable by autonomous systems.

Humans make this work by interpreting intent. We skim, infer and adapt. Agents cannot do that safely unless the knowledge they consume is explicit and unambiguous.

This is where many organisations underestimate the work ahead. Turning existing documentation into something agentic AI can rely on is not glamorous, but it is essential. Without it, automation remains fragile and unpredictable, and the promise of agentic AI stays exactly that: a promise.
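As a rough illustration of what "explicit and unambiguous" could look like in practice, the sketch below turns a runbook step into a structured record and rejects entries that still depend on human interpretation. Everything here is an assumption made for the example: the `RunbookStep` fields, the `validate` function and the short list of vague phrases are hypothetical, not a standard.

```python
from dataclasses import dataclass
from typing import List

# Phrases that rely on a human reading intent into the text (illustrative list).
VAGUE_PHRASES = ("if needed", "as appropriate", "usually")

@dataclass(frozen=True)
class RunbookStep:
    trigger: str             # machine-checkable condition, e.g. an alert name
    action: str              # one named action the agent may take
    owner: str               # team accountable for the outcome
    requires_approval: bool  # must a person sign off before the action runs?

def validate(step: RunbookStep) -> List[str]:
    """Return the reasons this step is not yet safe for an agent to consume."""
    problems = [
        f"missing {name}"
        for name in ("trigger", "action", "owner")
        if not getattr(step, name).strip()
    ]
    text = f"{step.trigger} {step.action}".lower()
    problems += [f"vague phrase: {p!r}" for p in VAGUE_PHRASES if p in text]
    return problems
```

A step such as `RunbookStep("disk looks full", "restart if needed", "", True)` fails on two counts: it has no owner and it leans on "if needed". A person would cope with both. An agent should not be asked to.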

Cyber security is the base, not the afterthought

This is the part that gets left out of most conversations about AI and cloud. Everyone wants to talk about the capability: the autonomous pipelines, the self-healing infrastructure, the intelligent cost optimisation. Fewer people want to talk about what happens when an autonomous system makes a decision inside a production environment without adequate governance in place.

In South Africa’s financial services and insurance sectors in particular, the stakes are significant. Regulatory obligations, data residency requirements and audit trails do not pause because an AI agent moved faster than the change management process anticipated.

Cyber security and governance cannot be bolted on after the fact. They need to be the foundation on which cloud, DevOps and AI capabilities are built, not a separate workstream managed by a different team on a different timeline.

Once autonomous systems start acting inside production environments, accountability can no longer be deferred. If an agent identifies an issue and remediates it, who owns the outcome? If it escalates or blocks a change, who is responsible? If it fails, where does the risk land?

These are not technical questions. They are governance questions. And they need answers before the agents start working, not after.

Why some organisations will move faster

The organisations that succeed with agentic AI will not be the ones adopting the most advanced tools. They will be the ones willing to examine how their environments actually operate, and honest enough to fix what they find.

They invest in structure before intelligence. They reduce ambiguity before granting greater autonomy to their systems. They treat cloud, DevOps and cyber security as a single integrated system rather than three separate budget lines managed by three separate teams. And they know that fixing this is not something you delegate to an engineering team. It is a decision that starts at the top.

Everyone else will still experiment. Some pilots will look impressive. Fewer will survive prolonged exposure to production realities.

Agentic AI is not the start of a new phase of cloud maturity. It is the moment when the gaps in the operating model become impossible to hide, and when the organisations willing to face them honestly will finally pull ahead.
