Cloud management lost its way, operations are paying the price

By Grant Friend, Manager, EMEA Portfolio Solutions at Nutanix
Johannesburg, 28 Jan 2026

Most organisations believe they have a cloud problem, but I believe the real problem is cloud management. Infrastructure did not suddenly become more complex because businesses adopted hybrid or multicloud strategies. Complexity crept in because environments changed faster than the tools used to run them. Over time, platforms were added, workloads spread and responsibility fragmented, while the promise of a single pane to manage it all quietly gave way to a collection of disconnected views.

What is striking is that most organisations did not lose control of their infrastructure because they adopted a hybrid or multicloud approach. They lost control because their cloud management approaches never moved beyond offering partial, platform-specific panes onto an increasingly interconnected reality.

When visibility doesn’t translate into control

Most cloud management platforms offer visibility through dashboards, metrics, alerts and colourful charts that show what is happening across environments. That information is useful, but only up to a point.

What operations teams quickly discover is that visibility without context creates noise. Alerts arrive from multiple systems, each telling part of the story, while the harder question is not “what is happening?”, but “what do I do about it, and how quickly?”

This is where the idea of the “single pane of glass” starts to fall apart. In many cases, it simply means a unified view of one vendor’s environment. The moment workloads run elsewhere or span environments, the pane cracks and teams are back to logging into different tools to make sense of what is really going on.

A cloud management platform should not just show you your estate. It should act when pressure builds, make decisions when thresholds are reached and buy operations teams the time they need to fix problems properly.

The operational reality most platforms ignore

Infrastructure teams rarely struggle with infrastructure design; instead they struggle with time. They are expected to keep services stable, manage growth, control costs, support new initiatives and respond to incidents, often simultaneously. That pressure is only increasing as more data-intensive workloads, including early AI use cases, are introduced into environments that were already stretched.

When something goes wrong, it tends to happen at the worst possible moment, during a critical job or business process that cannot simply be paused.

Payroll is a simple example. It runs once a month, consumes significant resources for a short period and must complete. If it fails, the consequences are immediate and visible (and the whole business will hear about it). In those moments, no one is interested in architectural diagrams or capacity metrics. They want the platform to cope, adapt and give them breathing room to fix the underlying issue properly.

When a cloud management platform can respond automatically as resources come under pressure, everything changes. The problem still needs fixing, but the panic disappears, and that alone makes a material difference.
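To make the idea of an automatic response concrete, here is a minimal sketch in Python of a threshold rule that buys a critical job headroom and merely alerts for everything else. The `Workload` type, the `plan_action` function, the 85% threshold and the 25% step are all hypothetical values chosen for illustration; they do not describe any particular platform's API or defaults.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical description of a running workload.
    name: str
    cpu_used: float        # cores currently in use
    cpu_allocated: float   # cores currently allocated
    critical: bool = False # True for jobs that cannot be paused, such as payroll


def plan_action(w: Workload, scale_up_at: float = 0.85, step: float = 0.25) -> str:
    """Decide what to do for one workload based on its CPU utilisation."""
    utilisation = w.cpu_used / w.cpu_allocated
    if utilisation < scale_up_at:
        return "none"
    if w.critical:
        # Give the critical job headroom now; people fix the root cause later.
        extra_cores = w.cpu_allocated * step
        return f"scale_up:{extra_cores:.1f}"
    # Non-critical workloads under pressure only raise an alert.
    return "alert"


if __name__ == "__main__":
    payroll = Workload("payroll", cpu_used=15, cpu_allocated=16, critical=True)
    print(payroll.name, plan_action(payroll))
```

The point of the sketch is the ordering: the platform acts first for the job that must complete, and the investigation happens afterwards without the clock running.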

Governance as consistency, not constraint

On top of operational challenges, governance is also often misunderstood. It is still framed as restriction or approval, when in practice good governance is about consistency. As cloud estates grow more distributed, relying on people to enforce rules becomes unsustainable. Policies drift, exceptions creep in and decisions vary depending on who is asked and how busy they are at the time.

Embedding governance into the management layer removes that inconsistency. It ensures workloads are deployed, scaled and managed in line with agreed standards, regardless of where they run, while also allowing organisations to introduce self-service safely without sacrificing control.
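As a rough illustration of governance embedded in the management layer, the Python sketch below runs the same checks on every deployment request, whichever environment it targets. The `DeploymentRequest` type, the `violations` function and the specific rules (encryption plus two required tags) are assumptions made up for this example, not a real product's policy model.

```python
from dataclasses import dataclass, field

# Hypothetical standard: every workload must carry these tags.
REQUIRED_TAGS = ("owner", "cost-centre")


@dataclass
class DeploymentRequest:
    workload: str
    environment: str   # for example "on-prem" or "public-cloud"
    encrypted: bool
    tags: dict = field(default_factory=dict)


def violations(req: DeploymentRequest) -> list[str]:
    """Return the list of policy problems; an empty list means the request complies."""
    problems = []
    if not req.encrypted:
        problems.append("storage must be encrypted")
    for tag in REQUIRED_TAGS:
        if tag not in req.tags:
            problems.append(f"missing tag: {tag}")
    return problems


if __name__ == "__main__":
    req = DeploymentRequest("payroll", "public-cloud", encrypted=True,
                            tags={"owner": "finance"})
    print(violations(req))
```

Because the rules live in code rather than in someone's head, a self-service request gets the same answer at any hour, and the result can be logged for audit.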

This becomes important when automation is introduced. Automation is rarely “out of the box”. It requires scripting, design and upfront effort, which can feel daunting when teams are already stretched. What is often missed is that automation is not a recurring effort. Once the work is done, it does not need to be repeated, and the upfront investment is returned quietly over months and years. Most organisations are already automating informally through scripts and scheduled tasks. Formalising that effort is less about ambition and more about ensuring those actions are consistent, auditable and secure.

Cost awareness without cost panic

The business also needs to know what its workloads cost. Cost governance has matured quickly, but not always comfortably. As FinOps has gained traction, organisations have become far more aware of what they consume, yet awareness is often mistaken for restraint. In reality, understanding costs is less about throttling spending and more about making informed choices.

Most people recognise this instinctively in their personal lives. You do not open your bank statement to stop spending altogether. You open it to understand where the money went. I often joke that it does not stop me from buying Pokémon cards for my kids, but it does make me aware of how much I spent on them relative to food, fuel and everything else that month.

The same principle applies to infrastructure. When teams can see what workloads consume and what that consumption costs the business, conversations become more constructive, overprovisioning is easier to challenge, resources can be right-sized with confidence and growth can be planned deliberately rather than guessed.
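The bank-statement view of infrastructure can be sketched in a few lines of Python: take what each workload cost in a month and express it as a share of the total. The `cost_share` function and the figures in the usage example are hypothetical and exist only to illustrate the arithmetic.

```python
def cost_share(costs: dict[str, float]) -> dict[str, float]:
    """Return each workload's percentage of total spend, rounded to one decimal."""
    total = sum(costs.values())
    if total == 0:
        # Nothing was spent, so every share is zero.
        return {name: 0.0 for name in costs}
    return {name: round(100 * cost / total, 1) for name, cost in costs.items()}


if __name__ == "__main__":
    # Made-up monthly figures in an arbitrary currency.
    monthly = {"payroll": 200.0, "analytics": 600.0, "web": 200.0}
    for name, share in cost_share(monthly).items():
        print(f"{name}: {share}%")
```

Nothing here limits spending. It simply shows where the money went, which is the starting point for a right-sizing conversation.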

Lessons from living with a hybrid reality

One of the clearest lessons from organisations that manage complex estates effectively is that hybrid is no longer a transitional phase, but the operating model itself. Early enthusiasm for public cloud gave way to more pragmatic questions around cost predictability, data control and resilience, while on-premises environments evolved rather than disappeared. The outcome is not indecision, but balance.

What works in this model is not forcing everything into a single environment, but managing different environments consistently. Platforms that treat each location as a separate problem tend to add friction, whereas those that recognise them as variations of the same operational challenge reduce it. This is where cloud management needs to refocus: away from labels and architectural debates towards outcomes such as stability, predictability and the ability to respond calmly when things do not go to plan.

Refocusing on what matters

Cloud management lost its way when it became more about describing environments than running them, and the organisations that get it right use platforms that fade into the background, quietly enforcing governance, supporting automation and helping teams make better decisions under pressure.

What I have learnt from the customers I speak to is that there are platforms built with consistency in mind, but what matters far more than any product name is how the platform is used. When customers focus on the benefits derived from the features, and take the time to implement them, they gain true advantages from a cloud management platform. To me, an effective management platform helps teams regain control without slowing the business, particularly as infrastructure becomes increasingly distributed. In short, it’s all about balance.
