Application performance management comes of age

By Craig De Lucchi for Computer Associates Africa

Johannesburg, 06 Apr 2005

True application performance measurement is a growing problem for today's companies, many of which are based on on-demand platforms. Properly identifying the source of the performance problem is key to business success, writes Craig De Lucchi, Consultant at Computer Associates Africa.

He provides an insight into the modern options for monitoring, identifying trends and troubleshooting application performance issues.

The realm of true application performance measurement is virgin territory for many organisations. Although the subject has often been discussed and debated, it has remained a largely uncharted discipline within the IT industry.

But times are changing. The proliferation of server-based applications has created application response time problems that are often difficult to pinpoint.

What is needed is a mechanism to accurately identify the source of these bottlenecks and then analyse them in more detail to prevent recurrences.

While there are many methods to monitor, measure and manage the traffic traversing the network, this is not true application performance management.

Neither are the solutions that simply check on the health of the network "plumbing" - copper, glass or wireless - and the performance levels of network routers, switches and data lines.

Not good enough

What these solutions do is measure network response times - some by application, host address, time of day and combinations of the above - and not true application response times, as experienced by the end-user.

True application performance management requires the input of more information. For example, we need to know if the servers are being overloaded, and what performance challenges are being experienced by the user population within their local environments.

We need to understand what happens when a transaction enters the user's PC and starts executing one of the applications mentioned above.

For example, someone looking at performance issues from the narrow confines of the network perspective would not understand if ABAP errors in SAP were causing the user's session to hang constantly.

They could be missing an extremely important piece of information that could be impacting overall business.

Confusing application performance with network response times - the time it takes a piece of work (transaction) to get from the user's PC to the system and back - could have serious consequences.

Poor response times could be misconstrued as network delays when, in fact, a poorly coded application will result in a number of cycles between client and host before that useful piece of work is completed.
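The effect of extra client/host cycles can be illustrated with simple arithmetic. The sketch below is illustrative only: the function name and all figures are assumptions chosen for the example, and it treats every cycle as costing one network round trip plus a fixed slice of server time.

```python
def response_time_ms(round_trips: int, network_rtt_ms: int, server_ms_per_trip: int) -> int:
    """Total time the user waits for one transaction, in milliseconds.

    Simplified model: every client/host cycle costs one network round
    trip plus a fixed amount of server processing.
    """
    return round_trips * (network_rtt_ms + server_ms_per_trip)


# Same network and same server; only the number of client/host cycles differs.
well_coded = response_time_ms(round_trips=2, network_rtt_ms=40, server_ms_per_trip=10)
chatty = response_time_ms(round_trips=50, network_rtt_ms=40, server_ms_per_trip=10)

print(f"well-coded application: {well_coded} ms")
print(f"chatty application:     {chatty} ms")
```

In this model the network round-trip time is identical in both cases, yet the user of the chatty application waits 25 times longer. A network-only measurement would report no difference between the two.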

Application response times

Application response times - and therefore the management of application performance - can be measured either actively or passively. Each approach has advantages and disadvantages:

* Active methods

The simplest active method is to send an ICMP PING (a basic artificial transaction) through the network from point A to point B to identify bottlenecks and traffic hold-ups.

Unfortunately, this method measures only "network latency" or, more commonly, "reachability", so it provides no more than a basic understanding of the efficiency level of the system.

It is inaccurate because a server may handle ICMP packets and application data packets very differently.

Nevertheless, "network latency" is a viable - and inexpensive - measurement for many organisations because it gives a clear indication of impending network problems.

A second active method is the injection of sophisticated, artificial traffic into the system that mimics a particular application (such as e-mail or FTP).

This is a better method to detect "network latency" and can be expanded to include traffic simulation and the modelling of complete data communications networks.

These are both useful techniques for predicting infrastructure changes and their impact on network performance, node utilisation and - ultimately - the user experience.
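As a rough illustration of the second active method, the sketch below runs an artificial transaction several times and summarises how long it took. It is a minimal sketch, not a product implementation: `probe` and `fake_mail_check` are hypothetical names, and the stand-in transaction simply sleeps where a real probe would script a mailbox login or file transfer.

```python
import statistics
import time
from typing import Callable, Dict


def probe(transaction: Callable[[], object], samples: int = 5) -> Dict[str, float]:
    """Run a synthetic transaction repeatedly and summarise its duration in seconds."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        transaction()  # the artificial piece of work being injected
        durations.append(time.perf_counter() - start)
    return {
        "samples": len(durations),
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def fake_mail_check() -> None:
    """Hypothetical stand-in for a scripted transaction such as a mailbox login."""
    time.sleep(0.01)


if __name__ == "__main__":
    print(probe(fake_mail_check, samples=3))
```

Because the probe times a complete unit of work rather than a single ICMP packet, it reflects how the server treats application traffic, although it still reports on artificial load rather than on what real users experience.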

* Passive methods

To fully understand the concept of application performance management, it is necessary to monitor information flow and measure the duration of each transaction and its response time - at each step on the network.

Standards such as ARM and AIC have been defined to undertake this level of performance measurement, but a key requirement is access to the application code.
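The reason code access matters is that this style of measurement brackets each unit of work with start and stop calls placed inside the application itself. The sketch below shows the general idea only; it is not the ARM or AIC API, and `measured`, `timings` and `post_invoice` are hypothetical names introduced for illustration.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List

# Hypothetical in-process store; a real agent would forward these to a collector.
timings: Dict[str, List[float]] = {}


@contextmanager
def measured(transaction_name: str) -> Iterator[None]:
    """Record how long the enclosed unit of work takes, in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the work raises, so failed transactions are timed too.
        timings.setdefault(transaction_name, []).append(time.perf_counter() - start)


def post_invoice(amount: int) -> int:
    """Hypothetical business function, instrumented from the inside."""
    with measured("post_invoice"):
        return amount * 2  # placeholder for the real work


if __name__ == "__main__":
    post_invoice(21)
    print(timings)
```

The instrumentation lives in the body of `post_invoice`, which is exactly what is unavailable when the application is a packaged product whose source cannot be changed.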

A less intrusive method (in the absence of an in-depth knowledge of the code) of realising application response time measurement from beginning to end of a particular transaction is through the use of lightweight agents.

These are intelligent agents that understand specific applications such as terminal emulators, e-mail systems and ERP systems, and are capable of measuring transaction performance at the end-user's PC - from the end-user's perspective.

These agents reside either on the server or desktop depending on the architecture (thin client or fat client) of the application and collect information locally for immediate processing and analysis.

A local agent allows more accurate data collection. For example, the breakdown of the total application response time can be illustrated to highlight the time spent on the local station, time on the wire and then processing time spent at the host - including the application server and database.
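The breakdown can be derived from a handful of timestamps. The sketch below assumes a hypothetical agent that records four timestamps on the end-user's PC and receives the host's own processing time alongside the response; the field names and figures are illustrative and not taken from any particular product.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class TransactionTrace:
    """Timestamps in milliseconds, as a hypothetical desktop agent might record them."""
    user_action_ms: int        # user presses Enter
    request_sent_ms: int       # request leaves the PC
    response_received_ms: int  # response arrives back at the PC
    screen_updated_ms: int     # result is displayed to the user
    host_processing_ms: int    # processing time reported by the host

    def breakdown(self) -> Dict[str, int]:
        total = self.screen_updated_ms - self.user_action_ms
        # Local station: work before the request leaves plus work after the reply arrives.
        client = (self.request_sent_ms - self.user_action_ms) + (
            self.screen_updated_ms - self.response_received_ms
        )
        # On the wire: time away from the PC that the host does not account for.
        wire = (self.response_received_ms - self.request_sent_ms) - self.host_processing_ms
        return {"total": total, "client": client, "wire": wire, "host": self.host_processing_ms}


trace = TransactionTrace(
    user_action_ms=0,
    request_sent_ms=120,
    response_received_ms=1320,
    screen_updated_ms=1400,
    host_processing_ms=900,
)
print(trace.breakdown())
```

In this example the user waits 1 400 ms in total, of which 200 ms is spent on the local station, 300 ms on the wire and 900 ms at the host, which points to the server side rather than the network.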

Based on this knowledge, IT managers can substitute "best guesses" and "thumb sucks" with hard numbers and information that is a far cry from the traditional but crude "network latency" test.

This is true application performance management and the dawn of a new era for network managers.
