For a few years now, EMA has been assessing a series of capabilities that tend to be orthogonal (set at right angles) to most approaches to network performance management, and to performance management in general. Traditional approaches, which still dominate the market and are especially prevalent among platform vendors, focus on monitoring events and collecting SNMP statistics by polling network and systems devices. These solutions use this information to isolate component-level points of failure in the infrastructure that may or may not impact application service performance.
These traditional monitoring tools are, as a class, improving thanks to better analysis across network and systems devices. In many respects, they represent the next generation of platform-centric performance monitoring tools for integrated network and systems performance. But, as a class, they are still far from offering the complete picture, and they have two very significant limitations.
First, they depend on polling for information and so cannot do real-time analysis (most still collect data every 10 or 15 minutes). Such a low frequency of information gathering makes sense, since polling for performance information can itself cause congestion; in one or two instances, EMA has documented private networks where more than 50% of the network performance issues came from enterprise performance management traffic. But this type of polling provides only limited visibility into service performance, and critical intermittent problems can sometimes go completely undetected.
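To make the visibility gap concrete, the following is a minimal sketch, not a model of any particular product. It assumes a hypothetical one-hour trace of per-second link utilization with a single 30-second burst, and it assumes the poller reports the average utilization over each 15-minute interval, as counter-based polling typically does. All names and numbers are illustrative.

```python
# Hypothetical per-second link utilization (percent) over one hour.
BASELINE = 20.0   # assumed steady-state utilization
BURST = 95.0      # assumed utilization during a short congestion event
SECONDS = 3600


def build_trace(burst_start=1000, burst_len=30):
    """Return a per-second trace with one short burst (illustrative data)."""
    trace = [BASELINE] * SECONDS
    for t in range(burst_start, burst_start + burst_len):
        trace[t] = BURST
    return trace


def polled_averages(trace, interval):
    """Average utilization per polling interval, as a counter-delta poll would see it."""
    averages = []
    for i in range(0, len(trace), interval):
        window = trace[i:i + interval]
        averages.append(sum(window) / len(window))
    return averages


trace = build_trace()
per_poll = polled_averages(trace, 15 * 60)   # 15-minute polling cycle

true_peak = max(trace)                # what the link actually experienced
peak_seen_by_polling = max(per_poll)  # the worst value the poller reports

print(f"true peak: {true_peak}%  polled peak: {peak_seen_by_polling}%")
```

Under these assumptions the burst lifts the affected 15-minute average only slightly above the baseline, which is why a short but severe event can pass without crossing any threshold.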
A second problem with traditional performance management tools is that they don't monitor actual application traffic. They monitor the infrastructure supporting it and, through various levels of intelligence (and here the quality varies widely), deduce what the application impacts are on a specific component. This is something like studying storm systems by monitoring the trees: certainly not irrelevant, but not the most direct approach.
Assessing AFM
AFM is the flip side of this approach. This is EMA's term for all those technologies that directly address application traffic at various layers (especially layers three through seven of the OSI stack, and above). These span a whole host of technologies, but a short list would include:
Analytics for assessing the Quality of Experience (QoE) of applications and transactions from the end-user perspective. QoE should be able to provide a governing metric for assessing the impact of service performance from the point of view of the end user.
Transaction analysis at the data center, or other end of the infrastructure, to monitor transactions within the application server with interfaces into the database server and the performance of other data center elements such as storage. This is an area where a parallel and complementary set of capabilities comes into play to round out the visibility of the application's performance through the heart of the data center to the end user in remote locations.
Analysis of flows from servers to end devices over IT infrastructure, while supporting packet-level drill-down analysis. In other words, end-to-end latencies of application traffic across the infrastructure with packet and protocol drill-down, both for real-time analysis and for network forensics targeted at persistent and difficult-to-diagnose problems.
Analysis of traffic volumes through capabilities such as NetFlow, sFlow, J-Flow, or IPFIX, or through other types of adapters or probes. This helps to understand how applications are impacting the infrastructure in terms of capacity and potential congestion, and can also expose inappropriate practices such as backing up servers remotely in the middle of the work day, or even security breaches.
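As an illustration of the traffic volume analysis in the last item, here is a minimal sketch that assumes flow records have already been exported and decoded into simple dictionaries. The field names, addresses, ports, byte counts, and working hours are all hypothetical and do not follow any specific NetFlow, sFlow, J-Flow, or IPFIX template.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical, already-decoded flow records (illustrative values only).
FLOWS = [
    {"src": "10.0.1.5", "dst": "10.0.9.20", "dst_port": 443,
     "bytes": 1_200_000, "start": datetime(2024, 1, 15, 10, 5)},
    {"src": "10.0.1.7", "dst": "10.0.9.20", "dst_port": 443,
     "bytes": 800_000, "start": datetime(2024, 1, 15, 11, 30)},
    {"src": "10.0.2.10", "dst": "192.0.2.50", "dst_port": 873,
     "bytes": 40_000_000_000, "start": datetime(2024, 1, 15, 13, 0)},
    {"src": "10.0.2.10", "dst": "192.0.2.50", "dst_port": 873,
     "bytes": 35_000_000_000, "start": datetime(2024, 1, 16, 2, 0)},
    {"src": "10.0.1.9", "dst": "10.0.9.30", "dst_port": 1433,
     "bytes": 5_000_000, "start": datetime(2024, 1, 15, 14, 15)},
]


def bytes_by_port(flows):
    """Total bytes per destination port, largest first (a rough proxy for application)."""
    totals = defaultdict(int)
    for flow in flows:
        totals[flow["dst_port"]] += flow["bytes"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def large_workday_flows(flows, threshold_bytes, start_hour=9, end_hour=17):
    """Flows at or above the threshold that started on a weekday during assumed working hours."""
    return [
        flow for flow in flows
        if flow["bytes"] >= threshold_bytes
        and flow["start"].weekday() < 5            # Monday through Friday
        and start_hour <= flow["start"].hour < end_hour
    ]


for port, total in bytes_by_port(FLOWS):
    print(f"port {port}: {total} bytes")

for flow in large_workday_flows(FLOWS, threshold_bytes=1_000_000_000):
    print(f"large workday transfer: {flow['src']} -> {flow['dst']} at {flow['start']}")
```

In this sample data the bulk transfer that starts at 13:00 is flagged, while the similar transfer at 02:00 is not, which is the kind of mid-day remote backup the item above describes. Grouping by destination port is only a coarse stand-in for application identification.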