About Network Testing Labs

Contact Network Testing Labs
Independent Reviews of Network Hardware and Software


Enterprise-level Network Management & Monitoring

If an organization is serious about minimizing network downtime, it’s likely using one of these three network managers.
By Barry Nance

A network that measures downtime in millions of dollars per minute (or per second!) needs a serious, enterprise-level network management tool. Nothing less will do.

The ideal network management tool accurately discovers devices, computers and applications on the network. It works on networks of any size. It uses computing resources frugally (after all, it performs no data processing – it’s there to watch over the network). It can work within the framework of a global directory (LDAP, for example). It graphically depicts the entire network, subsets of the network and individual devices. It monitors the status and health of every device or computer on the network. It can glean its data from a variety of sources, including agents, probes, SNMP-enabled devices, log files and Windows performance files. It works as well with IPv6 as it does with IPv4. It can accept and use complex descriptions of thresholds. It can send alert notifications via e-mail, pager or text message to different individuals or groups depending on the nature of the problem, and it can escalate these notifications when the problem persists. It can perform root cause analysis to identify a problem device or computer that’s causing a cascade of network error messages. It supports every kind of device on the network. It can correct some problems automatically by restarting a process, resetting a port or running a script. It works within virtual environments and cloud-based environments. It integrates with help desk software and with other monitoring tools. It produces useful, easy-to-understand and timely reports. It’s highly scalable and reliable. And the ideal network management tool is easy to use.

We decided to review the top-end network management products. We invited four enterprise-level network management software vendors to submit their best products for review in our Alabama lab.

IBM sent us Tivoli Netcool/OMNIbus and Tivoli Network Manager IP Edition, CA Technologies sent us CA eHealth and CA NetQoS ReporterAnalyzer, and HP sent both the Windows and Linux versions of its Automated Network Management Suite.

BMC initially accepted our invitation, but then told us “… the only way to meaningfully participate is through a guided tour of the products in our environment” instead of sending us a product to review.

Picking a winner among these three network managers is impossible. Each one is a sophisticated, mature and highly capable tool for achieving maximum network availability, uptime and performance.

If you have a serious network, any one of these three network managers will help you quickly solve network problems and will save your organization megabucks.

HP Network Management
The high points of HP’s Automated Network Management Suite are its modularity, its ability to monitor service level compliance and its automation of many of a network engineer’s daily tasks – i.e., it’s scalable, it helps track actual vs. expected performance and it saves time. As we tested, we didn’t find any drawbacks in Automated Network Management Suite.

Automated Network Management Suite consists of Network Node Manager (NNM) and a spate of components and Smart Plug-ins (SPIs), including HP Network Automation, NNMi Integration Enablement, NNM iSPI Network Engineering Toolset, NNM iSPI Performance for Metrics, NNM iSPI Performance for Traffic, NNM iSPI Performance for Quality Assurance, NNM iSPI for IP Telephony, NNM iSPI for IP Multicast and NNM iSPI for Multiprotocol Label Switching (MPLS), all under an umbrella of network automation. Network Node Manager monitors for faults and network availability, while the performance-related plug-ins gather utilization data and monitor for specific devices, protocols and applications.

Automated Network Management Suite accurately discovered our network (noting all our network devices, servers and virtualized environments), tracked device status, processed SNMP alerts, graphically displayed our network, alerted us to problems, fixed problems automatically, gathered statistics and produced useful reports.

HP supplies over 2,000 Management Information Bases (MIBs) with Automated Network Management Suite. These cover a wide variety of network equipment from over 50 major hardware vendors, including routers, switches, bridges and repeaters.

Automated Network Management Suite captured some Layer 2 data, but for the most part it mapped Layer 3 details. Just a few of the many details it reported were utilization and error percentages, total packets by category and by protocol, retransmits, server memory utilization and full-duplex utilization percentage.

Automated Network Management Suite collected network health data, analyzed the stored device status and event data and reported results in useful charts and graphs. The system's root-cause problem analysis was especially helpful in zeroing in on a specific device that was causing an outage or performance problem, while its path-analysis capability was similarly helpful in pinpointing problems and performance degradations involving network pathways and linkages.

Automated Network Management Suite's automatic baseline feature set alarm thresholds for us by analyzing collected device status and event data, thus giving it the ability to more realistically detect exceptions, faults and errors. After it created a baseline for our network, we manually added a few thresholds of our own. Automated Network Management Suite thereafter generated prompt and highly informational alarms, via pager or e-mail, to notify us when the thresholds were exceeded.
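To make the idea of a baseline-derived threshold concrete, here is a minimal Python sketch. It is not HP's algorithm, which we did not see described; it simply assumes that a threshold is set at the historical mean plus a multiple of the standard deviation. The function names and the multiplier `k` are hypothetical.

```python
import statistics

def baseline_threshold(history, k=3.0):
    """Derive an alarm threshold from past readings.

    Assumption (not HP's documented method): the threshold is the
    historical mean plus k population standard deviations.
    """
    mean = statistics.mean(history)
    spread = statistics.pstdev(history)
    return mean + k * spread

def is_exception(value, history, k=3.0):
    """Return True when a new reading exceeds the baseline threshold."""
    return value > baseline_threshold(history, k)

# Hypothetical link-utilization readings, in percent
readings = [10, 12, 11, 9, 13, 10, 12, 11]
print(is_exception(30, readings))  # far above the baseline
print(is_exception(12, readings))  # within normal variation
```

A manually added threshold, like the ones we layered on top of the baseline, would simply replace the computed value with a fixed number.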

Automated Network Management Suite’s distributed architecture scales well to handle larger and more complex network environments. Automated Network Management Suite even monitored itself to ensure it was running normally. It paged our administrator and sent e-mail alerts when the self-monitor found, for instance, that Network Node Manager or its server had died. We found that Automated Network Management Suite can initiate corrective actions, such as restarting a background process or resetting a router port.

The Web browser-based user interface is responsive, thoughtfully designed and highly configurable. Automated Network Management Suite provides a central console for controlling multiple Network Node Manager instances. This central console consolidated event management, performance monitoring and automated alert processing in the lab. Our network administrator used its high-level Visual Basic Script-like language to customize the Automated Network Management Suite’s behavior and display. We found the console dashboard’s network health indicators helpful and informative.

For business-oriented service level agreements (SLAs) we established, Automated Network Management Suite tracked our transactions, their network travel, their processing at the server and their storage in a database. Automated Network Management Suite gave us availability and response time details, and it alerted us when any of our SLA parameters were exceeded.

Automated Network Management Suite runs on Windows Server 2003, Windows Server 2008, Red Hat Enterprise Linux and Solaris.

IBM Network Management
Tivoli Netcool/OMNIbus is a network manager that consolidates network status and health data from multiple network domains and subnets. Netcool/OMNIbus supervises and manages network events across a network of virtually any size and complexity. Netcool/OMNIbus gets much of its data from Tivoli Network Manager IP Edition, which collects and stores data from network layers 2 and 3. Tivoli Network Manager’s stored network knowledge includes information about both physical and logical network connections. It accurately and helpfully recognized, for instance, virtual private network (VPN), virtual local area network (VLAN), asynchronous transfer mode (ATM), frame relay and multiprotocol label switching (MPLS) connections in addition to our physical, port-to-port device connections.

Together, Netcool/OMNIbus and Network Manager gave us a clear and accurate picture of the test networks we asked them to manage, no matter how complex. Through Netcool/OMNIbus and Network Manager, we configured quite sophisticated threshold tests, such as “Emit an alert if the San Francisco WAN link’s utilization exceeds 5% on Saturdays and Sundays, 20% after 8 PM during the week, 50% during weekdays or 75% at 10 AM and 2 PM on weekdays.”
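A time-dependent threshold like the one quoted above can be sketched in a few lines of Python. This is only an illustration of the rule's logic, not Netcool's own rule syntax. The function names are hypothetical, and we assume that "at 10 AM and 2 PM" means the full hours starting at 10:00 and 14:00, and that "after 8 PM" means 20:00 until midnight.

```python
from datetime import datetime

def utilization_limit(ts: datetime) -> float:
    """Return the WAN utilization limit (percent) in force at time ts."""
    if ts.weekday() >= 5:        # Saturday (5) or Sunday (6)
        return 5.0
    if ts.hour >= 20:            # weekday evenings, 8 PM onward
        return 20.0
    if ts.hour in (10, 14):      # weekday peak hours (assumed full hours)
        return 75.0
    return 50.0                  # the rest of the weekday

def should_alert(ts: datetime, utilization_pct: float) -> bool:
    """Return True when utilization exceeds the limit for that time."""
    return utilization_pct > utilization_limit(ts)

# 60% utilization is tolerated at 10:30 on a Monday but not at 11:00
print(should_alert(datetime(2012, 3, 5, 10, 30), 60.0))
print(should_alert(datetime(2012, 3, 5, 11, 0), 60.0))
```

The order of the checks matters: the weekend and evening cases are tested first, so the weekday default of 50% applies only when no narrower window matches.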

For reliability’s sake, Netcool/OMNIbus and Network Manager monitored themselves and restarted automatically when we artificially caused a monitoring/management component to fail.

Netcool/OMNIbus and Network Manager support current and evolving standards, including ITIL®, COBIT, eTOM, IPv4 and IPv6, and use FIPS 140-2 approved cryptographic providers.

To our delight, Netcool/OMNIbus and Network Manager worked well in both mixed and pure environments when we confronted them with IPv4 and IPv6 packets.

We also noted that network-intensive organizations that use an operational support system (OSS) to track network inventory, provisioning services and the configuration of network components will appreciate Network Manager’s ability to integrate with an OSS.

Tivoli Netcool/OMNIbus and Tivoli Network Manager excelled at handling millions and even tens of millions of events per day in our tests. Moreover, for each network problem we artificially induced, Netcool/OMNIbus and Network Manager quickly and accurately sifted through and analyzed the events to distill root causes for us. Netcool/OMNIbus and Network Manager saved us the equivalent of hundreds of hours of network troubleshooting when they pinpointed the actual problem devices that were responsible for a cascade of network error messages. Netcool/OMNIbus and Network Manager even located a fault we caused in a backup data path. If the primary path had failed, the fault would’ve kept the backup path from taking over for the primary data path.
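The general idea behind topology-based root cause analysis can be shown with a small Python sketch. It is not IBM's algorithm, which is far more elaborate. It assumes a simple tree in which each device depends on one upstream device, and it treats a failed device as a root cause only when its upstream neighbor is still reachable. The device names and the `root_causes` function are hypothetical.

```python
def root_causes(upstream, down):
    """Separate root-cause failures from downstream symptoms.

    upstream maps each device to the device it depends on (None for
    the core). down is the set of devices currently unreachable.
    A down device whose upstream device is also down is treated as
    a symptom and is left out of the result.
    """
    return {device for device in down if upstream.get(device) not in down}

# Hypothetical topology: core -> router-a -> switch-1 -> two servers
topology = {
    "core": None,
    "router-a": "core",
    "router-b": "core",
    "switch-1": "router-a",
    "server-1": "switch-1",
    "server-2": "switch-1",
}

# Three devices report as unreachable, but only one is the culprit
print(root_causes(topology, {"switch-1", "server-1", "server-2"}))
```

In this example the two servers are suppressed as symptoms of the failed switch, which is the kind of filtering that turns a cascade of error messages into a single actionable alert.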

On the downside, Netcool/OMNIbus’ and Network Manager’s browser-based user interface, Netcool/Webtop, was somewhat cumbersome and not as responsive as we’d have liked. Netcool/Webtop is a Java application that displays dashboards of maps, charts, tables and event lists. To its credit, when we logged on as super-administrators, we could easily configure Netcool/Webtop to show just those dashboard components we wanted to see. However, the Netcool/Webtop user interface was a bit sluggish. In comparison, we’ve seen some complex AJAX-enabled (i.e., JavaScript-based) Web browser interfaces that were snappier and more responsive. IBM provides additional graphical tools in the form of Netcool/Desktop, a native Motif- or Windows-based client that presents an alternative view of network activity. Like Netcool/Webtop’s, Netcool/Desktop’s display is highly configurable.

IBM supplies more than 1,000 software-based Netcool Probes with Netcool/OMNIbus and Network Manager. These are lightweight agents we easily deployed across the far reaches of our network. Netcool Probes stand watch over a wide variety of network devices, servers and server processes, and they report status and health information to a central console. We also noted that organizations with vertical-market business applications can painlessly create Netcool Probes that can monitor the running of the business application to alert an administrator when, for example, the application crashes or it begins consuming excessive CPU resources. IBM ships over 600 MIBs with Netcool/OMNIbus.

Netcool/OMNIbus works hand-in-glove with help desk trouble-ticket-tracking software such as Siebel, Peregrine and, of course, Tivoli Service Request Manager, automatically opening and closing trouble tickets.

Netcool/OMNIbus and Network Manager run on Solaris, HP-UX, AIX, Windows Server 2003, Windows Server 2008, Red Hat Enterprise Linux, SLED, SUSE Linux Enterprise Server and VMware ESX for Red Hat EL.

CA Network Management
eHealth’s and NetQoS ReporterAnalyzer’s strong suits are their ability to handle myriads of diverse device types, their ability to do predictive performance analysis and the wealth of useful reports they offer. If eHealth and NetQoS ReporterAnalyzer have a weakness, it’s their consumption of computing resources. You might need a somewhat faster server, for instance, on which to run eHealth and NetQoS ReporterAnalyzer.

eHealth is CA’s enterprise-level network monitoring and management tool for finding and fixing network faults, while NetQoS ReporterAnalyzer is a network traffic analysis tool that reveals how a particular type of traffic or a specific network node is exceeding thresholds.

At an interval we could configure, eHealth polled our network devices to collect status and health data. eHealth then used a patented set of highly complex algorithms to determine which part of the network was failing or was likely to fail soon. This predictive analysis feature is a godsend for organizations that can little afford network downtime and that want to proactively stay ahead of potential network problems.

When eHealth detected a threshold breach that we created, it sent us e-mail and paged us. If we ignored the initial alerts, it escalated matters by e-mailing and paging a second tier of people. Alerts can be triggered by hard outages, such as loss of communication with a device, or by threshold breaches, for example when a WAN link’s utilization rises above, say, 75%.

We could express quite complex thresholds with eHealth, which used CA’s Time-Over-Threshold (TOT) or Deviation-From-Normal (DFN) algorithms to keep false alarms to a minimum. We could specify that we wanted to be alerted if network utilization exceeded a threshold even once, or we could specify that we wanted to be alerted only if high network utilization persisted for a specified period of time.
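The difference between alerting on a single breach and alerting only on a sustained breach can be illustrated with a short Python sketch. It is a generic time-over-threshold check, not CA's TOT algorithm, whose details we did not examine. The function name, parameters and sample data are hypothetical.

```python
def time_over_threshold(samples, limit, min_seconds):
    """Return True once a value stays above limit for min_seconds.

    samples is an iterable of (timestamp_in_seconds, value) pairs in
    time order. With min_seconds=0, a single breach is enough.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > limit:
            if breach_start is None:
                breach_start = timestamp      # breach begins here
            if timestamp - breach_start >= min_seconds:
                return True
        else:
            breach_start = None               # breach ended; reset
    return False

# Hypothetical utilization samples taken once a minute (percent)
polls = [(0, 80), (60, 70), (120, 80), (180, 85), (240, 90)]

print(time_over_threshold(polls, limit=75, min_seconds=0))    # single spike
print(time_over_threshold(polls, limit=75, min_seconds=120))  # sustained
```

Requiring the breach to persist is what keeps a brief spike, such as the first sample here, from raising a false alarm.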

eHealth’s dashboard display provided real-time status information for the network. eHealth also has a central console user interface that graphically depicts the entire network or any portion of it. Clicking on a yellow (minor alert) or red (major alert) network device drills down through eHealth’s data to reveal the nature of a problem as well as details about the problem. We liked that we could generate instant reports to help document the problem.

eHealth’s reports are informative, easy to understand and easy to produce. We used its reports to help troubleshoot problems, identify unusual network behavior for future investigation, document service level agreement (SLA) compliance and identify trends for capacity-planning purposes. Through the simple-to-use reports interface, we could select the network elements or groups of elements we wished to document, specify a chart type (Line, Bar, Stacked Line, etc.) and choose a calendar window such as “Today” or “Previous 7 Days.” We could also set up custom date and time ranges for our reports.

eHealth’s At-a-Glance Reports were our first line of defense when we needed to document a problem so we could share the nature of the problem with other network engineers. At-a-Glance Reports provide a high-level, quick view of key data, including network utilization, server utilization (CPU, memory or hard disk), the identity of a failed application and network connectivity errors.

We found eHealth’s Trend Reports made quick work of capacity planning chores. For all or any part of the network and for whatever time period we wished, we could configure and schedule reports that showed exactly the device, computer, application or network behaviors we wanted to document. We used these reports initially to produce a baseline of the network. Then, over time, we used these reports’ graphs and charts to precisely identify utilization trends that revealed the upgrades we should plan for. We also set up a number of tabular reports to document uptime and availability as well as provide utilization statistics for billing (chargeback) purposes.

We particularly liked eHealth’s report customization features, which let us produce, for example, trend reports for a specific user group and/or specific set of network resources, such as databases.

Impressively, CA includes more than 5,000 MIBs in eHealth.

eHealth and NetQoS ReporterAnalyzer run on Windows Server 2003 and Solaris.

All three of these network managers – IBM Tivoli Netcool/OMNIbus and Tivoli Network Manager IP Edition, CA eHealth and CA NetQoS ReporterAnalyzer and HP Automated Network Management Suite – are top-of-the-line, mature and highly capable tools for ensuring maximum availability, uptime and performance.

Net Results

IBM Tivoli Netcool/OMNIbus and Network Manager 8.2
IBM Corporation
Software Group
Route 100
Somers, NY 10589
Starts at $18,000
Pros: handles tens of millions of events per day; quickly and accurately distills root causes
Cons: browser-based user interface was somewhat cumbersome and not as responsive as we’d have liked

CA eHealth and NetQoS ReporterAnalyzer 6.2
CA Technologies
One CA Plaza
Islandia, NY 11749
Starts at $50,000
Pros: supports myriads of diverse device types; does predictive performance analysis; offers a wealth of useful reports
Cons: higher than expected consumption of computing resources

HP Automated Network Management Suite 9.10
Hewlett-Packard Company
3000 Hanover Street
Palo Alto, CA 94304
NNMi starts at $3,000
Pros: its modularity, its ability to monitor service level compliance and its automation of many daily tasks
Cons: None

Testbed and Methodology
We evaluated each product in several different areas: Discovery and enumeration of devices and computers, support for a variety of device manufacturers and device types, global directory integration, graphical depiction of the network, monitoring of network node status (availability), performance and health, alerts and notifications when network problems occur, automated corrective actions, maintenance of trouble tickets (or integration with a help desk tool), support for virtualized environments and the production of useful, informative reports. In particular, we wanted these reports to establish baselines, show available and unavailable devices, log device availability histories, identify trends and help us spot conditions that could result in future network problems.

Our test environment consisted of six routed Fast Ethernet subnet domains that have T1, T3 and DSL links to the Internet. We installed the network monitoring software’s server component(s) on a 4-way HP ProLiant computer alternately running Windows Server 2008 and Windows Server 2003. The 50 client computers on our network were a mix of Windows XP, Windows 2003, Windows 2008, Windows 7, Windows Vista, Red Hat Linux and Macintosh platforms. Relational databases on the network were Oracle, Sybase Adaptive Server and Microsoft SQL Server. Web servers on the network were Internet Information Services (IIS) and Apache.

Copyright (c) 2012 Network Testing Labs

