Hardware requirements to monitor a larger (3000 device) network

network-monitoring performance

I'm currently evaluating monitoring software for what is, by my standards, a large network, expected to grow to around 3000 devices. Data on the hardware requirements for scaling is hard to come by. (Edit: the devices are satellite receivers monitored by SNMP, so they require an agentless monitor. Our main concern is to identify failing devices, and we don't need a great deal of analysis.)

The 3000 devices will have about 40 data points each (120,000 in total), logged on a cycle of 5 to 10 minutes. At a 10 minute polling interval, that's 12,000 points per minute; at 5 minutes, it's 24,000. That creates two kinds of load: CPU load for the polling application and, most critically, disk write load to store those data points.
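As a quick check of that arithmetic, here is a minimal Python sketch. The function name `points_per_minute` is just an illustrative helper, and the figures are the ones quoted above.

```python
def points_per_minute(devices, points_per_device, interval_minutes):
    """Average number of data points collected per minute."""
    return devices * points_per_device / interval_minutes


# Compare both ends of the 5 to 10 minute polling cycle.
for interval in (5, 10):
    rate = points_per_minute(3000, 40, interval)
    print(f"{interval} min interval: {rate:,.0f} points/min, "
          f"{rate / 60:,.0f} points/s")
```

This is an average only. Pollers tend to work in bursts, so the peak write rate may be noticeably higher than these numbers.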

I've looked at SolarWinds Orion, Zenoss, Zabbix, and OpenNMS. We have experience with Zenoss and Orion on smaller networks of a few hundred devices. My initial impressions are:

  • Zenoss doesn't have a very efficient RRD implementation, but allows us to scale horizontally by adding collectors, which store RRD data locally.
  • Orion allows us to add polling engines, but requires a shared SQL server for the performance data.
  • Zabbix claims to scale to this level, but I've not found any useful guidance. As it uses a database for performance data, database tuning is key.
  • OpenNMS looks like the performance leader, due to an optimized RRD implementation and support for grouping.

Does anybody have experience or performance data for monitoring this scale of network?

Best Answer

OpenNMS can do the job.

For that type of environment, the key will be CPU threads and storage that can handle low-latency disk writes. I would use a standalone server rather than a VM, provide 12 or more cores, and plan around direct-attached storage that either has 6 or more spindles or can use SSDs for the OpenNMS RRD directories. OpenNMS can also be tuned on the data collection and logging fronts to make it more efficient. Reaching out to their professional services team to help with the install would be a good option.
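To see why disk write latency matters, and why the grouping the question mentions helps, here is a rough Python sketch of file updates per second. It rests on assumptions that are not from the question or from OpenNMS documentation: each RRD file gets one update per polling cycle, ungrouped storage uses one file per data point (40 per device), and grouped storage uses a hypothetical 4 files per device. Measure your own installation rather than trusting these numbers.

```python
def file_updates_per_second(devices, files_per_device, interval_minutes):
    """Average RRD file updates per second, assuming one update per file per cycle."""
    return devices * files_per_device / (interval_minutes * 60)


DEVICES = 3000
UNGROUPED_FILES = 40  # assumption: one file per data point
GROUPED_FILES = 4     # assumption: hypothetical 4 groups per device

for interval in (5, 10):
    ungrouped = file_updates_per_second(DEVICES, UNGROUPED_FILES, interval)
    grouped = file_updates_per_second(DEVICES, GROUPED_FILES, interval)
    print(f"{interval} min interval: {ungrouped:.0f} updates/s ungrouped, "
          f"{grouped:.0f} updates/s grouped")
```

Under these assumptions, a 5 minute interval works out to about 400 small random writes per second ungrouped, which is the kind of load where extra spindles or SSDs pay off.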
