Java – Monitoring C++ applications

We're implementing a new centralized monitoring solution (Zenoss). Incorporating servers, networking, and Java programs is straightforward with SNMP and JMX.

The question, however, is: what are the best practices for monitoring and managing custom C++ applications in large, heterogeneous (Solaris x86, RHEL Linux, Windows) environments?

Possibilities I see are:

Net SNMP

Advantages

single, central daemon on each server
well-known standard
easy integration into monitoring solutions
we run Net SNMP daemons on our servers already

Disadvantages:

complex implementation (MIBs, Net SNMP library)
new technology to introduce for the C++ developers

rsyslog

Advantages

single, central daemon on each server
well-known standard
unknown integration into monitoring solutions (I know they can do alerts based on text, but how well would it work for sending telemetry like memory usage, queue depths, thread capacity, etc.?)
simple implementation

Disadvantages:

possible integration issues
somewhat new technology for C++ developers
possible porting issues if we switch monitoring vendors
probably involves coming up with an ad-hoc communication protocol (or using RFC5424 structured data; I don't know if Zenoss supports that without custom Zenpack coding)
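To make the RFC 5424 structured-data idea concrete, here is a minimal sketch of how a C++ process could format telemetry as an SD-ELEMENT and hand it to the local syslog daemon. The function names and the SD-ID `telemetry@32473` are made up for illustration (32473 is the enterprise number reserved for documentation examples). It assumes a POSIX `syslog()` API, so Windows would need a different transport, and with this API the element ends up in the message text rather than in the protocol's STRUCTURED-DATA field, so the receiving side would still have to parse it.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <syslog.h>

// Escape the three characters RFC 5424 requires to be escaped inside a
// PARAM-VALUE: '"', '\' and ']'.
std::string escapeSdValue(const std::string& value) {
    std::string out;
    for (char c : value) {
        if (c == '"' || c == '\\' || c == ']') out += '\\';
        out += c;
    }
    return out;
}

// Build one SD-ELEMENT such as [telemetry@32473 queueDepth="17"].
// The SD-ID and parameter names are hypothetical, not a Zenoss convention.
std::string formatTelemetry(const std::string& sdId,
                            const std::map<std::string, std::string>& params) {
    assert(!sdId.empty());
    std::string out = "[" + sdId;
    for (const auto& kv : params) {
        out += " " + kv.first + "=\"" + escapeSdValue(kv.second) + "\"";
    }
    out += "]";
    return out;
}

// Send the element through the local syslog daemon (rsyslog in this setup).
// Call openlog() once at process start-up if a custom ident is wanted.
void sendTelemetry(const std::string& sdId,
                   const std::map<std::string, std::string>& params) {
    syslog(LOG_INFO, "%s", formatTelemetry(sdId, params).c_str());
}
```

A call such as `sendTelemetry("telemetry@32473", {{"queueDepth", "17"}, {"rssKb", "20480"}})` would log `[telemetry@32473 queueDepth="17" rssKb="20480"]`. Whether Zenoss can consume that without custom ZenPack code is still the open question.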

Embedded JMX (embed a JVM and use JNI)

Advantages

consistent management interface for both Java and C++
well-known standard
easy integration into monitoring solutions
somewhat simple implementation (we already do this today for other purposes)

Disadvantages:

complexity (JNI, thunking layer between native C++ and Java, basically writing the management code twice)
possible stability problems
requires a JVM in each process, using considerably more memory
JMX is new technology for C++ developers
each process has its own JMX port (we run a lot of processes on each machine)

Local JMX daemon, processes connect to it

Advantages

single, central daemon on each server
consistent management interface for both Java and C++
well-known standard
easy integration into monitoring solutions

Disadvantages:

complexity (basically writing the management code twice)
need to find or write such a daemon
need a protocol between the JMX daemon and the C++ process
JMX is new technology for C++ developers

CodeMesh JunC++ion

Advantages

consistent management interface for both Java and C++
well-known standard
easy integration into monitoring solutions
single, central daemon on each server when run in shared JVM mode
somewhat simple implementation (requires code generation)

Disadvantages:

complexity (code generation, requires a GUI and several rounds of tweaking to produce the proxied code)
possible JNI stability problems
requires a JVM in each process, using considerably more memory (in embedded mode)
Does not support Solaris x86 (deal breaker)
Even if it did support Solaris x86, there are possible compiler compatibility issues (we use an odd combination of STLPort and Forte on Solaris)
each process has its own JMX port when run in embedded mode (we run a lot of processes on each machine)
possibly precludes a shared JMX server for non-C++ processes (?)

Is there some reasonably standardized, simple solution I'm missing?

Given no other reasonable solutions, which of these solutions is typically used for custom C++ programs?

My gut feeling is that Net SNMP is how people do this, but I'd like others' input and experience before I make a decision.

Best Answer

I'm not super familiar with Zenoss, but when I used Nagios for this sort of thing we'd make the C/C++ process listen on a socket and write a custom Nagios plugin that would query it for diagnostic and status information.

The first step is to choose the library you want to use to make your process listen. Something like C++ Socket Library will do for that. Nothing complicated there, just make the process listen.

Then you have to define the response your process will send for a particular stimulus. In practice (at least with Nagios) this meant defining the 'service' and then sending the process the signal that corresponded to that service. The simplest thing you can do is create a 'process ping': just see whether you can successfully connect to the running process. If you can, then the custom Nagios plugin knows that at least the process is still alive.
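As a rough illustration of the listening side, here is a sketch using plain POSIX sockets instead of a wrapper library. The function names are invented for this example, it binds to loopback only to stay self-contained, and Windows would need the Winsock equivalents. A real process would run the accept loop on its own thread or fold it into its existing event loop.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>
#include <netinet/in.h>
#include <string>
#include <sys/socket.h>
#include <unistd.h>

// Open a TCP listener on the loopback interface. Port 0 asks the OS for any
// free port. Returns the listening descriptor, or -1 on failure.
int openStatusListener(std::uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        listen(fd, 8) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Report which port the listener ended up on (useful when 0 was requested).
std::uint16_t boundPort(int listenFd) {
    sockaddr_in addr{};
    socklen_t len = sizeof(addr);
    if (getsockname(listenFd, reinterpret_cast<sockaddr*>(&addr), &len) < 0) return 0;
    return ntohs(addr.sin_port);
}

// Accept a single connection, send one status line, and close it.
// Returns true if the whole line was written.
bool answerOnePing(int listenFd, const std::string& status) {
    assert(listenFd >= 0);
    int client = accept(listenFd, nullptr, nullptr);
    if (client < 0) return false;
    const std::string line = status + "\n";
    ssize_t written = write(client, line.data(), line.size());
    close(client);
    return written == static_cast<ssize_t>(line.size());
}
```

The status line here is just free text such as "OK". Anything richer (queue depths, memory usage) means agreeing on a small request/response format between the process and the plugin.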

There's much more sophisticated stuff you can do, but the idea is simple enough. You can write your own little library of process-listening code, encapsulated in objects, and pull it into your custom C++ code in a standardized manner whenever you build one (or all) of your executables.

My understanding is Zenoss can do this too.

Since Zenoss is written in Python, you'll probably write your custom plugin for it using something like Twisted to connect to your listening C++ executable.
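The checking side doesn't have to be Python. For comparison, here is a sketch of the 'process ping' check written in C++ so it stays in the language of the monitored applications. It follows the Nagios plugin convention of one status line on stdout plus an exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name and message texts are made up, and I'm assuming your monitoring system can run a Nagios-style command check, so verify that for Zenoss before relying on it.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Exit codes used by Nagios-style plugins.
enum PluginStatus {
    STATUS_OK = 0,
    STATUS_WARNING = 1,
    STATUS_CRITICAL = 2,
    STATUS_UNKNOWN = 3
};

// 'Process ping': the check passes if a TCP connection to the process can be
// established. Prints one status line and returns the matching exit code.
// Note: connect() blocks here; a production check would add a timeout.
PluginStatus pingProcess(const char* ipv4, std::uint16_t port) {
    assert(ipv4 != nullptr);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1) {
        std::printf("PROCESS UNKNOWN - bad address %s\n", ipv4);
        return STATUS_UNKNOWN;
    }
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        std::printf("PROCESS UNKNOWN - cannot create socket\n");
        return STATUS_UNKNOWN;
    }
    int rc = connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    close(fd);
    if (rc != 0) {
        std::printf("PROCESS CRITICAL - cannot connect to %s:%u\n",
                    ipv4, static_cast<unsigned>(port));
        return STATUS_CRITICAL;
    }
    std::printf("PROCESS OK - %s:%u is accepting connections\n",
                ipv4, static_cast<unsigned>(port));
    return STATUS_OK;
}
```

A small `main` that parses the address and port from the command line and returns the result of `pingProcess` is all that's needed to turn this into a plugin executable.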
