Aug 05

NRPE is great for getting plugin information from a remote host. We wanted to use it to get passive data regarding events, such as syslog entries that SEC had highlighted. This meant we needed two things: multi-line support and larger amounts of output.

Multi-line is already in NRPE 2.12 – this was added by Matthias Flacke last year. However, the limit for data is 1K.

We wanted to be able to bump that figure up to 16K. The limit comes from MAX_PACKETBUFFER_LENGTH, a constant defined in common.h and set to 1024. We found that increasing this value did return more data, but it caused two problems:

  • it broke backwards compatibility
  • it increased the size of every packet

The second had an impact on the network. Instead of 1K packets being sent between client and server, every packet was now 16K, even if the data it contained was small.
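To make the fixed-size cost concrete, here is a rough sketch. It assumes a simplified NRPE v2 packet layout (version, type, CRC32, result code, then a NUL-padded output buffer); the helper names are made up for illustration, and a real build may add compiler padding, so treat the byte counts as approximate.

```python
import struct

# Assumed, simplified packet header: version, type, CRC32, result code.
# The output buffer that follows is always sent at its full length.
HEADER_FORMAT = "!hhIh"


def packet_size(buffer_length: int) -> int:
    """Bytes on the wire for one packet with the given buffer length."""
    return struct.calcsize(f"{HEADER_FORMAT}{buffer_length}s")


def build_packet(output: bytes, buffer_length: int) -> bytes:
    """Pack plugin output into a fixed-size packet (CRC left at 0 here)."""
    # The "s" format pads short output with NUL bytes up to buffer_length.
    return struct.pack(f"{HEADER_FORMAT}{buffer_length}s", 2, 2, 0, 0, output)


small_output = b"OK - 1 event"
for length in (1024, 16 * 1024):
    packet = build_packet(small_output, length)
    print(f"buffer={length} payload={len(small_output)} on the wire={len(packet)}")
```

The same 12 bytes of plugin output cost roughly sixteen times as much traffic once the buffer is raised to 16K.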

The first was worse: it meant you had to update the client (check_nrpe) and the server (nrpe) at the same time. With only one of them changed, you’d get lots of NRPE errors in Nagios.

So we’ve designed a compatible way: we’ve added a new packet type called RESPONSE_PACKET_WITH_MORE.

The idea is that check_nrpe will see if the packet returned is of the type RESPONSE_PACKET_WITH_MORE. If so, it will read subsequent packets and append their contents to the existing data, until it gets a RESPONSE_PACKET. So to read 16K worth of data, check_nrpe reads 16 x 1K packets. Of course, only updated nrpe daemons will send the new packet type, so an updated check_nrpe remains fully backwards compatible with existing nrpe daemons.
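The reassembly logic can be sketched roughly as below. This is an illustration, not the code from the patch: packets are modelled as (type, payload) pairs, the function names are hypothetical, the value 3 for RESPONSE_PACKET_WITH_MORE is an assumption, and the NUL terminator that a real buffer carries is ignored.

```python
from typing import Iterable, Iterator, List, Tuple

MAX_PACKETBUFFER_LENGTH = 1024   # per-packet buffer size, unchanged
RESPONSE_PACKET = 2              # final (or only) packet of a response
RESPONSE_PACKET_WITH_MORE = 3    # assumed value: more packets follow

Packet = Tuple[int, bytes]


def split_output(output: bytes, chunk: int = MAX_PACKETBUFFER_LENGTH) -> Iterator[Packet]:
    """Daemon side: mark every chunk except the last as 'with more'."""
    chunks = [output[i:i + chunk] for i in range(0, len(output), chunk)] or [b""]
    for part in chunks[:-1]:
        yield RESPONSE_PACKET_WITH_MORE, part
    yield RESPONSE_PACKET, chunks[-1]


def read_response(packets: Iterable[Packet]) -> bytes:
    """Client side: append payloads until a plain RESPONSE_PACKET arrives."""
    collected: List[bytes] = []
    for packet_type, payload in packets:
        if packet_type not in (RESPONSE_PACKET, RESPONSE_PACKET_WITH_MORE):
            raise ValueError(f"unexpected packet type {packet_type}")
        collected.append(payload)
        if packet_type == RESPONSE_PACKET:
            return b"".join(collected)
    raise ValueError("connection ended before the final RESPONSE_PACKET")


big_output = b"x" * (16 * 1024)
packets = list(split_output(big_output))
print(len(packets), read_response(packets) == big_output)
```

An unpatched daemon only ever sends a single RESPONSE_PACKET, so the same client loop returns after the first packet, which is why updating check_nrpe first is safe.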

The patch is here. We’ve also cleaned up some of the graceful_close calls.

Now the process to update your NRPE agents would be:

  1. update the central check_nrpe, then
  2. update your agents at your leisure

And you won’t get any alerts during this period!

Note: during testing, we found that the limit for returned data on some Linux kernels was 4K, even though nrpe was coded with 16K as the limit. This is due to kernel limitations in using pipe() for the interprocess communication.
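As a general illustration of working with pipes (not a description of how nrpe reads its pipe, nor a fix for the limit above), the sketch below reads until end-of-file rather than relying on one read returning everything. The function names are hypothetical and the writer thread merely stands in for a child process.

```python
import os
import threading


def read_all(fd: int, chunk: int = 4096) -> bytes:
    """Read from a pipe until EOF instead of trusting a single read."""
    parts = []
    while True:
        data = os.read(fd, chunk)
        if not data:  # an empty read means the writer closed its end
            return b"".join(parts)
        parts.append(data)


def run_writer(fd: int, payload: bytes) -> None:
    """Stand-in for the child process: write everything, then close."""
    with os.fdopen(fd, "wb") as pipe_out:
        pipe_out.write(payload)


read_fd, write_fd = os.pipe()
payload = b"x" * (16 * 1024)
writer = threading.Thread(target=run_writer, args=(write_fd, payload))
writer.start()
received = read_all(read_fd)
writer.join()
os.close(read_fd)
print(len(received))
```

Because the reader drains the pipe in a loop, the amount it can collect does not depend on how much the kernel buffers at once.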

5 Responses to “Enhancing NRPE for large output”

  1. Steffen Poulsen says:

    Hi Ton,

    We just tried out this patch, and it appears to work just as expected – very nice work :-)

    We were wondering – what is the current upstream status on this patch? Is it perhaps already in the NRPE trunk so that we could expect it around in 2.13?

    - Else we will gladly help you raise a voice to make this happen asap :-)

    Best regards, Steffen Poulsen

  2. tonvoon says:

    Hi Steffen,

    I don’t think this has been applied upstream. I think Ethan is not keen on it because NRPE is a pretty basic protocol, which is true, so I guess without a clear direction there will continue to be little changes like this in future.

    Ton

  3. arnaud says:

    very nice work !

    But I can’t use NRPE 0.8.0.2 for Windows now that I have patched NRPE 2.12 on my Debian Nagios server.

  4. Daniel Wittenberg says:

    I’m having this issue too, and while the patch works for *nix clients, has anyone looked at adding this to NSClient too? NSClient just lets you define a string_length in the .ini file so you can change it to whatever, but obviously still suffers from the “big bang” theory of having to do all at once.

    Dan

  5. dwittenberg says:

    Anyone looked at applying this to NSClient so it can communicate with Windows too?
