This is a summary of a question I posed on the nagios-users mailing list.
In a distributed environment, we want the slave Nagios servers to do the alerting. The Nagios documentation says that the master should do the notifications as this is the central point of control, but we think there are two major limitations:
- the slave is not autonomous – if the connection to the master goes down, no notifications are sent
- with slaves in different countries and local operators, the paging of notifications shouldn’t come from a central server
However, we had a problem. Some forms of notifications should be run on the master, such as RSS (we will be releasing this separate addon next week) or helpdesk integration. Another limitation is that we only allow NSCA communication from the slaves to the master.
The mailing list came to the rescue. Thanks to everyone who responded.
Marc Powell suggested putting logic in the notification scripts on the slave: check whether the master is up and, if so, forward the notification to the master; otherwise the slave sends the notification itself. But I discounted that because I’m opposed to putting “should I actually notify or not” type of logic in the notification scripts – there’s just extra code to support and this is what Nagios should be doing.
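For illustration only, the routing decision Marc describes might look roughly like the Python sketch below. The function names, the TCP probe and the callbacks are assumptions made for this example; they are not part of Nagios or of Marc's own scripts. Port 5667 is simply the conventional NSCA default.

```python
import socket


def master_is_up(host, port=5667, timeout=3.0):
    """Return True if a TCP connection to the master can be opened.

    Port 5667 is the conventional NSCA port; adjust it to your setup.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def route_notification(message, forward_to_master, notify_locally, is_up):
    """Forward the notification if the master answers, else notify locally.

    All three callables are hypothetical hooks supplied by the caller.
    """
    if is_up():
        forward_to_master(message)
        return "forwarded"
    notify_locally(message)
    return "local"
```

Every notification script on the slave would have to carry a branch like this, which is exactly the extra code we did not want to support.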
Patrick Morris runs his slaves with notifications switched off and has each slave check whether the master is working OK. If that check fails, notifications are switched on for that slave. I like this idea, although it requires pager notifications on the master to be “distributed” so that the master forwards requests to a slave for local dispatch. However, Robert King raised one possible problem in a thread called “Forcing renotification of existing states”: switching on a slave’s global notifications will miss services that are already in a non-OK state at the time of the switchover.
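As a rough sketch of how such a switch could be wired up, the Python below writes the standard ENABLE_NOTIFICATIONS or DISABLE_NOTIFICATIONS external command to the slave's command file. The helper names are made up for this example, the location of the command file is left to the caller, and how the health of the master is determined is assumed to happen elsewhere. This is not Patrick's actual implementation.

```python
import time


def external_command(name, now=None):
    """Format a Nagios external command line: "[<epoch seconds>] <NAME>"."""
    timestamp = int(time.time() if now is None else now)
    return "[%d] %s\n" % (timestamp, name)


def notification_command(master_ok):
    """Pick the global notification command for the slave.

    While the master is healthy the slave stays quiet; once the check of
    the master fails, the slave starts notifying on its own.
    """
    return "DISABLE_NOTIFICATIONS" if master_ok else "ENABLE_NOTIFICATIONS"


def apply_master_state(master_ok, command_file, now=None):
    """Write the chosen command to the slave's external command file.

    The command file is normally a named pipe read by the Nagios daemon.
    """
    line = external_command(notification_command(master_ok), now)
    with open(command_file, "w") as handle:
        handle.write(line)
    return line
```

Note that this only flips the global switch. It does nothing about services that were already in a non-OK state when the switch happened, which is the gap Robert King pointed out.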
After thinking about it some more, we decided to create multiple contacts per person. On the slaves, we have the usual contact called user, but with host/service-notification commands of email and pager (if desired). However, it got complicated on the master because the usual user contact had to have no notification options. We then created two extra contacts:
- user/distprofile – for only master generated notifications
- user/masterprofile – for the usual notifications on the master
It’s all a bit ugly. The big downside is that there are many more object definitions – for each contact group, we also needed to create corresponding /distprofile and /masterprofile groups. However, since we generate the configuration, the pain is a one-off.
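To give a feel for what generating these definitions involves, here is a minimal Python sketch that renders the three master-side contacts for one person. It is not our actual generator. The function names, the notification options chosen for the two profiles and the command names used in the usage note are assumptions for illustration, and the output is deliberately partial: a real contact definition needs further directives, such as notification periods.

```python
def render_contact(name, host_options, service_options,
                   host_commands, service_commands):
    """Render a partial Nagios contact definition as text."""
    rows = [
        ("contact_name", name),
        ("host_notification_options", host_options),
        ("service_notification_options", service_options),
    ]
    if host_commands:
        rows.append(("host_notification_commands", ",".join(host_commands)))
    if service_commands:
        rows.append(("service_notification_commands", ",".join(service_commands)))
    body = "\n".join("    %-32s%s" % row for row in rows)
    return "define contact {\n%s\n}\n" % body


def master_contacts(user, profiles):
    """Return the plain contact plus one contact per profile.

    `profiles` maps a profile suffix to (host_commands, service_commands).
    The plain contact gets the option "n", meaning no notifications.
    """
    contacts = [render_contact(user, "n", "n", [], [])]
    for suffix, (host_commands, service_commands) in profiles.items():
        contacts.append(render_contact(user + "/" + suffix, "d,u,r", "w,u,c,r",
                                       host_commands, service_commands))
    return contacts
```

Calling `master_contacts("user", {"distprofile": (["host-notify-by-rss"], ["service-notify-by-rss"]), "masterprofile": (["host-notify-by-email"], ["service-notify-by-email"])})` would return three definitions: user, user/distprofile and user/masterprofile. The same loop would have to be repeated for the contact groups.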
Looking long term, we think the solution is for Nagios to implement some sort of contact profile, defined like this:
define contactprofile {
    contact_name                    contact_name
    contactgroups                   contactgroup_names
    host_notification_options       [d,u,r,f,n]
    service_notification_options    [w,u,c,r,f,n]
    host_notification_commands      command
    service_notification_commands   command
}
The contact is left with static definitions like email and pager, while the contactprofile can be sliced and diced for specific host/services. A lovely feature of this is that you can define warnings to be sent as email, whereas criticals are paged.
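Since contactprofile does not exist in Nagios, the Python below is only a thought experiment about how such profiles might be resolved at notification time. The class, the lookup function and the command names in the usage note are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContactProfile:
    """Hypothetical model of one proposed contactprofile object."""
    contact_name: str
    service_notification_options: frozenset
    service_notification_commands: tuple


def commands_for(profiles, contact_name, state):
    """Collect the commands of every profile of the contact that accepts
    the given service state letter (for example "w" or "c")."""
    commands = []
    for profile in profiles:
        if (profile.contact_name == contact_name
                and state in profile.service_notification_options):
            commands.extend(profile.service_notification_commands)
    return commands
```

With one profile `ContactProfile("user", frozenset("w"), ("notify-by-email",))` and another `ContactProfile("user", frozenset("c"), ("notify-by-pager",))`, a warning for user would resolve to the email command and a critical to the pager command.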
But that’s some way into the future…
