What's New in Nagios Core 3.x
Important: Make sure you read through the documentation and the FAQs at support.nagios.com before sending a question to the mailing lists.
Change Log
The change log for Nagios can be found online at http://www.nagios.org/development/history or in the Changelog file in the root directory of the source code distribution.
Changes and New Features
- Documentation:
- Doc updates - I'm slowly making my way through rewriting most all portions of the documentation. This is going to take a while, as (1) there's a lot of documentation and (2) writing documentation is not my favorite thing in the world. Expect some portions of the docs to be different than others for a while. I hope the changes I'm making will make things clearer/easier for new and seasoned Nagios users alike.
- Macros:
- New macros - New macros have been added, including: $TEMPPATH$, $LONGHOSTOUTPUT$, $LONGSERVICEOUTPUT$, $HOSTNOTIFICATIONID$, $SERVICENOTIFICATIONID$, $HOSTEVENTID$, $SERVICEEVENTID$, $SERVICEISVOLATILE$, $LASTHOSTEVENTID$, $LASTSERVICEEVENTID$, $HOSTDISPLAYNAME$, $SERVICEDISPLAYNAME$, $MAXHOSTATTEMPTS$, $MAXSERVICEATTEMPTS$, $TOTALHOSTSERVICES$, $TOTALHOSTSERVICESOK$, $TOTALHOSTSERVICESWARNING$, $TOTALHOSTSERVICESUNKNOWN$, $TOTALHOSTSERVICESCRITICAL$, $CONTACTGROUPNAME$, $CONTACTGROUPNAMES$, $CONTACTGROUPALIAS$, $CONTACTGROUPMEMBERS$, $NOTIFICATIONRECIPIENTS$, $NOTIFICATIONISESCALATED$, $NOTIFICATIONAUTHOR$, $NOTIFICATIONAUTHORNAME$, $NOTIFICATIONAUTHORALIAS$, $NOTIFICATIONCOMMENT$, $EVENTSTARTTIME$, $HOSTPROBLEMID$, $LASTHOSTPROBLEMID$, $SERVICEPROBLEMID$, $LASTSERVICEPROBLEMID$, $LASTHOSTSTATE$, $LASTHOSTSTATEID$, $LASTSERVICESTATE$, $LASTSERVICESTATEID$. Two special on-demand time macros have also been added: $ISVALIDTIME:$ and $NEXTVALIDTIME:$.
- Removed macros - The old $NOTIFICATIONNUMBER$ macro has been deprecated in favor of new $HOSTNOTIFICATIONNUMBER$ and $SERVICENOTIFICATIONNUMBER$ macros.
- Changes - The $HOSTNOTES$ and $SERVICENOTES$ macros may now contain macros themselves, just like the $HOSTNOTESURL$, $HOSTACTIONURL$, $SERVICENOTESURL$ and $SERVICEACTIONURL$ macros.
- Macros are normally available as environment variables when check, event handler, notification, and other commands are run. This can be rather CPU intensive in large Nagios installations, so you can disable this behavior with the enable_environment_macros option.
- Macro information can be found here.
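As a rough illustration of the environment-variable behavior described above, the following Python sketch shows how a notification or event handler script might read macros. It assumes Nagios exports each macro as an environment variable named NAGIOS_ followed by the macro name (for example NAGIOS_HOSTNAME). The helper name read_macros and the empty-string fallback are hypothetical, and the variables will only be present when enable_environment_macros is turned on.

```python
import os


def read_macros(names, environ=None, default=""):
    """Return {macro_name: value} for the requested Nagios macros.

    Assumption: macros are exported as NAGIOS_<MACRONAME> environment
    variables. Variables that are not set fall back to `default`.
    """
    env = os.environ if environ is None else environ
    return {name: env.get("NAGIOS_" + name, default) for name in names}


if __name__ == "__main__":
    # Macro names taken from the list of new macros above.
    wanted = ["HOSTNAME", "SERVICEDISPLAYNAME", "LONGSERVICEOUTPUT"]
    for name, value in read_macros(wanted).items():
        print(f"{name}={value!r}")
```

Passing a plain dictionary as environ makes the helper easy to try outside of Nagios.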
- Scheduled Downtime:
- Scheduled downtime entries are no longer stored in their own file (previously specified with a downtime_file directive in the main configuration file). Current and retained scheduled downtime entries are now stored in the status file and retention file, respectively.
- Comments:
- Host and service comments are no longer stored in their own file (previously specified with a comment_file directive in the main configuration file). Current and retained comments are now stored in the status file and retention file, respectively.
- Acknowledgement comments that are marked as non-persistent are now only deleted when the acknowledgement is removed. They were previously automatically deleted when Nagios restarted, which was not ideal.
- State Retention Data:
- Flap Detection:
- Added flap_detection_options directive to host and service definitions to allow you to specify what host/service states should be used by the flap detection logic (by default all states are used).
- Percent state change and state history are now retained and recorded even when flap detection is disabled.
- Hosts and services are immediately checked for flapping when flap detection is enabled program-wide.
- Hosts and services that are flapping when flap detection is disabled program-wide are now logged.
- More information on flap detection can be found here.
- External Commands:
- Added a new PROCESS_FILE external command to allow processing of external commands found in an external (regular) file. Useful for processing large amounts of passive checks with long output, or for scripting regular commands. More information can be found here.
- Custom commands may now be submitted to Nagios. Custom command names are prefixed with an underscore and are not processed internally by the Nagios daemon. They may, however, be processed by a loaded NEB module.
- The check_external_commands option is now enabled by default, which means Nagios is configured to check for external commands "out of the box". All 2.x and earlier versions of Nagios had this option disabled by default.
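The external command items above can be sketched in Python as follows. This is only an illustration: it assumes the usual external command line layout of a bracketed Unix timestamp followed by the command name and semicolon-separated arguments, and it assumes PROCESS_FILE takes a file name and a delete flag as its two arguments. The helper name, the file path, and the custom command _RESTART_APP are hypothetical.

```python
import time


def format_external_command(name, args=(), timestamp=None):
    """Build one external command line.

    Assumed layout: "[<unix time>] <COMMAND>;<arg1>;<arg2>...\n".
    """
    ts = int(time.time()) if timestamp is None else int(timestamp)
    fields = [name] + [str(arg) for arg in args]
    return "[{}] {}\n".format(ts, ";".join(fields))


if __name__ == "__main__":
    # Ask Nagios to process commands stored in a regular file.
    # Assumption: the second argument is a delete-after-processing flag.
    print(format_external_command("PROCESS_FILE", ["/tmp/passive_results.cmd", 1]), end="")

    # A custom command: the leading underscore means the daemon itself
    # ignores it, but a loaded NEB module could act on it.
    print(format_external_command("_RESTART_APP", ["web01", "tomcat"]), end="")
```

The resulting lines would then be written to the external command file configured for your installation.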
- Status Data:
- Contact status information (last notification times, notifications enabled/disabled, etc.) is now saved in the status and retention files, although it is not processed by the CGIs.
- Embedded Perl:
- Added new enable_embedded_perl and use_embedded_perl_implicitly variables to control use of the embedded Perl interpreter.
- Perl scripts/plugins can now explicitly tell Nagios whether or not they should be run under the embedded Perl interpreter. This is useful if you have troublesome scripts that don't function well under the ePN.
- More information about these new options can be found here.
- Adaptive Monitoring:
- The check timeperiod for hosts and services can now be modified on-the-fly with the appropriate external command (CHANGE_HOST_CHECK_TIMEPERIOD or CHANGE_SVC_CHECK_TIMEPERIOD). Look here for available adaptive monitoring commands.
- Notifications:
- A first_notification_delay option has been added to host and service definitions to (what else) introduce a delay between when a host/service problem first occurs and when the first problem notification goes out. In previous versions you had to use some mighty config-fu with escalations to accomplish this. Now this feature is available to normal mortals.
- Notifications are now sent out for hosts/services that are flapping when flap detection is disabled on a host- or service-specific basis or on a program-wide basis. The $NOTIFICATIONTYPE$ macro will be set to "FLAPPINGDISABLED" in this situation.
- Notifications can now be sent out when scheduled downtime starts, ends, and is cancelled for hosts and services. The $NOTIFICATIONTYPE$ macro will be set to "DOWNTIMESTART", "DOWNTIMEEND", or "DOWNTIMECANCELLED", respectively. In order to receive notifications on scheduled downtime events, specify "s" or "downtime" in your contact, host, and/or service notification options.
- More information on notifications can be found here.
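A notification script may want to treat the new $NOTIFICATIONTYPE$ values differently from ordinary problem notifications. The Python sketch below is one possible way to do that; the function name and the wording of the messages are hypothetical, and only the four type strings come from the items above.

```python
# Human-readable reasons for the notification types introduced above.
# The message wording is illustrative only.
NEW_TYPE_REASONS = {
    "FLAPPINGDISABLED": "flap detection was disabled while it was flapping",
    "DOWNTIMESTART": "scheduled downtime has started",
    "DOWNTIMEEND": "scheduled downtime has ended",
    "DOWNTIMECANCELLED": "scheduled downtime was cancelled",
}


def describe_notification(notification_type, target):
    """Return a one-line summary for a notification about `target`."""
    reason = NEW_TYPE_REASONS.get(notification_type)
    if reason is None:
        # Any other type is passed through unchanged.
        return f"{target}: {notification_type} notification"
    return f"{target}: {reason}"


if __name__ == "__main__":
    print(describe_notification("DOWNTIMESTART", "web01"))
    print(describe_notification("PROBLEM", "web01/HTTP"))
```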
- Object Definitions:
- Service dependencies can now be created to easily define "same host" dependencies for different services on one or more hosts. (Read more)
- Extended host and service definitions (hostextinfo and serviceextinfo, respectively) have been deprecated. All values from extended definitions have been merged with host or service definitions, as appropriate. Nagios 3 will continue to read and process older extended information definitions, but will log a warning. Future versions of Nagios (4.x and later) will not support separate extended info definitions.
- New hostgroup_members, servicegroup_members, and contactgroup_members directives have been added to hostgroup, servicegroup, and contactgroup definitions, respectively. This allows you to include hosts, services, or contacts from sub-groups in your group definitions.
- New notes, notes_url, and action_url directives have been added to hostgroup and servicegroup definitions.
- Contact definitions have the new host_notifications_enabled, service_notifications_enabled, and can_submit_commands directives to better control notifications and determine whether or not they can submit commands through the web interface.
- Host and service dependencies now support an optional dependency_period directive. This allows you to limit the times during which dependencies are valid.
- The parallelize directive in service definitions is now deprecated and no longer used. All service checks are run in parallel in Nagios 3.
- There are no longer any inherent limitations on the length of host names or service descriptions.
- Extended regular expressions are now used if you enable the use_regexp_matching config option. Regular expression matching is only used in certain object definition directives that contain *, ?, +, or \..
- A new initial_state directive has been added to host and service definitions, so you can tell Nagios that a host/service should default to a specific state when Nagios starts, rather than UP or OK (which is still the default).
- Object Inheritance:
- You can now inherit object variables/values from multiple templates by specifying more than one template name in the use directive of object definitions. This can allow for some very powerful (and complex) inheritance setups. (Read more)
- Services now inherit contact groups, notification interval, and notification period from their associated host if not otherwise specified. (Read more)
- Host and service escalations now inherit contact groups, notification interval, and escalation timeperiod from their associated host or service if not otherwise specified. (Read more)
- String variables in host, service, and contact definitions can now be prevented from being inherited by specifying a value of "null" (without quotes) for the value of the variable. (Read more)
- Most string variables in local object definitions can now be appended to the string values that are inherited. This is quite handy in large configurations. (Read more)
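The inheritance rules above can be modeled with a short Python sketch. It is a simplified model under stated assumptions, not the Nagios implementation: objects are plain dictionaries, the first template named in use takes precedence over later ones, a value of null blocks inheritance, and a leading + on a local value marks it as an addition to the inherited value (joined with a comma). The function name and the sample definitions are hypothetical.

```python
def resolve(name, definitions):
    """Return the effective directives for the object or template `name`.

    `definitions` maps names to {directive: value} dictionaries.
    """
    own = definitions[name]
    parents = [p.strip() for p in own.get("use", "").split(",") if p.strip()]

    inherited = {}
    # Apply templates last-to-first so the first one listed wins.
    for parent in reversed(parents):
        inherited.update(resolve(parent, definitions))

    result = dict(inherited)
    for key, value in own.items():
        if key == "use":
            continue
        if value == "null":
            # Explicitly refuse the inherited value.
            result.pop(key, None)
        elif value.startswith("+"):
            # Append the local value to whatever was inherited.
            base = inherited.get(key)
            extra = value[1:]
            result[key] = f"{base},{extra}" if base else extra
        else:
            result[key] = value
    return result


if __name__ == "__main__":
    definitions = {
        "generic": {"contact_groups": "admins", "check_interval": "5", "notes": "base"},
        "web": {"contact_groups": "webteam"},
        "host1": {"use": "web,generic", "contact_groups": "+oncall", "notes": "null"},
    }
    print(resolve("host1", definitions))
```

In this example host1 takes contact_groups from web (listed first), appends oncall to it, inherits check_interval from generic, and drops notes entirely.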
- Performance Improvements:
- Added the ability to precache object config files and to exclude circular path detection checks from the verification process. This can speed up Nagios start time immensely in large environments! Read more here.
- A new use_large_installation_tweaks option has been added that should improve performance in large Nagios installations. Read more about this here.
- A number of internal improvements have been made with regards to how Nagios deals with internal data structures and object (e.g. host and service) relationships. These improvements should result in a speedup for larger installations.
- New external_command_buffer_slots option has been added to allow you to more easily scale Nagios in large environments. For best results you should consider using MRTG to graph Nagios' usage of buffer slots over time.
- Plugin Output:
- Multiline plugin output is now supported for host and service checks. Hooray! The plugin API has been updated to support multiple lines of output in a manner that retains backward compatibility with older plugins. Additional lines of output (aside from the first line) are now stored in new $LONGHOSTOUTPUT$ and $LONGSERVICEOUTPUT$ macros.
- The maximum length of plugin output has been increased to 4K (from around 350 bytes in previous versions). This 4K limit has been arbitrarily chosen to protect against runaway plugins that dump back too much data to Nagios.
- More information on the plugins, multiline output, and max plugin output length can be found here.
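To make the multiline output handling above more concrete, here is a small Python sketch of how plugin output could be split into a first line and the remaining "long" lines, with the 4K cap applied first. It is a simplification under stated assumptions: the cap is taken to be 4096 bytes, performance data handling is deliberately left out, and the function name is hypothetical.

```python
MAX_PLUGIN_OUTPUT = 4096  # assumed byte value of the 4K limit mentioned above


def split_plugin_output(raw, limit=MAX_PLUGIN_OUTPUT):
    """Split plugin output into (first_line, long_output).

    The first line corresponds to the regular output macro; the remaining
    lines correspond to $LONGHOSTOUTPUT$ / $LONGSERVICEOUTPUT$.
    """
    # Truncate on bytes, then drop any partially cut character.
    data = raw.encode("utf-8")[:limit]
    text = data.decode("utf-8", errors="ignore")

    lines = text.splitlines()
    if not lines:
        return "", ""
    return lines[0], "\n".join(lines[1:])


if __name__ == "__main__":
    first, rest = split_plugin_output("DISK WARNING\n/ 91% used\n/var 40% used\n")
    print("output:", first)
    print("long output:", rest)
```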
- Service Checks:
- Nagios now checks for orphaned service checks by default.
- Added a new enable_predictive_service_dependency_checks option to control whether or not Nagios will initiate predictive checks of services that are being depended upon (in dependency definitions). Predictive checks help ensure that the dependency logic is as accurate as possible. (Read more)
- A new cached service check feature has been implemented that can significantly improve performance for many people. Instead of executing a plugin to check the status of a service, Nagios can often use a cached service check result instead. More information on this can be found here.
- Host Checks:
- Host checks are now run in parallel! Host checks used to be run in a serial fashion, which meant they were a major holdup in terms of performance. No longer! (Read more)
- Host check retries are now performed like service check retries. That is to say, host definitions now have a new retry_interval that specifies how much time to wait before trying the host check again. :-)
- Regularly scheduled host checks no longer hinder performance. In fact, they can help to increase performance with the new cached check logic (see below).
- Added a new check_for_orphaned_hosts option to enable checks of orphaned host checks. This is needed now that host checks are run in parallel.
- Added a new enable_predictive_host_dependency_checks option to control whether or not Nagios will initiate predictive checks of hosts that are being depended upon (in dependency definitions). Predictive checks help ensure that the dependency logic is as accurate as possible. (Read more)
- A new cached host check feature has been implemented that can significantly improve performance for many people. Instead of executing a plugin to check the status of a host, Nagios can often use a cached host check result instead. More information on this can be found here.
- Passive host checks that have a DOWN or UNREACHABLE result can now be automatically translated to their proper state from the point of view of the Nagios instance that receives them. This is very useful in failover and distributed monitoring setups. More information on passive host check state translation can be found here.
- Passive host checks normally put a host into a HARD state. This can now be changed by enabling the passive_host_checks_are_soft option.
- Freshness checks:
- A new freshness_threshold_latency option has been added to allow you to specify the number of seconds that should be added to any host or service freshness threshold that is automatically calculated by Nagios.
- IPC:
- The IPC mechanism that is used to transfer host/service check results back to the Nagios daemon from (grand)child processes has changed! This should help to reduce load/latency issues related to processing large numbers of passive checks in distributed monitoring environments.
- Check results are now transferred by writing them to files in the directory specified by the check_result_path option. Files that are older than the max_check_result_file_age option will be mercilessly deleted without further processing.
- Timeperiods:
- Timeperiods were overdue for a major overhaul and have finally been extended to allow for date exceptions, skip dates (every 3 days), etc! This should help you out when defining notification timeperiods for pager rotations.
- More information on the new timeperiod directives can be found here and here.
- Event Broker:
- Updated NEB API version
- Modified callback for adaptive program status data
- Added callback for adaptive contact status data
- Added precheck callbacks for hosts and services to allow modules to cancel/override internal host/service checks.
- Web Interface:
- The main splash pages of the web interface are now PHP pages. This will require that you install/enable PHP support on your system if it isn't already.
- Hostgroup and servicegroup summaries now show important/unimportant problem breakdowns like the TAC CGI.
- Minor layout changes to host and service detail views in extinfo CGI.
- New check statistics have been added to the "Performance Info" screen.
- Added Splunk integration options to various CGIs. Integration is controlled by the enable_splunk_integration and splunk_url options in the CGI config file.
- Added new notes_url_target and action_url_target options to control what frame notes and action URLs are opened in.
- Added new lock_author_names option to prevent alteration of author names when users submit comments, acknowledgements, and scheduled downtime.
- Debugging Info:
- The DEBUGx compile options available in the configure script have been removed.
- Debugging information can now be written to a separate debug file, which is automatically rotated when it reaches a user-defined size. This should make debugging problems much easier, as you don't need to recompile Nagios. Full support for writing debugging information to file is being added during the alpha development phase, so it may not be complete when you try it.
- Variables that affect the debug log are debug_file, debug_level, debug_verbosity, and max_debug_file_size.
- Update Checks:
- Nagios will now check approximately once a day to see if a new version is available. This is useful to keep on top of security patches and new releases. Update notices will appear in the web interface.
- Variables that affect the update check are check_for_updates and bare_update_check.
- Misc:
- Temp path variable - A new temp_path variable has been added to specify a directory that Nagios can use for temporary scratch space.
- Unique notification and event ID numbers - A unique ID number is now assigned to each host and service notification. Another unique ID is now assigned to all host and service state changes as well. The unique IDs can be accessed using the following respective macros: $HOSTNOTIFICATIONID$, $SERVICENOTIFICATIONID$, $HOSTEVENTID$, $SERVICEEVENTID$, $LASTHOSTEVENTID$, $LASTSERVICEEVENTID$.
- New macros - A few new macros (other than those already mentioned elsewhere above) have been added. They include $HOSTGROUPNAMES$, $SERVICEGROUPNAMES$, $HOSTACKAUTHORNAME$, $HOSTACKAUTHORALIAS$, $SERVICEACKAUTHORNAME$, and $SERVICEACKAUTHORALIAS$.
- Reaper frequency - The old service_reaper_frequency variable has been renamed to check_result_reaper_frequency, as it is now also used to process host check results.
- Max reaper time - A new max_check_result_reaper_time variable has been added to limit the amount of time a single reaper event is allowed to run.
- Fractional intervals - Fractional notification and check intervals (e.g. "3.5" minutes) are now supported in host, service, host escalation, and service escalation definitions.
- Escaped command arguments - You can now pass bang (!) characters in your command arguments by escaping them with a backslash (\). If you need to include backslashes in your command arguments, they should also be escaped with a backslash.
- Multiline system command output - Nagios will now read multiple lines of output from system commands it runs (notification scripts, etc.), up to 4K. This matches the limit on plugin output mentioned earlier. Output from system commands is not directly processed by Nagios, but support for it is there nonetheless.
- Better scheduling information - More detailed information is given when Nagios is executed with the -s command line option. This information can be used to help reduce the time it takes to start/restart Nagios.
- Aggregated status file updates - The old aggregate_status_updates option has been removed. All status file updates are now aggregated at a minimum interval of 1 second.
- New performance data file mode - A new "p" option has been added to the host_perfdata_file_mode and service_perfdata_file_mode options. This new mode will open the file in non-blocking read/write mode, which is useful for pipes.
- Timezone offset - A new use_timezone option has been added to allow you to run different instances of Nagios in timezones different from the local zone.
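The escaped command arguments item above lends itself to a short example. The Python sketch below splits a command string on unescaped bang characters and turns \! and \\ back into literal characters. It is an illustration only: the function name is hypothetical, and the choice to leave any other backslash untouched is an assumption rather than documented Nagios behavior.

```python
def split_command_args(command):
    """Split "name!arg1!arg2" into (name, [arg1, arg2]).

    A backslash escapes a following "!" or "\\"; any other backslash is
    kept as-is (an assumption made for this sketch).
    """
    parts = []
    current = []
    i = 0
    while i < len(command):
        ch = command[i]
        if ch == "\\" and i + 1 < len(command) and command[i + 1] in "!\\":
            # Escaped character: keep it literally, drop the backslash.
            current.append(command[i + 1])
            i += 2
        elif ch == "!":
            # Unescaped bang: argument separator.
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts[0], parts[1:]


if __name__ == "__main__":
    print(split_command_args(r"check_http!-u!/index\!v2"))
    print(split_command_args(r"check_disk!C:\\temp!90"))
```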