check_failover.py

Nagios plugin to check whether Red Hat cluster services have moved to another node. It does not check whether a service is actually running; it only checks whether the service has moved.

If it's a permanent move, you can just re-initialize.
Reviews (1)
by jng, March 3, 2014
This works for Red Hat CMAN HA/Cluster; tested on RHEL 6.x.
The plugin compares the live output with reference output stored in a given file (DATA_FILE).
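
The comparison idea can be sketched roughly as follows. This is only an illustration, not the plugin's actual code: the helper name changed_lines and the sample lines are made up, and real clustat output looks different.

```python
def changed_lines(live_text, reference_text):
    """Return the non-empty live lines that do not appear in the reference."""
    reference = {line.strip() for line in reference_text.splitlines() if line.strip()}
    live = [line.strip() for line in live_text.splitlines() if line.strip()]
    return [line for line in live if line not in reference]


# Made-up sample lines for illustration only; not real clustat output.
REFERENCE = "service:web node1 started\nservice:db node2 started\n"
LIVE = "service:web node2 started\nservice:db node2 started\n"

moved = changed_lines(LIVE, REFERENCE)
print(moved)
```

Any line returned here would mean a service is no longer where the reference says it should be, which is the condition the check reports.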

The original plugin does not work with newer clustat output, whose layout has changed.
To make it work again (clustat version 3.0.12.1), modify at least the LOCATION line:

diff -s check_failover.py check_failover_improved.py
72c72
DATA_FILE = '/var/tmp/nagios_failover_check'

This has to be set anyway. Maybe you want to keep it in a volatile directory?

99c99,102
#LOCATION = output.index(' ------- ---- ----- ------ ----- \n')
> #LOCATION = output.index(' ------- ---- ----- ------ ----- \n')
> LOCATION = output.index([x for x in output if x.startswith(' ------- ----')][0])
>
You can keep the original check or use our quick hack, which only checks the
first few characters of the line. It's still not very elegant, but we should not expect the layout to change very often.
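
A slightly more defensive variant of the prefix match might look like the sketch below. It assumes, as in the hack above, that output is a list of lines; the function name find_separator and the sample lines are hypothetical. Unlike the list-comprehension version, it returns None instead of raising IndexError when no separator line is found.

```python
# Prefix taken from the quick hack above; everything else is illustrative.
SEPARATOR_PREFIX = ' ------- ----'


def find_separator(lines, prefix=SEPARATOR_PREFIX):
    """Return the index of the first line starting with prefix, or None."""
    for index, line in enumerate(lines):
        if line.startswith(prefix):
            return index
    return None


# Made-up sample; the separator is built from the prefix so it is sure to match.
sample = [
    'header line\n',
    ' name owner state\n',
    SEPARATOR_PREFIX + ' ----- ------ -----\n',
    ' svc node1 started\n',
]

print(find_separator(sample))
```

The caller can then decide how to report a missing separator rather than letting the plugin die with a traceback.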


169,171c172,178
#sys.exit(STATE_CRITICAL)
> # CHANGE: create DATA_FILE if not found. if something has happened (e.g. crash) other monitor
> # and other node(s) will tell us.
> print 'no file found...initializing'
> initialize()

We let the plugin create the reference file if it does not exist, so there is no need
to run it interactively first.
If something has failed over and Nagios complains, you have to check anyway. Running check_failover.py -i makes the check green again after you have fixed the problem.
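
The create-if-missing behaviour can be sketched like this. It is a minimal illustration written for Python 3, not the plugin's own initialize() function; load_or_initialize and its return value are assumptions made for the example.

```python
import os


def load_or_initialize(path, live_text):
    """Return (reference_text, created).

    If the reference file is missing, write the current live output to it
    and report that it was created; otherwise return the stored contents.
    """
    if not os.path.exists(path):
        with open(path, 'w') as handle:
            handle.write(live_text)
        return live_text, True
    with open(path) as handle:
        return handle.read(), False
```

On the first run the live output becomes the reference, so the check starts out green; later runs compare against that stored state until it is re-initialized.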