check_zookeeper_znode.pl (Advanced Nagios Plugins Collection)

Compatible With
  • Nagios 1.x
  • Nagios 2.x
  • Nagios 3.x
  • Nagios XI
Flexible ZooKeeper znode check, useful for HBase, SolrCloud, Hadoop NameNode HA & JobTracker HA ZKFC or any other ZooKeeper-based service.

Checks:

1. root znode ("/") exists (we are successfully connected to ZooKeeper; tries all given ZooKeepers in turn)
2. given znode exists (useful for checking ephemeral znodes, e.g. HBase Master is reporting alive by holding an ephemeral node in ZooKeeper)
3. given znode's literal contents, substring match (optional, e.g. the server we expect is the Master)
4. given znode's contents against a regex (optional, e.g. one of the servers we expect is the Master; flexible, supports anchoring etc.)
5. given znode is not blank/empty (unless -d "" is intentionally specified)
6. given znode is ephemeral (optional)
7. given znode has children / no children znodes (optional; useful when the child znodes are dynamic but you just need to check for their existence)
8. given znode's age against --warning/--critical thresholds (optional)
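
The order of the content checks above (2 to 7) can be sketched in Python as follows. This is an illustrative model only, not the plugin's Perl implementation: the `Znode` dataclass, the `check_znode` function and its parameter names are hypothetical, and the znode is a plain in-memory object rather than one fetched from ZooKeeper. The exit codes are the standard Nagios plugin codes. How the plugin itself treats `-d ""` beyond skipping the blank check is an assumption here.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


@dataclass
class Znode:
    """Hypothetical in-memory stand-in for a znode and its metadata."""
    data: str = ""
    ephemeral: bool = False
    children: List[str] = field(default_factory=list)


def check_znode(znode: Optional[Znode],
                data: Optional[str] = None,
                regex: Optional[str] = None,
                ephemeral: bool = False,
                child_znodes: Optional[bool] = None) -> Tuple[int, str]:
    """Return (exit_code, message) for the first failing check, else OK."""
    # Check 2: the znode must exist
    if znode is None:
        return CRITICAL, "znode does not exist"
    # Check 3: literal contents, partial substring match
    if data is not None and data not in znode.data:
        return CRITICAL, "znode data does not contain the expected substring"
    # Check 4: case insensitive regex, checked after the literal data
    if regex is not None and not re.search(regex, znode.data, re.IGNORECASE):
        return CRITICAL, "znode data does not match the regex"
    # Check 5: blank contents are a failure unless data="" was given on purpose
    if znode.data == "" and data != "":
        return CRITICAL, "znode is blank"
    # Check 6: optionally require an ephemeral znode
    if ephemeral and not znode.ephemeral:
        return CRITICAL, "znode is not ephemeral"
    # Check 7: optionally require children (True) or no children (False)
    if child_znodes is True and not znode.children:
        return CRITICAL, "znode has no child znodes"
    if child_znodes is False and znode.children:
        return CRITICAL, "znode has child znodes"
    return OK, "znode checks passed"
```

For example, `check_znode(Znode(data="", children=[]), data="", child_znodes=False)` models the "no unassigned regions" case: a blank znode with no children passes because `data=""` was given deliberately.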

Part of the Advanced Nagios Plugins Collection, download it here:

https://github.com/harisekhon/nagios-plugins

================================================================================
Some useful examples:
================================================================================

* Check we have an active HBase Root Master (this is an ephemeral node that will disappear if Master is down):

check_hbase_master_znode.pl -H
or
check_zookeeper_znode.pl -H -z /hbase/master --ephemeral
________________________________________________________________________________

* Check we have an HBase Root RegionServer assigned:

check_hbase_root_regionserver_znode.pl -H
or
check_zookeeper_znode.pl -H -z /hbase/root-region-server
________________________________________________________________________________

* Check there are no HBase unassigned regions (should be blank hence -d ""):

check_hbase_unassigned_regions_znode.pl -H
or
check_zookeeper_znode.pl -H -z /hbase/unassigned -d "" --no-child-znodes
________________________________________________________________________________

* Check there are HBase Backup Masters:

check_hbase_backup_masters_znode.pl -H
or
check_zookeeper_znode.pl -H -z /hbase/backup-masters
________________________________________________________________________________

* Check given SolrCloud server is alive and holding its ephemeral znode:

check_solrcloud_server_znode.pl -H -z /solr/live_nodes/:8983_solr
or
check_zookeeper_znode.pl -H -z /solr/live_nodes/:8983_solr --null --ephemeral
________________________________________________________________________________

* Check HDFS NameNode HA ZKFC is working:

check_hadoop_namenode_ha_zkfc_znode.pl -H
or
check_zookeeper_znode.pl -H -z /hadoop-ha/nameservice1/ActiveStandbyElectorLock --ephemeral
________________________________________________________________________________

* Check MapReduce v1 JobTracker HA ZKFC is working:

check_hadoop_jobtracker_ha_zkfc_znode.pl -H
or
check_zookeeper_znode.pl -H -z /hadoop-ha/logicaljt/ActiveStandbyElectorLock --ephemeral

================================================================================


usage: check_zookeeper_znode.pl [ options ]

-H --host ZooKeeper node(s) to connect to, should be a comma separated list of ZooKeepers the same as are configured on the ZooKeeper servers themselves (node1:2181,node2:2181,node3:2181). It takes longer to connect to 3 ZooKeepers than just one of them (around 5 secs per ZooKeeper specified, plus session-timeout x the number of offline ZooKeepers), so you will need to increase --timeout. Connection order is deterministic and will be tried in the order specified unless --random-conn-order is given
-P --port Port to connect to on ZooKeepers for any nodes not given with a :port suffix (defaults to 2181)
-z --znode Znode to check exists. Useful for a variety of checks of ZooKeeper based services like HBase, SolrCloud, NameNode & JobTracker HA ZKFC
-d --data Check given znode contains specific data (optional). This is a partial substring match, for more control use --regex with anchors. Be careful when specifying non-printing characters, which may appear as ?; you may need to use --regex with ".+" to work around them
-r --regex Check given znode contains data matching this case insensitive regex (optional). Checked after --data
-n --null Do not check znode contents, use on null znodes such as SolrCloud /solr/live_nodes/:8983_solr as the API segfaults when trying to retrieve data for these null znodes
-e --ephemeral Check given znode is ephemeral (optional)
--child-znodes Check given znode has child znodes (optional)
--no-child-znodes Check given znode does not have child znodes (optional)
-u --user User to connect with (Not tested. YMMV. optional)
-p --password Password to connect with (Not tested. YMMV. optional)
-w --warning Warning threshold or ran:ge (inclusive) for znode age (optional)
-c --critical Critical threshold or ran:ge (inclusive) for znode age (optional)
--random-conn-order Randomize connection order
--session-timeout ZooKeeper session timeout in secs (default: 2). This determines how long to wait for connection to downed ZooKeepers and affects total execution time
-h --help Print description and usage options
-t --timeout Timeout in secs (default: 10)
-v --verbose Verbose mode (-v, -vv, -vvv ...)
-V --version Print version and exit
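
The interaction between --host, --port, --session-timeout and --timeout can be estimated with a small Python sketch. It is a rough reading of the option descriptions above, not the plugin's own logic: `parse_hosts` and `worst_case_secs` are hypothetical helper names, the 5 second figure is the approximate per-ZooKeeper connection time quoted under --host, and IPv6 addresses are not handled.

```python
from typing import List, Tuple

DEFAULT_PORT = 2181          # default of --port
CONNECT_SECS_PER_NODE = 5    # "around 5 secs per ZooKeeper" from the --host text


def parse_hosts(hosts: str, default_port: int = DEFAULT_PORT) -> List[Tuple[str, int]]:
    """Split a comma separated host list, applying default_port where none is given."""
    parsed = []
    for item in hosts.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty entries such as a trailing comma
        host, _, port = item.partition(":")
        parsed.append((host, int(port) if port else default_port))
    return parsed


def worst_case_secs(num_nodes: int, offline_nodes: int, session_timeout: int = 2) -> int:
    """Rough estimate: ~5 secs per node plus session-timeout per offline node."""
    return CONNECT_SECS_PER_NODE * num_nodes + session_timeout * offline_nodes


nodes = parse_hosts("node1:2181,node2,node3:2182")
estimate = worst_case_secs(len(nodes), offline_nodes=1)
print(nodes, estimate)
```

With three ZooKeepers and one of them offline the estimate is 5 x 3 + 2 x 1 = 17 secs, which is above the default --timeout of 10, so --timeout would need raising for such a check.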