check_hadoop_balance.pl (Advanced Nagios Plugins Collection)

Compatible With

Nagios 1.x
Nagios 2.x
Nagios 3.x
Nagios XI

Owner

harisekhon

Website

github.com/harisekhon/nagios-plugins

Download URL

github.com/harisekhon/nagios-plugins

Checks that the balance of HDFS space used across datanodes is within thresholds. Lists the out-of-balance nodes in verbose mode.

Part of the Advanced Nagios Plugins Collection. Download it here:

https://github.com/harisekhon/nagios-plugins

./check_hadoop_balance.pl --help

Nagios Hadoop Plugin to check various health aspects of HDFS via the Namenode's dfsadmin -report

- checks % HDFS space used. Based off an earlier plugin I wrote in 2010 that we used in production for over 2 years. This heavily leverages HariSekhonUtils so code in this file is very short but still much tighter validated
- checks HDFS replication of blocks, again based off another plugin I wrote in 2010 around the same time as above and ran in production for 2 years. This code unifies/dedupes and improves on both those plugins
- checks HDFS % Used Balance is within thresholds
- checks number of available datanodes and if there are any dead datanodes
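
As a rough illustration of the balance check described above, the sketch below parses a report and measures how far each datanode's % used sits from the mean. It is not the plugin's code: the sample text is a hypothetical, abbreviated excerpt in the style of `dfsadmin -report` output, the function names are invented for this example, and "deviation from the mean" is only one plausible definition of balance.

```python
import re

# Hypothetical, abbreviated report excerpt (assumed format, not real output).
SAMPLE_REPORT = """\
DFS Used%: 40.00%

Name: 10.0.0.1:50010
DFS Used%: 35.00%

Name: 10.0.0.2:50010
DFS Used%: 52.50%

Name: 10.0.0.3:50010
DFS Used%: 41.00%
"""


def parse_datanode_usage(report):
    """Return {datanode name: % DFS used} for each 'Name:' section."""
    usage = {}
    current = None
    for line in report.splitlines():
        name = re.match(r"Name:\s*(\S+)", line)
        used = re.match(r"DFS Used%:\s*([\d.]+)\s*%", line)
        if name:
            current = name.group(1)
        elif used and current is not None:
            # Only count usage lines that belong to a datanode section;
            # the cluster-wide summary line before the first 'Name:' is skipped.
            usage[current] = float(used.group(1))
            current = None
    return usage


def max_deviation(usage):
    """Assumed balance metric: largest distance from the mean % used."""
    mean = sum(usage.values()) / len(usage)
    return max(abs(value - mean) for value in usage.values())


def nodes_out_of_balance(usage, threshold):
    """Names of nodes further than `threshold` points from the mean."""
    mean = sum(usage.values()) / len(usage)
    return sorted(n for n, v in usage.items() if abs(v - mean) > threshold)


if __name__ == "__main__":
    usage = parse_datanode_usage(SAMPLE_REPORT)
    print(round(max_deviation(usage), 2))
    print(nodes_out_of_balance(usage, 5.0))
```

Listing the offending nodes separately mirrors what the plugin's verbose mode is described as doing, although the actual output format will differ.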

Originally written for old vanilla Apache Hadoop 0.20.x, updated for CDH 4.3 (2.0.0-cdh4.3.0)

Recommend you also investigate check_hadoop_cloudera_manager_metrics.pl (disclaimer: I work for Cloudera, but seriously it's good; it gives you access to a wealth of information)

usage: check_hadoop_balance.pl [ options ]

-s --hdfs-space         Checks % HDFS Space used against given warning/critical thresholds
-r --replication        Checks replication state: under replicated blocks, corrupt blocks, missing blocks. Warning/critical thresholds apply to under replicated blocks. Corrupt and missing blocks if any raise critical since this means there is potentially data loss
-b --balance            Checks Balance of HDFS Space used % across datanodes is within thresholds. Lists the nodes out of balance in verbose mode
-n --nodes-available    Checks the number of available datanodes against the given warning/critical thresholds as the lower limits (inclusive). Any dead datanodes raises warning
-w --warning            Warning threshold or ran:ge (inclusive)
-c --critical           Critical threshold or ran:ge (inclusive)
   --hadoop-bin         Path to 'hdfs' or 'hadoop' command if not in $PATH
   --hadoop-user        Checks that this plugin is being run by the hadoop user (defaults to 'hdfs', falls back to trying 'hadoop' unless specified)
-h --help               Print description and usage options
-t --timeout            Timeout in secs (default: 10)
-v --verbose            Verbose mode
-V --version            Print version and exit
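
The inclusive warning/critical thresholds and ranges listed above can be pictured with a short sketch. This is a simplified reading of the common Nagios range convention (a bare `N` means 0 to N, `min:max` is a closed interval, `min:` has no upper bound), not the plugin's own parser. The function names are hypothetical, and expressing the `--nodes-available` lower limits as `min:` ranges is an assumption made for this example.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def parse_range(spec):
    """Parse 'N', 'min:max' or 'min:' into an inclusive (low, high) pair."""
    if ":" in spec:
        low, high = spec.split(":", 1)
        return (float(low) if low else 0.0,
                float(high) if high else float("inf"))
    # A bare number is treated as the upper bound of a 0..N range.
    return (0.0, float(spec))


def in_range(value, spec):
    """True if value lies inside the range, with both ends inclusive."""
    low, high = parse_range(spec)
    return low <= value <= high


def check(value, warning, critical):
    """Map a measured value to an exit code; critical takes precedence."""
    if not in_range(value, critical):
        return CRITICAL
    if not in_range(value, warning):
        return WARNING
    return OK


if __name__ == "__main__":
    # Upper-limit style, e.g. a % imbalance with warning 5 and critical 10.
    print(check(7.5, "5", "10"))
    # Lower-limit style (assumed), e.g. available datanodes: warn below 4,
    # critical below 2.
    print(check(3, "4:", "2:"))
```

Evaluating the critical range first means a value that breaches both thresholds is reported at the more severe level.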
