check_ganglia
estair@monitor02 libexec$ ./check_ganglia.pl --help
Unknown option: help
UNKNOWN: HOST not defined.
-H hostname/IP: of host to connect to gmetad/gmond on
-P Port: to connect to and retrieve XML
-O Output method: ('cluster' to dump all info | 'hostcheck' to grab for)
-T Targethost: when 'hostcheck', the host to pull data for
-M Metric: the 'gmetric' defined value to return exclusively
-w warn: int value above which the check will exit in a WARN state
-c crit: int value above which the check will exit in a CRITICAL state
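The options above imply a simple flow: connect to gmond/gmetad, read the XML it returns, then pick out one host and one metric. Below is a minimal Python sketch of the lookup step only. It assumes the usual Ganglia XML layout, where HOST elements carry a NAME attribute and contain METRIC elements with NAME and VAL attributes. The function name `find_metric` and the sample XML are illustrative and not taken from check_ganglia.pl.

```python
import xml.etree.ElementTree as ET

# Made-up, heavily trimmed sample of the XML that gmond/gmetad returns.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
  <CLUSTER NAME="web" LOCALTIME="1363860000">
    <HOST NAME="node1.example.com" IP="10.0.0.1" REPORTED="1363859990">
      <METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
      <METRIC NAME="cpu_num" VAL="8" TYPE="uint16" UNITS="CPUs"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def find_metric(xml_text, target_host, metric):
    """Return the VAL string of `metric` on `target_host`, or None if absent."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so HOST elements are found at any depth.
    for host in root.iter("HOST"):
        if host.get("NAME") != target_host:
            continue
        for m in host.iter("METRIC"):
            if m.get("NAME") == metric:
                return m.get("VAL")
    return None
```

Calling `find_metric(SAMPLE_XML, "node1.example.com", "load_one")` returns the string "0.42". A missing host or metric yields None, which a plugin would report as UNKNOWN.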
Use the plugin as a check-host command to determine whether a host has checked in to the Ganglia cluster recently:
define command{
command_name check-cluster-host-alive
command_line $USER1$/check_ganglia.pl -H $HOSTADDRESS$ -P 8600 -O hostcheck -T localhost -M host_state
}
.........
# pull data direct from host via XML call, using a specified query string (!!> 100x faster !!)
define command{
command_name check_ganglia_host_query
command_line $USER1$/check_ganglia.pl -host=localhost -port=8652 -output=hostcheck -cluster=$ARG1$ -targethost=$HOSTALIAS$ -M $ARG2$
}
define command{
command_name check_ganglia
command_line $USER1$/check_rrd_eli.pl "/var/ganglia/rrds/$HOSTGROUPNAME$/$HOSTNAME$.lucasfilm.com/$ARG1$" sum $ARG2$ $ARG3$
}
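The check_ganglia_host_query command above talks to port 8652, which is gmetad's interactive port by default. A client sends a path such as /cluster/host and gets back only the matching XML instead of the full dump, which is presumably where the speedup comes from. Below is a hedged Python sketch of that exchange, assuming gmetad's default interactive-port behavior. `build_query` and `fetch_xml` are hypothetical helper names, not part of the plugin.

```python
import socket


def build_query(cluster, host=None):
    """Build the newline-terminated path sent to gmetad's interactive port."""
    path = "/" + cluster
    if host:
        path += "/" + host
    return path + "\n"


def fetch_xml(server, port, query=None, timeout=10.0):
    """Read XML from a gmond/gmetad port until the server closes the socket.

    With query=None nothing is sent, so the server returns its full dump.
    """
    chunks = []
    with socket.create_connection((server, port), timeout=timeout) as sock:
        if query is not None:
            sock.sendall(query.encode("ascii"))
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example call (needs a reachable gmetad; the names are placeholders):
#   xml_text = fetch_xml("localhost", 8652, build_query("web", "node1.example.com"))
```

Keeping query construction separate from the socket code makes it easy to check the path format without a running gmetad.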
Reviews (1)
by harakiri, March 21, 2013
Two things are needed. First, install the required module by running "cpan DateTime::Format::Epoch::Unix". If anyone knows of an RPM for RHEL/CentOS, please leave a comment.
Next, the biggest problem: the plugin does not return the correct exit status when a threshold is reached. So I fixed it. I'm not a Perl programmer, so please excuse me if it looks somewhat crude.
Copy/paste the following code and apply it as a patch:
--- /tmp/check_ganglia.pl 2013-03-21 12:43:48.000000000 +0100
+++ check_ganglia.pl 2013-03-22 08:38:48.574700147 +0100
@@ -14,7 +14,6 @@
# TODO: call $cluster{host} hash directly instead of seeking within it.
# TODO: Fix some clusters that don't match host checks...
# TODO: add retval matching (range, string, etc)
-# TODO: fix warn/crit to measure returned metric
# TODO: !!! NEXT !!! call $cluster{host} hash directly instead of seeking within it.
# TODO: !!! NEXT !!! better, pass in cluster:host context for direct passing of XML
# TODO: !!! NEXT !!! use syntax localhost:8652 TCP
@@ -22,6 +21,7 @@
# TODO: !!! NEXT !!! Or, since that requires knowing the CLUSTER, make it option and check the hostname as key
# TODO: !!! NEXT !!! for the hash of each cluster found. Still reduces cpu/time drastically
#
+# 2013-03-22: Tom Kerremans: fixed warn/crit to measure returned metric, removed some obsolete notifications
###########
# core modules needed:
@@ -72,6 +72,7 @@
exit $ERRORS{'CRITICAL'};
}
+
sub isnumeric()
{
my ($x) = @_;
@@ -128,14 +129,14 @@
} #/ if hostcheck
if (defined($warn)) {
- print "WARN defined\n";
+ #print "WARN defined\n";
#if ( ! isnumeric($warn) ) { die "NOT NUMERIC \n"; }
#die "NOT NUMERIC \n" if ( ! isnumeric($warn) ) ;
die "## $warn is NOT NUMERIC \n" if $_ =~ s/[a-z]//;
}
if (defined($crit)) {
- print "CRIT defined\n";
+ #print "CRIT defined\n";
}
} #/ sub processargs
@@ -301,9 +302,9 @@
###: ELI: WTF did I put this in here for??
### DELETEME
-### FUNC: output_match
-#sub output_match {
-#my $output = shift;
+## FUNC: output_match
+sub output_match {
+my $output = shift;
# perform string regex match on retval:
# if ( "$output" =~ /.*$match.*/ ) {
@@ -314,17 +315,20 @@
# exit 2;
# }
-## perform range check for warn/crit values:
-# if ( "$output" >= "$crit" ) {
-# exit 1;
-# } elsif ( "$output" >= "$warn" ) {
-# exit 2;
-# } else {
-# exit 0;
-# }
+# perform range check for warn/crit values:
+ if ( "$output" >= "$crit" ) {
+ print "CRITICAL: $metric = $output higher than threshold of $crit\n";
+ exit 2;
+ } elsif ( "$output" >= "$warn" ) {
+ print "WARNING: $metric = $output higher than threshold of $warn\n";
+ exit 1;
+ } else {
+ print "OK: $metric = $output\n";
+ exit 0;
+ }
-#} #/sub
-### /FUNC: output_match
+} #/sub
+## /FUNC: output_match
#^^^^ ##: ELI: WTF did I put this in here for??
#^^^^ ## DELETEME
@@ -386,14 +390,14 @@
print "UNKNOWN: ($metric) not found in host XML! ","\n";
exit $ERRORS{'UNKNOWN'}
} else {
- print "OK: $metric = $host_metrics{$metric} \n";
- exit $ERRORS{'OK'};
+ &output_match ($host_metrics{$metric});
}
} # /unless ($metric)
} else {# /if ($hostname eq)
} # /if hostname loop through hash. We've exhausted input data, exit now:
+
} # /foreach $hostkey
# don't exit here, create exit at end of all arrays to be searched (after function exits searching the last hash)
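The core of the patch is the range check in output_match: compare the metric against the critical threshold first, then the warning threshold, and exit with the standard Nagios plugin codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). Below is a minimal Python sketch of the same logic, for illustration only. The function `classify` and its handling of missing thresholds and non-numeric values are assumptions that go beyond what the patch does.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(metric, value, warn=None, crit=None):
    """Return (exit_code, message) for a metric value and optional thresholds."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return UNKNOWN, f"UNKNOWN: {metric} = {value!r} is not numeric"
    # Check critical first so a value above both thresholds reports CRITICAL.
    if crit is not None and number >= crit:
        return CRITICAL, f"CRITICAL: {metric} = {value} at or above threshold of {crit}"
    if warn is not None and number >= warn:
        return WARNING, f"WARNING: {metric} = {value} at or above threshold of {warn}"
    return OK, f"OK: {metric} = {value}"
```

A plugin would print the message and pass the code to `sys.exit()`. The ordering matters: the commented-out code that the patch replaces had the two exit codes swapped, returning 1 for critical and 2 for warning.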