Check Netapp NG
! Please update line 15 in the script according to your Nagios installation.
At the time there were some plugins for NetApp checks, but they didn't work the way I wanted, so this is my version.
It will check the following:
* Temperature
* Fan Fail
* Power Supply Fail
* CPU Load
* NVRAM Battery Status
* Vol Usage Percentage
* Snapshot Config
* Shelf Health
* Number of NDMP sessions
* Global Status of the filer
* Number of failed disks
* Uptime
* Cache Age
I hope you find this as useful as I did, but I don't guarantee it will work for you.
Reviews (11)
by Koobal, May 5, 2015
Hi,
First of all, thanks for the plugin, which works great.
We just have a small problem with the DISKUSED option.
We are monitoring about 70 volumes on a FAS8020 running cDOT.
Sometimes all volumes turn red with an out-of-bounds return code. These false positives clear on a recheck.
I tried changing the SNMP session timeout as ebardellidoxee suggests:
1) Manage SNMP timeout.
my ($sess, $err) = Net::SNMP->session( -hostname => $server, -version => $version, -community => $comm, -timeout => 60);
If you don't specify a value, the default is 5 seconds, and in a busy environment it misses lots of packets. You might add a flag that populates the -timeout parameter of Net::SNMP->session.
But when I changed it, I got a general error from the plugin.
It seems the timeout value for the SNMP session is not right.
Has anyone encountered the same problem, and how did you solve it?
Thanks,
regards
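A note on the timeout question above, offered as a guess rather than a diagnosis: as far as the Net::SNMP documentation goes, -timeout is accepted only between 1 and 60 seconds, and a request that times out is re-sent (once by default). The worst case is therefore the timeout multiplied by the number of attempts, which with -timeout => 60 can exceed the time Nagios allows a check to run. The Python sketch below only does that arithmetic. The retry count and the 60-second Nagios limit are assumed defaults, not values read from this plugin, and the function names are hypothetical.

```python
# Assumed defaults, NOT taken from check-netapp-ng.pl:
ASSUMED_RETRIES = 1          # Net::SNMP re-sends once unless -retries is set
ASSUMED_CHECK_TIMEOUT = 60   # seconds Nagios waits for a service check


def worst_case_wait(timeout_s, retries=ASSUMED_RETRIES):
    """Longest time one SNMP request can block if every attempt times out."""
    return timeout_s * (retries + 1)


def fits_check_window(timeout_s, retries=ASSUMED_RETRIES,
                      check_timeout_s=ASSUMED_CHECK_TIMEOUT):
    """True if a fully timed-out request still ends before Nagios gives up."""
    return worst_case_wait(timeout_s, retries) < check_timeout_s


if __name__ == "__main__":
    for timeout in (5, 20, 60):
        print(timeout, worst_case_wait(timeout), fits_check_window(timeout))
```

Under these assumptions a timeout of 20 seconds still leaves headroom, while 60 seconds does not.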
by ottow, November 11, 2014
The FAILEDDISK command does not work on my FAS2220, ONTAP 8.1.3. Via telnet, the disk show command says I have one failed disk, yet this check insists there are no failed disks. Manual snmpwalk on the OIDs (1.3.6.1.4.1.789.1.6.4 and below) confirms the system thinks there are no failed disks when queried via SNMP. Possibly a bug in ONTAP?
by lboclboc, August 25, 2014
Looks promising, but I have not found a way for it to support 64-bit values, which is a must for large volumes.
by gmarat82, August 25, 2014
Hi,
Thank you very much for this plugin. It seems to be the best one for NetApp for me.
I have only one issue: I can't get proper volume space status output.
I'm entering
./check-netapp-ng.pl -H -C fpx_snmp -T DISKUSED --vol /vol/home/
And all I get is the pipe symbol:
|
Any clue?
by rjholtz, September 16, 2013
We were using one of the derivatives of the original check_netapp perl scripts, and the DISKUSED was always timing out and throwing errors. Running Wireshark showed that each check was pulling down all the data from each and every volume and .snapshot. This 'Check Netapp NG' script (for the DISKUSED at least) pulls down the list of all the volumes and .snapshots, but only pulls the Total, Used, and Free of the single volume in question. All our DISKUSED checks are now working correctly!
Best of all, no more flood of false critical alerts texting my phone!
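The behaviour described in this review (walk the volume names once, then fetch only three values for the one volume of interest) can be sketched as follows. This is a Python illustration, not the plugin's Perl. The df table base OID, the name column (.2), and the free-space column (.5) are assumptions; columns .3 (total) and .4 (used) match the variable definitions quoted in ebardellidoxee's review. The function name is hypothetical.

```python
# Assumption: base OID of one row of the NetApp df table.
DF_ENTRY = "1.3.6.1.4.1.789.1.5.4.1"


def oids_for_volume(name_column, volume):
    """Return the three OIDs to GET for a single volume.

    name_column maps full OID -> file system name, i.e. what a walk of the
    name column (assumed to be .2) would return for every volume and
    .snapshot. Only the row index of the matching name is reused.
    """
    for oid, name in name_column.items():
        if name == volume:                      # exact match, trailing "/" included
            index = oid.rsplit(".", 1)[1]       # last OID component is the row index
            return {
                "total": f"{DF_ENTRY}.3.{index}",
                "used": f"{DF_ENTRY}.4.{index}",
                "free": f"{DF_ENTRY}.5.{index}",  # assumed column number
            }
    raise KeyError(f"volume not found: {volume}")


if __name__ == "__main__":
    walk = {
        f"{DF_ENTRY}.2.1": "/vol/vol0/",
        f"{DF_ENTRY}.2.2": "/vol/vol0/.snapshot",
        f"{DF_ENTRY}.2.3": "/vol/home/",
    }
    print(oids_for_volume(walk, "/vol/home/"))
```

Fetching three OIDs per check instead of every column for every row is what keeps the request small on filers with many volumes.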
by ebardellidoxee, July 25, 2013
Hi Ranl,
First of all, congratulations on this plugin. It works fine and has been really useful to me.
I replaced the plugin I was using before (check_naf.py) with yours because it was no longer reliable and generated lots of false positives. However, I needed to make some fixes and write a new function to replicate some functionality I loved.
I'm writing to give you a couple of hints that might improve your already really good plugin a little.
1) Manage SNMP timeout.
my ($sess, $err) = Net::SNMP->session( -hostname => $server, -version => $version, -community => $comm, -timeout => 60);
If you don't specify a value, the default is 5 seconds, and in a busy environment it misses lots of packets. You might add a flag that populates the -timeout parameter of Net::SNMP->session.
2) Negative values when vol > 2TB
Perl's Net::SNMP returns wrong, negative values when the volume being checked is bigger than 2 TB. To fix it, I needed to add the following workaround:
my $snmp_netapp_volume_id_table_df_total = "$snmp_netapp_volume_id_table_df.3";
my $snmp_netapp_volume_id_table_df_total_h = "$snmp_netapp_volume_id_table_df.14";
my $snmp_netapp_volume_id_table_df_total_l = "$snmp_netapp_volume_id_table_df.15";
my $snmp_netapp_volume_id_table_df_used = "$snmp_netapp_volume_id_table_df.4";
my $snmp_netapp_volume_id_table_df_used_h = "$snmp_netapp_volume_id_table_df.16";
my $snmp_netapp_volume_id_table_df_used_l = "$snmp_netapp_volume_id_table_df.17";
my $EXP=4294967296;
…
…
…
### into the function ###
if ($total
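The snippet above is cut off at "if ($total", so the reviewer's actual logic is not shown. As background, a signed 32-bit kilobyte counter wraps negative once a volume passes 2^31 KB (2 TB), which is why the high and low halves are read separately. The Python sketch below shows one common way to combine such halves; it is an illustration under that assumption, not a reconstruction of the missing Perl, and the function names are hypothetical.

```python
EXP = 4294967296  # 2**32, the same constant as $EXP in the review


def unsigned32(value):
    """Reinterpret a signed 32-bit SNMP integer as an unsigned one."""
    return value + EXP if value < 0 else value


def combine_kbytes(high, low):
    """Build a 64-bit kilobyte count from separate high and low 32-bit halves."""
    return unsigned32(high) * EXP + unsigned32(low)


if __name__ == "__main__":
    # A low half that wrapped negative, with an empty high half.
    print(combine_kbytes(0, -2147483648))
    # One full wrap of the low half recorded in the high half.
    print(combine_kbytes(1, 0))
```

Python integers do not overflow, so no further care is needed here; in Perl the same arithmetic relies on 64-bit integer or floating-point support.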
by gm_rajkumar, March 20, 2013
I am able to see the disk usage details:
./check-netapp-ng.pl -H 172.16.4.16 -C public -T DISKUSED --vol /vol/vol_testlab/
OK: DISKUSED 1% | /vol/vol_testlab/= 23365520k
However, I couldn't get results for the query below. Correct me if I'm wrong.
./check-netapp-ng.pl -H 172.16.4.16 -C public -T DISKUSED --vol /vol/vol_testlab/qtree_testlab
|
NO RESULTS DISPLAYED....
I couldn't get the subfolder disk information from this plugin. Let me know how to get those details. Thanks in advance!
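For anyone scripting around the two results in this review, here is a small Python sketch that splits a plugin output line into status, message, and performance data, and treats a bare | as "no data". The function name and the choice to report UNKNOWN for the empty case are assumptions for illustration; Nagios itself takes the state from the plugin's exit code, not from this text.

```python
import re

# label=value pairs; tolerates the space after "=" seen in the output above
PERF_PAIR = re.compile(r"(\S+)=\s*(\S+)")


def parse_plugin_output(line):
    """Split 'STATUS: message | label=value ...' into its parts."""
    text, _, perf = line.partition("|")
    text = text.strip()
    if not text:
        # Nothing before the pipe: the check printed no result at all.
        return {"status": "UNKNOWN", "message": "", "perfdata": {}}
    status, _, message = text.partition(":")
    return {
        "status": status.strip(),
        "message": message.strip(),
        "perfdata": dict(PERF_PAIR.findall(perf)),
    }


if __name__ == "__main__":
    print(parse_plugin_output("OK: DISKUSED 1% | /vol/vol_testlab/= 23365520k"))
    print(parse_plugin_output("|"))
```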
If you want to monitor your NetApp, then use this. It works great on our NetApp 3170.
by Mab, December 28, 2012
Hi,
I am trying to configure the shelf part of the plugin. The rest works really well.
The first problem I have come across is that I don't really understand what all the items mean.
For example, this is the output from the SHELF check:
VoltOverFail ,
VoltUnderFail
TempUnderFail
PsFail ,
TempOver 1, 2, 3
ElectFail 1, 2
VoltUnderWarn
VoltOverWarn
FanFail 1, 2
TempUnderWarn
TempOverFail 3
VoltOverFail 24, 24
What do the numbers next to the different items mean, and does anyone know why some are blank or show N/A?
by sixarm, July 18, 2012
This worked out of the box with my NetApp FAS2040 & FAS2240 systems.
I had a bit of trouble getting the Volume Usage check to work properly, but then realised I wasn't specifying the container correctly. Below is how I got it to work in the end.
./check-netapp-ng.pl -H -C -T DISKUSED -w 80 -c 90 --vol /vol/v_datastore_vdi1/
OK: DISKUSED 41% | /vol/v_datastore_vdi1/=30458796k
Hi,
Just to tell you that this is a very good Perl script!
Awesome work!
I recommend it!
For others, some information:
(Examples from my commands.cfg)
define command{
command_name check_netapp_cpu
command_line /usr/bin/perl5.8.8 $USER1$/check-netapp-ng.pl -H $HOSTADDRESS$ -T CPULOAD -w 80 -c 90 -C public
}
define command{
command_name check_netapp_fan
command_line /usr/bin/perl5.8.8 $USER1$/check-netapp-ng.pl -H $HOSTADDRESS$ -T FAN -w 1 -c 3 -C public
}
and in services.cfg:
define service{
use generic-service,graphed-service
host_name machineX,machineY
service_description CPU Load
check_command check_netapp_cpu
}
define service{
use generic-service,graphed-service
host_name machineX,machineY
service_description Fan Fail
check_command check_netapp_fan
}
results:
OK: CPULOAD 1% | cpuload=1percent
OK: FAN 0 | failedfans=0
If that helps someone ;)
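The -w and -c values in the command definitions above are the warning and critical thresholds. A minimal Python sketch of how such thresholds are typically mapped to Nagios exit codes and to an output line like the CPULOAD result shown in this review. Whether the plugin compares with >= or > is an assumption here, and the function names are hypothetical.

```python
OK, WARNING, CRITICAL = 0, 1, 2   # standard Nagios plugin exit codes
NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def evaluate(value, warn, crit):
    """Map a measured value to a Nagios state (assumes >= comparisons)."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_cpuload(percent, warn, crit):
    """Return (exit code, output line) in the style of the results above."""
    state = evaluate(percent, warn, crit)
    return state, f"{NAMES[state]}: CPULOAD {percent}% | cpuload={percent}percent"


if __name__ == "__main__":
    code, line = format_cpuload(1, 80, 90)
    print(code, line)
```

A real plugin would print the line and then exit with the returned code so that Nagios can pick up the state.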