Check hardware running VMware ESXi
The free edition of VMware ESXi does not let you install your hardware manufacturer's agents (Dell OpenManage, for example). This Python script queries the VMware CIM agent so that you can monitor the overall health of your hardware.
Pre-req : python with pywbem module
Usage : ./check_esx_wbem.py hostname user password [verbose]
Example : ./check_esx_wbem.py https://myesxi:5989 root password
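To give a feel for what a "global health" check involves, here is a minimal sketch. It is not the plugin's actual code. It assumes the DMTF CIM `OperationalStatus` value map (0 = Unknown, 2 = OK, 3 = Degraded, 6 = Error, and so on) and reduces a list of per-element values to a single Nagios exit code. The function names and the grouping of values into WARNING and CRITICAL are illustrative assumptions.

```python
# Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed grouping of CIM OperationalStatus values; adjust to your needs.
HEALTHY = {2}          # 2 = OK
NO_INFORMATION = {0}   # 0 = Unknown; treated here as "nothing to report"
DEGRADED = {3, 4, 5}   # Degraded, Stressed, Predictive Failure


def element_state(op_status):
    """Map one OperationalStatus value (or None) to a Nagios state."""
    if op_status is None or op_status in NO_INFORMATION or op_status in HEALTHY:
        return OK
    if op_status in DEGRADED:
        return WARNING
    # Everything else (Error, Non-Recoverable Error, ...) counts as CRITICAL here.
    return CRITICAL


def global_state(op_statuses):
    """Return the worst state across all elements, or UNKNOWN if there are none."""
    states = [element_state(status) for status in op_statuses]
    if not states:
        return UNKNOWN
    return max(states)
```

A wrapper script would typically call `sys.exit(global_state(values))` so that Nagios sees the result as the plugin's return code.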
Reviews (10)
by Mpke, May 8, 2014
Do you have any suggestions for setting up the plugin in an environment where I can't use the root user for monitoring?
by JP2013, October 14, 2013
Great plugin, but unfortunately on some servers the script does not seem to list the hard disks.
It says: Check classe VMware_StorageExtent
but it doesn't show the disks on some of our servers (4 out of 10).
Does anyone know a solution for this?
by CircaLucid, June 4, 2013
ESXi4.1 on a Dell PowerEdge 2900 works perfectly. Thank you!
You should check out the new version of this plugin:
http://exchange.nagios.org/directory/Plugins/Operating-Systems/%2A-Virtual-Environments/VMWare/check_esxi_hardware-2Epy/details
by nicola.sarobba, January 19, 2012
The plugin works fine with ESXi 4.1 (the new licensed version).
If you do not want to use the root account, but a dedicated account for monitoring, you can try these steps:
In vSphere:
- create a nagios user
- add this user to the root group
- assign the "No access" role to the nagios user
The plugin will then have the access it needs, but the nagios user will not be able to log in through the vSphere client, the console, or SSH.
by Majed, October 6, 2011
Here is a guide to getting it working, to save you some time (Gentoo configuration):
1. Emerge python
2. Download pywbem-0.7.0.tar.gz from http://sourceforge.net/projects/pywbem/files/pywbem/pywbem-0.7/
3. tar -xvf pywbem-0.7.0.tar.gz
4. cd pywbem-0.7.0
5. python setup.py build
6. python setup.py install
7. to test pywbem : $ python
Python 2.6.6 (r266:84292, Sep 14 2011, 06:53:15)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pywbem
>>>
8. srvmon-Bellakt plugins # ./check_esx_wbem.py https://10.1.1.xxx:5989 root password
If you get the following error (I was testing on a virtual machine):
Traceback (most recent call last):
File "./check_esx_wbem.py", line 75, in
instance_list = wbemclient.EnumerateInstances(classe)
File "/usr/lib/python2.6/site-packages/pywbem/cim_operations.py", line 404, in EnumerateInstances
**params)
File "/usr/lib/python2.6/site-packages/pywbem/cim_operations.py", line 219, in imethodcall
raise CIMError(code, tt[1]['DESCRIPTION'])
pywbem.cim_operations.CIMError: (6, u'The requested object could not be found')
9. To remove this error, comment out the memory class in the script (# 'CIM_Memory',)
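As an alternative to editing the class list by hand, the check could skip any class the host does not provide. The following is only a sketch under stated assumptions, not code from the plugin: `enumerate_available` and its parameters are hypothetical names, `client` is any object with an `EnumerateInstances(name)` method (such as a `pywbem.WBEMConnection`), and the numeric CIM status code is assumed to be the first argument of the raised exception, as in the traceback above.

```python
CIM_ERR_NOT_FOUND = 6  # the code shown in the traceback above


def enumerate_available(client, class_names, error_type=Exception):
    """Enumerate each class, skipping those the host reports as not found.

    Pass error_type=pywbem.CIMError when using a real pywbem connection.
    Returns (found, skipped): a dict of class name -> instances, and a list
    of class names the host did not know about.
    """
    found, skipped = {}, []
    for name in class_names:
        try:
            found[name] = client.EnumerateInstances(name)
        except error_type as exc:
            # Assumption: the CIM status code is the first exception argument.
            if exc.args and exc.args[0] == CIM_ERR_NOT_FOUND:
                skipped.append(name)
                continue
            raise  # any other error is still reported to the caller
    return found, skipped
```

With this approach a virtual machine that lacks CIM_Memory would simply end up with that class in `skipped`, while authentication or connection problems would still surface as errors.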
by rnrepp, December 21, 2010
The version I am running has been through six revisions compared to this version.
Here's the change log of the version I have:
#@---------------------------------------------------
#@ History
#@---------------------------------------------------
#@ Date : 20080820
#@ Author : David Ligeret
#@ Reason : Initial release
#@---------------------------------------------------
#@ Date : 20080821
#@ Author : David Ligeret
#@ Reason : Add verbose mode
#@---------------------------------------------------
#@ Date : 20090219
#@ Author : Joshua Daniel Franklin
#@ Reason : Add try/except to catch AuthError and CIMError
#@---------------------------------------------------
#@ Date : 20100202
#@ Author : Branden Schneider
#@ Reason : Added HP Support (HealthState)
#@---------------------------------------------------
#@ Date : 20100512
#@ Author : Claudio Kuenzler www.claudiokuenzler.com
#@ Reason : Combined different versions (Joshua and Branden)
#@ Reason : Added hardware type switch (dell or hp)
#@---------------------------------------------------
#@ Date : 20100626/28
#@ Author : Samir Ibradzic www.brastel.com
#@ Reason : Added basic server info
#@ Reason : Wanted to have server name, serial number & bios version at output
#@ Reason : Set default return status to Unknown
#@---------------------------------------------------
#@ Date : 20100702
#@ Author : Aaron Rogers www.cloudmark.com
#@ Reason : GlobalStatus was incorrectly getting (re)set to OK with every CIM element check
#@---------------------------------------------------
#@ Date : 20100705
#@ Author : Claudio Kuenzler www.claudiokuenzler.com
#@ Reason : Due to change 20100702 all Dell servers would return UNKNOWN instead of OK...
#@ Reason : ... so added Aaron's logic at the end of the Dell checks as well
by suesst, May 25, 2010
Hi,
Great plugin. The only thing I noticed is that it detects disk errors (as an experiment, I pulled a disk out of a RAID 5 array), but the status is not set to warning or critical.
Here is the verbose output:
20100525 16:49:12 Check classe CIM_ComputerSystem
20100525 16:49:12 Element Name = System Board 7:1
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = System Internal Expansion Board 16:1
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = System Internal Expansion Board 16:2
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = System Internal Expansion Board 16:5
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = esx-test01.roland-domaene.intra
20100525 16:49:12 Element Name = Hardware Management Controller (Node 0)
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = HP Smart Array P400 Controller : HPSA1
20100525 16:49:12 Check classe CIM_NumericSensor
20100525 16:49:12 Element Name = System Board 1 Power Meter
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Power Domain 1 Temp 5
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Processor 11 Temp 4
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Processor 10 Temp 3
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = External Environment 9 Temp 2
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = System Internal Expansion Board 5 Temp 1
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Processor 4 Fan 6
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Processor 3 Fan 5
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Processor 2 Fan 4
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Processor 1 Fan 3
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = System Internal Expansion Board 2 Fan 2
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = System Internal Expansion Board 1 Fan 1
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Check classe CIM_Memory
20100525 16:49:12 Element Name = Proc 1 Level-1 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-1 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-1 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-1 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-2 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-2 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-2 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-2 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-3 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Memory
20100525 16:49:12 Check classe CIM_Processor
20100525 16:49:13 Element Name = Proc 1
20100525 16:49:13 Element Op Status = 2
20100525 16:49:13 Check classe CIM_RecordLog
20100525 16:49:13 Element Name = IPMI SEL
20100525 16:49:13 Element Op Status = 2
20100525 16:49:13 Check classe OMC_DiscreteSensor
20100525 16:49:13 Element Name = System Board 1 Fans
20100525 16:49:13 Element Op Status = 2
20100525 16:49:13 Element Name = Processor Module 1 VRM 1
20100525 16:49:13 Element Name = Power Supply 3 Power Supplies
20100525 16:49:13 Element Op Status = 2
20100525 16:49:13 Element Name = Power Supply 2 Power Supply 2: Presence detected
20100525 16:49:13 Element Name = Power Supply 2 Power Supply 2: Failure detected
20100525 16:49:13 Element Op Status = 2
20100525 16:49:13 Element Name = Power Supply 1 Power Supply 1: Presence detected
20100525 16:49:13 Element Name = Power Supply 1 Power Supply 1: Failure detected
20100525 16:49:13 Element Op Status = 2
20100525 16:49:13 Element Name = System Chassis 3 Ext. Health LED
20100525 16:49:13 Element Name = System Chassis 2 Int. Health LED
20100525 16:49:13 Element Name = System Chassis 1 UID Light
20100525 16:49:13 Check classe VMware_StorageExtent
20100525 16:49:13 Element Name = Disk 1 on HPSA1 : Port 1I Box 1 Bay 6 : 136GB : Data Disk : Disk Error
20100525 16:49:13 Element Name = Disk 2 on HPSA1 : Port 1I Box 1 Bay 5 : 136GB : Data Disk
20100525 16:49:13 Element Name = Disk 3 on HPSA1 : Port 2I Box 1 Bay 3 : 136GB : Data Disk
20100525 16:49:13 Element Name = Disk 4 on HPSA1 : Port 2I Box 1 Bay 2 : 136GB : Data Disk
20100525 16:49:13 Element Name = Disk 5 on HPSA1 : Port 2I Box 1 Bay 1 : 136GB : Data Disk
20100525 16:49:13 Check classe VMware_Controller
20100525 16:49:13 Element Name = HP Smart Array P400 Controller : HPSA1
20100525 16:49:13 Check classe VMware_StorageVolume
20100525 16:49:13 Element Name = Logical Volume 1 on HPSA1 : RAID 5 : 546GB : Disk 1,2,3,4,5 : Interim Recovery
20100525 16:49:13 Check classe VMware_Battery
20100525 16:49:14 Element Name = Battery on HPSA1
20100525 16:49:14 Check classe VMware_SASSATAPort
OK
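In the output above, the failed disk and the degraded volume carry their state only as the last field of the element name ("Disk Error", "Interim Recovery") and have no "Element Op Status" line. One possible workaround, sketched here as an assumption and not as the plugin's behaviour, is to scan the verbose output for such trailing fields. The marker list and the function name are hypothetical and based only on this one sample.

```python
import re

# Hypothetical markers, taken from the element names in the output above.
FAILURE_MARKERS = ("Disk Error", "Interim Recovery")

NAME_LINE = re.compile(r"Element Name = (.*)$")


def flagged_elements(verbose_output, markers=FAILURE_MARKERS):
    """Return element names whose last ' : '-separated field is a failure marker."""
    flagged = []
    for line in verbose_output.splitlines():
        match = NAME_LINE.search(line)
        if not match:
            continue
        name = match.group(1).strip()
        if name.split(" : ")[-1] in markers:
            flagged.append(name)
    return flagged
```

A wrapper could then return WARNING or CRITICAL whenever the list is non-empty, regardless of the final "OK" printed by the script.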
Works OK, but the output isn't always the best. It needs a small rewrite to produce better information. As a basic check tool it works fine.