Check any zombie processes over machine
1.0.0
2017-01-11
- Nagios 2.x
- Nagios 3.x
- Nagios 4.x
- Nagios XI
11225
Receives arguments:
$1 = process name to check
$2 = Warning level
$3 = Critical level
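A minimal sketch of how a script might validate these three arguments before doing any work. The function names `is_uint` and `validate_args` are hypothetical and are not taken from the listed plugin. The sketch assumes the thresholds are non-negative integers and that a bad invocation should report the Nagios UNKNOWN state (exit code 3).

```shell
#!/bin/bash
# Hypothetical helpers, not part of the listed plugin.

# Succeeds only if $1 is a non-empty string of digits.
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Expects: <process name> <warning level> <critical level>.
# Returns 0 when the arguments look valid, otherwise prints a message
# and returns 3 (the Nagios UNKNOWN state).
validate_args() {
    if [ "$#" -ne 3 ]; then
        echo "UNKNOWN - usage: check_chrome_proc <process name> <warning> <critical>"
        return 3
    fi
    if ! is_uint "$2" || ! is_uint "$3"; then
        echo "UNKNOWN - warning and critical levels must be non-negative integers"
        return 3
    fi
    if [ "$2" -gt "$3" ]; then
        echo "UNKNOWN - warning level ($2) must not exceed critical level ($3)"
        return 3
    fi
    return 0
}

# Usage inside a plugin (commented out so the sketch can be sourced safely):
# validate_args "$@" || exit $?
```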
Client Side:
First, search Nagios Exchange (https://exchange.nagios.org/), a community repository similar to GitHub, for a plugin that already does what you need; this can save you from writing a shell script from scratch. If nothing suitable exists, write the script yourself.
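If you do end up writing the script, its core could look roughly like the sketch below. This is an illustration and not the listed plugin's code. The function name `count_old_processes` is hypothetical, and the 1800-second (30-minute) age limit is an assumption based on the "over 30 minutes" description used later in this page. The commented `ps` invocation assumes a Linux `ps` that supports the `etimes` column.

```shell
#!/bin/bash
# Hypothetical helper: count processes whose command line contains a given
# string and whose elapsed time is at least a given number of seconds.
# Reads lines of "<elapsed seconds> <command line>" from standard input.
count_old_processes() {
    awk -v name="$1" -v min_age="$2" '
        $1 + 0 >= min_age && index($0, name) { n++ }
        END { print n + 0 }
    '
}

# Possible usage on Linux (assumes procps ps with the etimes column):
# ps -eo etimes=,args= | count_old_processes /opt/google/chrome 1800
```

Reading from standard input keeps the counting logic separate from `ps`, so it can be tried with sample lines before running it against a live process list.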
Test the script locally on the machine first and make sure it works. It must comply with the Nagios exit code and output conventions, which are described here:
https://geekpeek.net/nagios-plugin-bash/
Following these conventions matters: if the script does not, the server will not know how to parse the check output.
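The exit code convention can be sketched as follows: 0 means OK, 1 means WARNING, 2 means CRITICAL, and 3 means UNKNOWN. The function name `report_status` is hypothetical, the use of "greater than or equal" for the thresholds is an assumption, and the output line is modelled on the sample output shown in the graphing section of this page.

```shell
#!/bin/bash
# Hypothetical helper: print a one-line status and return the matching
# Nagios state code (0 = OK, 1 = WARNING, 2 = CRITICAL).
# Arguments: <count> <process name> <warning level> <critical level>
report_status() {
    count=$1
    process=$2
    warn=$3
    crit=$4

    # Assumption: a count equal to a threshold already triggers that state.
    if [ "$count" -ge "$crit" ]; then
        state=CRITICAL; code=2
    elif [ "$count" -ge "$warn" ]; then
        state=WARNING; code=1
    else
        state=OK; code=0
    fi

    echo "Open Chrome browsers $state - $count zombies: $process processes"
    return "$code"
}

# Usage at the end of a plugin (commented out so the sketch can be sourced):
# report_status "$count" "$1" "$2" "$3"; exit $?
```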
Once the script is finalized, open this file on the client:
vi /etc/nagios/nrpe.cfg
Add the path to your command. By convention, custom scripts live under a path like this:
/usr/local/nagios/libexec/check_chrome_proc
Add your command below the other commands at the bottom of the file, like this:
command[check_chrome_proc]=/usr/local/nagios/libexec/check_chrome_proc /opt/google/chrome 5 20
(This script takes arguments. If yours does not, it should: by Nagios convention, the warning and critical levels are passed in from outside.)
After updating nrpe.cfg, restart the service:
/etc/init.d/nagios-nrpe-server restart
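A quick sanity check for the nrpe.cfg edit above can be sketched like this. The function name `nrpe_command_defined` is hypothetical; it only greps the file for an uncommented `command[<name>]=` line and does not validate the script path or its arguments.

```shell
#!/bin/bash
# Hypothetical helper: succeed if an uncommented command[<name>]= line
# exists in the given NRPE configuration file.
# Arguments: <command name> <path to nrpe.cfg>
nrpe_command_defined() {
    grep -Eq "^[[:space:]]*command\[$1\][[:space:]]*=" "$2"
}

# Possible usage on the client:
# nrpe_command_defined check_chrome_proc /etc/nagios/nrpe.cfg && echo "defined"
```

Once the service is back up, the command can also be exercised from the Nagios server with `check_nrpe -H <client host> -c check_chrome_proc`, assuming the check_nrpe plugin is installed there.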
Server Side:
Now update this file:
/Server/etc/objects/commands.cfg
Add your script as a command:
Change command_name from check_chrome_proc to your script's name.
Change command_line to the correct path of the script.
For example:
# Check Chrome zombie processes (running for over 30 minutes) in the machine's process list
# Receives arguments: $1 = process name to check, $2 = warning level, $3 = critical level
define command{
    command_name    check_chrome_proc
    command_line    $USER1$/check_chrome_proc /opt/google/chrome 10 30
}
Then open the machine's configuration file (here, mydtbld0178.cfg) and add the relevant service, for example:
service_description is the string that the Nagios UI displays.
check_command check_nrpe!check_chrome_proc: replace check_chrome_proc (the part after check_nrpe!) with your script's name.
# Check Chrome zombie processes (running for over 30 minutes) in the machine's process list
define service{
    use                     generic-service-without-notify
    host_name               mydtbld0178.hpeswlab.net
    service_description     Open Chrome browsers
    check_command           check_nrpe!check_chrome_proc
    max_check_attempts      5
    check_interval          10
    retry_interval          5
}
Finally, don't forget to pull from Git and restart the Nagios process.
Only Server side:
If your script's output follows the conventions, the only thing left to do is edit this file:
Server/etc/nagiosgraph/map
Add a new rule that captures the relevant number using a Perl regular expression.
If you need more guidance, there are instructions at the top of that file.
If Nagios does not display the graph, something is wrong with your regex.
A good way to find the problem is to SSH into the machine and check the Docker logs of the container:
docker logs --tail=100 nagios_nga
An example of a regex rule:
# Service type: Open Chrome browsers
# output: Open Chrome browsers CRITICAL - 430 zombies: /opt/google/chrome processes
#/output:Open Chrome browsers.*?(\d+)zombies:.*processes/
/output:Open Chrome browsers.*?(\d+).*?zombies:.*?processes/
and push @s, [ 'openChromeBrowsers',
[ 'openChromeBrowsers', GAUGE, $1 ] ];
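To see which number a rule like this would capture, the sample output can be run through a similar pattern on the command line. This sketch is an approximation and is not the nagiosgraph rule itself: the map file uses Perl syntax, whereas this uses a POSIX extended regex with `sed -E`, where `[^0-9]*` stands in for Perl's lazy `.*?` and `[0-9]+` for `\d+`. The function name `extract_zombie_count` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the first number that follows the service name
# in a plugin output line, or nothing if the line does not match.
extract_zombie_count() {
    printf '%s\n' "$1" |
        sed -nE 's/^Open Chrome browsers[^0-9]*([0-9]+).*zombies:.*processes$/\1/p'
}

# Possible usage with the sample output from the rule's comment:
# extract_zombie_count "Open Chrome browsers CRITICAL - 430 zombies: /opt/google/chrome processes"
```

If this prints nothing for a real output line, the line probably does not have the shape the rule expects, which is a common reason for a missing graph.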
More information can be found within the link