This asset is forked from an upstream project at https://github.com/zlacelle/nagios_check_zfs_linux that does not yet exist as a Bonsai asset.
v2.0.1 · public · Published almost 4 years ago
Author: Zachary LaCelle
License: GPLv3
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.
This is a Python script that checks the health, capacity, fragmentation,
and other properties of ZFS pools (such as deduplication and compression
ratios) for use with Nagios monitoring. It also provides perfdata output.
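To illustrate what the perfdata portion looks like to a consumer, here is a minimal Python sketch that splits a plugin output line at the "|" separator and parses the label=value;warn;crit;min;max fields. It is not part of check_zfs.py: the function name parse_plugin_output is hypothetical, and the sample line is abbreviated from the example output that appears in the sudo troubleshooting section of this README.

```python
import re

# Hypothetical helper for illustration; not part of check_zfs.py.
def parse_plugin_output(line):
    """Split a Nagios plugin output line into status text and perfdata."""
    text, _, perf = line.partition("|")
    metrics = {}
    for item in perf.split():
        label, _, rest = item.partition("=")
        fields = rest.split(";")
        # The first field is the value with an optional unit suffix (%, GB, ...).
        match = re.match(r"^(-?[0-9.]+)(.*)$", fields[0])
        if not match:
            continue
        metrics[label] = {
            "value": float(match.group(1)),
            "unit": match.group(2),
            "warn": fields[1] if len(fields) > 1 else "",
            "crit": fields[2] if len(fields) > 2 else "",
        }
    return text.strip(), metrics

# Sample abbreviated from the example output in this README.
sample = ("OK: POOL: rpool, STATUS: ONLINE, FRAG: 5%, CAP: 22% "
          "| frag=5%;;; cap=22%;;; dedup=1.00 size=928.0GB;;; health=0;1;3;")
status_text, perfdata = parse_plugin_output(sample)
print(perfdata["cap"]["value"], perfdata["size"]["unit"])
```

Empty warn and crit fields, as in frag=5%;;;, simply mean that no thresholds were passed for that metric.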
As others report testing with different OS/ZFS versions, I will add them
in for each tag. It's likely that this plugin works with relatively
recent versions of Ubuntu, CentOS, and OmniOS, as the Python
and ZFS functions it uses rarely change.
NOTE: Dropping support for Python <3.5. Effectively, this means Ubuntu <16.04.
For testing on older releases, see the CHANGELOG associated with the tag.
This assumes you've set up a separate include directory in your nagios.cfg
to store all of your local configuration files. One example would be:
cfg_dir=/usr/local/nagios/etc/objects/conf.d
This also assumes you've installed check_zfs.py in your nagios/libexec directory. One way to do this is to
create a symlink there that points to the copy in your checkout of the Git repository.
Below is an example set of command definitions which allow various levels of fidelity in zpool querying:
#Commands to check zpool status
define command {
    command_name    check_zpool_full
    command_line    $USER1$/check_zfs.py --capacity $ARG2$ $ARG3$ --fragmentation $ARG4$ $ARG5$ $ARG1$
}
define command {
    command_name    check_zpool_capacity
    command_line    $USER1$/check_zfs.py --capacity $ARG2$ $ARG3$ $ARG1$
}
define command {
    command_name    check_zpool
    command_line    $USER1$/check_zfs.py $ARG1$
}
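The three command definitions differ only in which optional thresholds they pass before the pool name. As a rough illustration, the following Python sketch parses the same argument shapes with argparse and maps a measured percentage onto the standard Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL). It is not the parser used by check_zfs.py: build_parser and classify are hypothetical names, and the rule that a value at or above a threshold triggers that state is an assumption made for the example.

```python
import argparse

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes

def build_parser():
    # Hypothetical parser mirroring the flags used in the command definitions.
    parser = argparse.ArgumentParser(prog="check_zfs_sketch")
    parser.add_argument("pool", help="name of the zpool to check")
    parser.add_argument("--capacity", nargs=2, type=int,
                        metavar=("WARN", "CRIT"))
    parser.add_argument("--fragmentation", nargs=2, type=int,
                        metavar=("WARN", "CRIT"))
    return parser

def classify(value, thresholds):
    # Assumption: a value at or above a threshold triggers that state.
    if thresholds is None:
        return OK  # no thresholds given, so this metric is never flagged
    warn, crit = thresholds
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK

# Same argument shape as the check_zpool_capacity command definition.
args = build_parser().parse_args(["--capacity", "70", "85", "storage"])
print(args.pool, classify(72, args.capacity), classify(40, args.fragmentation))
```

With only --capacity supplied, the fragmentation thresholds are None, so fragmentation never changes the result in this sketch.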
Below is an example service definition using the above commands which will check the pool named "storage"
for both capacity (warn at 70%, critical at 85%) and fragmentation (warn at 30%, critical at 40%):
define service {
    use                     local-service
    host_name               localhost
    service_description     ZPOOL STORAGE
    check_command           check_zpool_full!storage!70!85!30!40
    notifications_enabled   1
}
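To make the argument mapping concrete, this Python sketch mimics how a check_command is split on "!" and how the resulting values are substituted for the $ARGn$ macros in a command_line. It is a simplified illustration rather than Nagios's actual implementation: expand_check_command is a hypothetical name, and $USER1$ is replaced with an assumed plugin directory of /usr/local/nagios/libexec.

```python
# Hypothetical sketch of $ARGn$ macro substitution; not Nagios's real code.
COMMANDS = {
    "check_zpool_full": "$USER1$/check_zfs.py --capacity $ARG2$ $ARG3$ "
                        "--fragmentation $ARG4$ $ARG5$ $ARG1$",
}

def expand_check_command(check_command, user1="/usr/local/nagios/libexec"):
    # The command name and its arguments are separated by "!".
    name, *args = check_command.split("!")
    # Assumption: $USER1$ points at the plugin directory.
    line = COMMANDS[name].replace("$USER1$", user1)
    for index, value in enumerate(args, start=1):
        line = line.replace(f"$ARG{index}$", value)
    return line

print(expand_check_command("check_zpool_full!storage!70!85!30!40"))
```

Under these assumptions, the pool name "storage" ends up as the final positional argument, after the capacity thresholds 70 and 85 and the fragmentation thresholds 30 and 40.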
Below is an example NRPE configuration which will accept the single argument "check_zpool" and run the
commands:
TODO
The zfs command binaries that Check_ZFS_Linux uses to perform its checks are the following:
/sbin/zfs
/sbin/zpool
They need to be run as root in order to work properly. Since Nagios / NCPA / NRPE all typically
run as the user 'nagios', we need to give Nagios sudo access in order to run these binaries
correctly. If Check_ZFS_Linux has problems sudo'ing to root, you'll see errors along these lines:
UNKNOWN : process must be run as root. Possible solution: add the following to your visudo: nagios ALL=NOPASSWD: Context: Zpool command - retval. Original command: "['/usr/bin/sudo', '-n', '/sbin/zpool', 'list', 'rpool']", then run check script with --nosudo option.
When debugging these issues, it may prove useful to run them from the host being monitored directly.
For example, here the check_zfs.py script succeeds when run as root:
root@hydrox:~/ncpa/plugins# whoami
root
root@hydrox:~/ncpa/plugins# ./check_zfs.py rpool
OK: POOL: rpool, STATUS: ONLINE, SIZE: 928G, ALLOC: 209G, FREE: 719G, DEDUP: 1.00x, COMPRESS: 1.11x, FRAG: 5%, CAP: 22% | frag=5%;;; cap=22%;;; dedup=1.00 compress_ratio=1.11 size=928.0GB;;; alloc=209.0GB;;; free=719.0GB;;; health=0;1;3;
root@hydrox:~/ncpa/plugins#
But when run as the 'nagios' user, it fails:
root@hydrox:~/ncpa/plugins# su nagios
$ whoami
nagios
$ ./check_zfs.py rpool
UNKNOWN : process must be run as root. Possible solution: add the following to your visudo: nagios ALL=NOPASSWD: Context: Zpool command - retval. Original command: "['/usr/bin/sudo', '-n', '/sbin/zpool', 'list', 'rpool']", then run check script with --nosudo option.
$
So, how do you fix this?
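As the error suggests, try adding sudoers entries for the nagios user, either in /etc/sudoers itself or in a drop-in file under /etc/sudoers.d. Best practice is to edit these files with the visudo utility (for a drop-in file, visudo -f followed by the file path), since it checks the syntax before saving. Here is what it might look like afterwards: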
root@hydrox:/etc/sudoers.d# pwd
/etc/sudoers.d
root@hydrox:/etc/sudoers.d# ls -l
total 9
-r--r----- 1 root root 958 Feb 1 22:41 README
-r--r----- 1 root root 696 Jul 11 18:09 zfs
root@hydrox:/etc/sudoers.d# cat zfs
nagios ALL=NOPASSWD: /sbin/zfs
nagios ALL=NOPASSWD: /sbin/zpool
After you've given the nagios user access to run the zfs commands, try to run the check_zfs script again. Hopefully it now works, but if not, try using the '--nosudo' option.
root@hydrox:~/ncpa/plugins# su nagios
$ whoami
nagios
$ ./check_zfs.py rpool --nosudo
OK: POOL: rpool, STATUS: ONLINE, SIZE: 928G, ALLOC: 209G, FREE: 719G, DEDUP: 1.00x, COMPRESS: 1.11x, FRAG: 5%, CAP: 22% | frag=5%;;; cap=22%;;; dedup=1.00 compress_ratio=1.11 size=928.0GB;;; alloc=209.0GB;;; free=719.0GB;;; health=0;1;3;
$
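The "Original command" in the error message shows the plugin prefixing zpool with /usr/bin/sudo -n. The following Python sketch shows one way such a command list could be built and run, assuming that --nosudo simply drops the sudo prefix. It is not the code in check_zfs.py, and build_zpool_command and run_zpool_list are hypothetical names.

```python
import subprocess

def build_zpool_command(pool, use_sudo=True):
    # Assumption: --nosudo corresponds to use_sudo=False and omits the prefix.
    command = ["/sbin/zpool", "list", pool]
    if use_sudo:
        # -n makes sudo fail immediately instead of prompting for a password.
        command = ["/usr/bin/sudo", "-n"] + command
    return command

def run_zpool_list(pool, use_sudo=True):
    # Returns (exit status, stdout); a nonzero status means the command failed.
    result = subprocess.run(build_zpool_command(pool, use_sudo),
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    return result.returncode, result.stdout

# Only the command lists are printed here; nothing is executed.
print(build_zpool_command("rpool"))
print(build_zpool_command("rpool", use_sudo=False))
```

Running the first command list by hand as the nagios user is a quick way to confirm whether the sudoers entries are taking effect.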
From commands.cfg on the Nagios host:
#
# NCPA driven remote-ZFS check
#
define command {
    #
    # Arg1 = token (community string)
    # Arg2 = ZFS pool name
    #
    command_name    check-zfs-on-remote-host
    command_line    $USER1$/check_ncpa.py -H $HOSTADDRESS$ -t $ARG1$ -M 'plugins/check_zfs.py' -a "$ARG2$ --nosudo"
}
From ncpa.cfg on the remote host being monitored:
#
# Extensions for plugins
# ----------------------
...
# Forcing move to python 3
.py = /usr/bin/python3 $plugin_name $plugin_args
Place check_zfs.py in the NCPA plugins directory (for example, /root/ncpa/plugins).
On systems with SELinux in enforcing mode, nrpe is not granted the
required permissions by SELinux. To work around this, you can compile a policy
module and then build a policy package from it, which can then be installed.
A sample can be used as follows:
If you want to unload it: semodule -r check_zfs_py