
Adaptec RAID Monitoring via Nagios

Monitoring servers with RAID controllers is made easy by Nagios and other monitoring systems. Today it's quite easy to install an app on your mobile and configure it to display critical errors from Nagios so that you can act on them quickly. When you're in charge of infrastructure, monitoring RAID becomes critical. While digging around for simple ways to monitor Adaptec cards, I found a tiny script on Nagios Exchange: check-aacraid.py by Anchor Systems.

This script relies on arcconf, the command-line utility from the Storage Manager, which must be installed on the host to manage the RAID cards.

Here is an excerpt from Nagios Exchange on configuring the check-aacraid script, for quick reference:

Check the health of an Adaptec raid controller using /usr/StorMan/arcconf

Checks the following:
Logical device status
Controller status
Failed & Degraded drives

If battery present:
Charging status
Est of charge time left
Charge left %

And removes the log file “UcliEvt.log” that is dropped into the CWD when /usr/StorMan/arcconf is run.
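
To make the list of checks above more concrete, here is a minimal Python sketch of how status lines from arcconf could be mapped to Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). It is not the actual check-aacraid.py code. The sample text, the field labels, the helper names `field_values` and `evaluate`, and the rule that "Degraded" means WARNING are all assumptions made for illustration; real arcconf output may be laid out differently.

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Hypothetical sample of arcconf output (assumed layout, for illustration only).
SAMPLE = """\
Controller Status                        : Optimal
Status of logical device                 : Degraded
"""


def field_values(text, label):
    """Return every value found on lines of the form 'label : value'."""
    pattern = re.compile(
        r"^\s*" + re.escape(label) + r"\s*:\s*(.+?)\s*$", re.MULTILINE
    )
    return pattern.findall(text)


def evaluate(text):
    """Map controller and logical device status to (exit_code, message)."""
    controller = field_values(text, "Controller Status")
    logical = field_values(text, "Status of logical device")

    # If neither field can be found, report UNKNOWN rather than guess.
    if not controller or not logical:
        return UNKNOWN, "UNKNOWN: could not parse arcconf output"

    if any(value != "Optimal" for value in controller):
        return CRITICAL, "CRITICAL: controller status " + ", ".join(controller)

    bad = [value for value in logical if value != "Optimal"]
    if bad:
        # Assumption: a degraded array still serves data, so it is a WARNING;
        # any other non-optimal state is treated as CRITICAL.
        code = WARNING if all(value == "Degraded" for value in bad) else CRITICAL
        prefix = "WARNING" if code == WARNING else "CRITICAL"
        return code, prefix + ": logical device status " + ", ".join(bad)

    return OK, "OK: controller and logical devices optimal"
```

A real plugin would print the message and exit with the code, for example `code, message = evaluate(output); print(message); sys.exit(code)`, so that Nagios can pick up both.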

Add this line to your “/etc/sudoers” file using visudo:
nagios ALL=(root) NOPASSWD: /usr/StorMan/arcconf GETCONFIG 1 *

## On RHEL & possibly others ##
Disable “Defaults requiretty” in /etc/sudoers, otherwise the command will not run via NRPE (NRPE runs the check without a terminal attached).
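
As a rough illustration of why the sudoers rule and the log cleanup matter, here is a Python sketch of how a plugin running as the nagios user might call arcconf through sudo and then delete the “UcliEvt.log” file left in the working directory. The helper names `build_command` and `run_arcconf`, the `AL` argument, and the 30-second timeout are assumptions for this sketch and are not taken from check-aacraid.py.

```python
import os
import subprocess

ARCCONF = "/usr/StorMan/arcconf"  # path used in the sudoers rule


def build_command(controller=1, section="AL"):
    """Build the sudo command line for arcconf GETCONFIG.

    The sudoers rule only covers controller 1, so other controller
    numbers would need their own rule. "AL" is an assumed argument here.
    sudo's -n flag makes it fail instead of prompting for a password.
    """
    return ["sudo", "-n", ARCCONF, "GETCONFIG", str(controller), section]


def run_arcconf(command, workdir):
    """Run the command in workdir and return (returncode, stdout).

    Whatever happens, remove the UcliEvt.log file from workdir afterwards.
    """
    try:
        result = subprocess.run(
            command, cwd=workdir, capture_output=True, text=True, timeout=30
        )
        return result.returncode, result.stdout
    finally:
        log_path = os.path.join(workdir, "UcliEvt.log")
        if os.path.exists(log_path):
            os.remove(log_path)
```

Running the command in a dedicated working directory keeps the stray log file in a known place, which makes the cleanup step simple and safe.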

Add this to your checkcommands.cfg

define command {
    command_name    check_aacraid
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_aacraid
}

Add this to your servicedefs.cfg

define service {
    use                     low-service-level
    name                    aacraid-service
    service_description     aacraid
    check_command           check_aacraid
    register                0
    notification_interval   3600
}

Add the service

define service {
    use             aacraid-service
    host_name       host-with-crap-adaptec-crud
    contact_groups  upset-admin
}

And on the host you will be checking, add this to nrpe.cfg:
command[check_aacraid]=/usr/local/sbin/check-aacraid.py
