Hd-monitoring-url


Hard disc monitoring

On the servers, detecting failing hard discs before they die is a lot of "fun": these are HP ProLiant servers with HP RAID controllers, which are not necessarily used as RAID controllers, but the hard discs are connected to them.

array-info

sudo array-info -d /dev/cciss/c0d0
HP Smart Array E200i
        Firmware revision : 1.80
        Rom revision      : 1.80
        3  logical drives configured.

        Logical drive  2 :
          Fault tolerance : RAID 0 (Stripping)
          Size            : 68.33 GiB (143305920 blocks of 512 bytes)
          Status          : Logical drive is ok
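
The status lines in this output lend themselves to a scripted check. A minimal sketch, assuming each logical drive reports a "Status : ..." line and that the healthy text is exactly "Logical drive is ok" as in the sample above; the function name array_info_failed is made up for illustration:

```shell
# Print every "Status" value from array-info output (read on stdin) that is
# not the healthy text, and return non-zero if at least one was found.
# Assumption: the healthy text is exactly "Logical drive is ok".
array_info_failed() {
        local line status bad=0
        while IFS= read -r line
        do
                case "$line" in
                *Status*:*)
                        # Keep the text after the first colon ...
                        status="${line#*:}"
                        # ... and trim leading and trailing whitespace.
                        status="${status#"${status%%[![:space:]]*}"}"
                        status="${status%"${status##*[![:space:]]}"}"
                        if [ "$status" != "Logical drive is ok" ]
                        then
                                echo "$status"
                                bad=1
                        fi
                        ;;
                esac
        done
        return "$bad"
}
```

Possible use: sudo array-info -d /dev/cciss/c0d0 | array_info_failed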


hpsa

hpsa.txt

cciss

cciss

cpqarrayd

CpqArrayd(1)                                                    Compaq Arraycontoller monitoringtools                                                    CpqArrayd(1)

NAME
       cpqarrayd - Arraycontoller monitoringdaemon

SYNTAX
       cpqarrayd [-d] [-v] [-t host [-t host]]

DESCRIPTION
       This  program  is  capable of monitoring HP (Compaq) array controllers. Both the devices controller by the cpqarray driver and devices controlled by the cciss
       driver are supported. The program monitors the health status for each logical volume and reports if it changes. When used on a cciss controller it reports all
       events on the controller chain.

       The default is to log to the syslog facility, but the program has an option to send traps to a monitoring system.

cciss_vol_status free tool

cciss_vol_status /dev/cciss/c*d0
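
For unattended use, the exit status can drive an alert. A minimal sketch, assuming that cciss_vol_status exits with 0 only when every queried volume is OK (check the man page of the installed version); check_cciss_volumes is a hypothetical helper name:

```shell
# Run cciss_vol_status on the given devices and summarise the result.
# Assumption: exit status 0 means all queried volumes are OK.
check_cciss_volumes() {
        local out
        if out=$(cciss_vol_status "$@" 2>&1)
        then
                echo "all volumes OK"
        else
                # Show the tool's own output so the failing volume is visible.
                echo "volume problem:"
                echo "$out"
                return 1
        fi
}
```

Possible use from cron: check_cciss_volumes /dev/cciss/c*d0 || logger -t hd-monitor "RAID volume problem"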

hpacucli non free tool

HP RAID monitoring

closed source HP utils

hpacucli ctrl all diag file=report.zip ris=on xml=on zip=on

#!/bin/bash
# Count hpacucli status lines that do not end in "OK".
# Without arguments: print the number of failing lines.
# With the argument "log": print the failing lines themselves.
CON=$(hpacucli ctrl slot=0 show | grep -E "(Controller Status|Battery/Capacitor Status)")
LD=$(hpacucli ctrl slot=0 logicaldrive all show status)
PD=$(hpacucli ctrl slot=0 pd all show status)
RAID=$(hpacucli ctrl all show config detail | grep -A3 Array: | grep Status)
ERROR_COUNT=0
ERROR_STR=""

# Check every non-empty line of $1; a line is healthy if it ends in "OK".
read_stat() {
        local line
        while read -r line
        do
                if [ -n "$line" ] && [[ $line != *OK ]]
                then
                        # Collect failing lines, one per line
                        ERROR_STR="${line}"$'\n'"${ERROR_STR}"
                        ERROR_COUNT=$((ERROR_COUNT + 1))
                fi
        done <<< "$1"
}

read_stat "${CON}"
read_stat "${LD}"
read_stat "${PD}"
read_stat "${RAID}"

if [[ $1 == 'log' ]]
then
        printf '%s' "${ERROR_STR}"
else
        echo "${ERROR_COUNT}"
fi
exit 0

smartmontools (works only on HP disks?)

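sudo smartctl -i --device=cciss,1 /dev/cciss/c0d1
sudo smartctl -t long --device=cciss,1 /dev/cciss/c0d1
sudo smartctl --device=cciss,1 -a /dev/cciss/c0d1

#!/bin/bash
# Query all physical disks 0..7 behind the first controller.
# The device node only selects the controller (any existing logical drive
# on it will do); the number after "cciss," selects the physical disk.
for port in {0..7}
do
        smartctl -a -d cciss,"$port" /dev/cciss/c0d0
done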
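
The full -a output is long; for monitoring, usually only the health verdict matters. A sketch that classifies smartctl output read on stdin, assuming the verdict appears either as "SMART Health Status: OK" (SCSI/SAS disks) or as "SMART overall-health self-assessment test result: PASSED" (ATA disks); smart_verdict is a hypothetical name:

```shell
# Read smartctl output on stdin and print "ok", "failing" or "unknown".
# Assumption: the health line has one of the two prefixes matched below.
smart_verdict() {
        local line verdict="unknown"
        while IFS= read -r line
        do
                case "$line" in
                "SMART Health Status:"*|"SMART overall-health self-assessment test result:"*)
                        # Everything after the last ": " is the verdict text.
                        case "${line##*: }" in
                        OK|PASSED) verdict="ok" ;;
                        *)         verdict="failing" ;;
                        esac
                        ;;
                esac
        done
        echo "$verdict"
}
```

Possible use: sudo smartctl -H -d cciss,0 /dev/cciss/c0d0 | smart_verdict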

Smartmon supported RAID controllers

HP/Compaq SmartArray series

ATA behind SAS