This post is about using the versatile MRTG tool to visualize usage data and the health status of the storage device.
Prerequisites
- Web server to see the plots over http. The built-in mini_httpd will do, but lighttpd is fine as well if it is already installed. I advise using the first unless there is a specific reason to use the second, so that the server configuration does not have to be altered.
- Ipkg packages: cron, net-snmp and mrtg (also installs libgd and perl). The first is for periodic execution of the data acquisition, while net-snmp will collect the data which will be visualized by mrtg. Luckily again, these packages work out of the box, no compilation is required. For the usage of ipkg see my post on the basic setup.
- For cron and net-snmp to start on device startup, copy the contents of /opt/etc/init.d to /etc/init.d.
Setup MRTG
This can be done by using the built-in tool cfgmaker, for which the usage reference can be found here. The basic setup includes measuring the throughput of the ethernet interface:
#~: cfgmaker --global 'WorkDir: /proto/SxM_webui/mrtg' --global 'Options[_]: bits,growright' --output /opt/etc/mrtg.cfg public@localhost
which sets WorkDir, the directory where the graphs and html files are stored, to a location served by the web server. The config file will be stored at /opt/etc/mrtg.cfg.
I realized however that usage of the cfgmaker tool is not displayed in case of an input error or issuing --help. It seems that the /opt/lib/mrtg2/Pod/Usage.pm file shipped with the package is broken and has to be replaced with the one found here.
This basic config can be tested by issuing:
#~: mrtg /opt/etc/mrtg.cfg
at least twice and checking the result in the browser at http://IP_of_storage/mrtg/localhost_2.html (localhost_1 would be the loopback interface which is not plotted). The data can be collected automatically by adding the following line to /opt/etc/crontab:
*/5 * * * * root /opt/bin/mrtg /opt/etc/mrtg.cfg
and restarting the cron daemon:
#~: /etc/init.d/S10cron
This tells cron to run mrtg every 5 minutes. According to the mrtg manual it is not recommended to run it more frequently, and this value is usually a good choice anyway.
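If the setup is scripted, the crontab line above can be added so that running the script twice does not duplicate the job. This is only a minimal sketch: add_mrtg_cron is a hypothetical helper name, and it assumes the crontab file is the one used in this post.

```shell
#!/bin/sh
# add_mrtg_cron is a hypothetical helper, not part of cron or mrtg.
# It appends the mrtg job to the given crontab file unless the exact
# line is already present.
add_mrtg_cron() {
    crontab_file="$1"
    job='*/5 * * * * root /opt/bin/mrtg /opt/etc/mrtg.cfg'
    # -F: treat the job as a fixed string (the asterisks are not a pattern)
    # -x: match only a whole line, -q: no output
    if ! grep -qFx "$job" "$crontab_file" 2>/dev/null; then
        echo "$job" >> "$crontab_file"
    fi
}
```

It would be called as add_mrtg_cron /opt/etc/crontab, followed by the restart of the cron daemon.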
Eventually, graphs will be filled with data like this:
This is good, but we can go even further: CPU load, memory and hard disk usage and temperature data can be plotted as well. The first three data values can be collected by net-snmp, while for the last one we need to modify the script that I have posted already:
#!/bin/sh
Y1=`cat /sys/module/thermAndFan/parameters/current_temp `
C1='display(0);config("tilde","off");3865/(ln('
C2=')+9.1411)-273.16'
/usr/sbin/smartctl -a -d ata /dev/sda | awk '/Temperature/ {print $10}'
/opt/bin/calc $C1$Y1$C2 | awk 'NR==3 {print $1}'
echo 0
echo Temperature Data on Storage
exit 0
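The C1 and C2 strings in the temperature script wrap the raw sensor value in the expression 3865/(ln(raw)+9.1411)-273.16. As a cross-check, or on a system without calc, the same expression can be evaluated with awk. This is a sketch under assumptions: raw_to_celsius is a hypothetical helper name, and rounding to one decimal is my choice here, not something the original script does.

```shell
#!/bin/sh
# raw_to_celsius is a hypothetical helper. The constants are the ones
# from the C1/C2 strings in the temperature script; awk's log() is the
# natural logarithm, like ln() in calc.
raw_to_celsius() {
    awk -v raw="$1" 'BEGIN { printf "%.1f\n", 3865 / (log(raw) + 9.1411) - 273.16 }'
}
```

On the device it could be fed the same value as the script reads, for example raw_to_celsius "$(cat /sys/module/thermAndFan/parameters/current_temp)".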
The output of the temperature script consists of the four lines that mrtg expects:
- "Input bytes" data: Here it will be the temperature of the hard disk;
- "Output bytes" data: Here it will be the temperature of the system thermometer;
- Uptime: we don't need this so set it to zero.
- Text output: the name of the plots.
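Before pointing mrtg at the script, it can help to check that the output really has this shape. The following is a rough sketch, not an mrtg feature: check_mrtg_output is a hypothetical helper that reads the script output on stdin and only verifies that there are exactly four lines and that the first two look like numbers.

```shell
#!/bin/sh
# check_mrtg_output is a hypothetical helper, not shipped with mrtg.
# It succeeds (exit status 0) only if stdin has exactly four lines and
# the first two lines are plain numbers (optionally with a decimal part).
check_mrtg_output() {
    awk '
        NR <= 2 && $0 !~ /^-?[0-9]+(\.[0-9]+)?$/ { bad = 1 }
        END { if (NR != 4 || bad) exit 1 }
    '
}
```

A possible use is /opt/etc/mrtg/tempscript1.sh | check_mrtg_output && echo "output looks fine".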
I saved this script to /opt/etc/mrtg/tempscript1.sh which we will need in the revised mrtg.cfg:
# /opt/bin/cfgmaker --global "WorkDir: /proto/SxM_webui/mrtg" --global "Options[_]: bits,growright" --output /opt/etc/mrtg.cfg public@localhost
EnableIPv6: no
WorkDir: /proto/SxM_webui/mrtg
Options[_]: bits,growright
LoadMIBs: /opt/share/snmp/mibs/HOST-RESOURCES-MIB.txt
kilo[_]: 1024
XSize[_]: 600
RouterUpTime[_]: hrSystemUptime.0:public@localhost
### Interface 2 >> Descr: 'eth0' | Name: 'eth0' | Ip: '192.168.1.1' | Eth: '' ###
Target[localhost_2]: 2:public@localhost:
SetEnv[localhost_2]: MRTG_INT_IP="192.168.1.1" MRTG_INT_DESCR="eth0"
MaxBytes[localhost_2]: 12500000
Options[localhost_2]: nobanner,noborder,growright,transparent
Title[localhost_2]: Traffic Analysis for eth0
### CPU
Target[localhost.cpu]: hrProcessorLoad.768&hrProcessorLoad.768:public@localhost
Title[localhost.cpu]: CPU Load
PageTop[localhost.cpu]:Active CPU load
MaxBytes[localhost.cpu]: 100
Unscaled[localhost.cpu]: ymwd
Options[localhost.cpu]: growright,gauge,nopercent,noo,transparent,nolegend,nobanner,noborder
YLegend[localhost.cpu]: %
ShortLegend[localhost.cpu]: %
LegendI[localhost.cpu]: Processor Load
### Temperature
Target[localhost.temp]: `/opt/etc/mrtg/tempscript1.sh`
Title[localhost.temp]: Temperature Reading
PageTop[localhost.temp]: Temperature reading
MaxBytes[localhost.temp]: 75
Unscaled[localhost.temp]: ymwd
Options[localhost.temp]: growright,gauge,nopercent,transparent,nolegend,nobanner,noborder
YLegend[localhost.temp]: Celsius
ShortLegend[localhost.temp]: C
LegendI[localhost.temp]: HDD Temperature
LegendO[localhost.temp]: Sys Temperature
### Memory
Target[localhost.mem]: hrStorageUsed.1&hrStorageUsed.1:public@localhost * 1024
Title[localhost.mem]: Memory Consumption
PageTop[localhost.mem]:Memory Consumption
Options[localhost.mem]: growright,gauge,nopercent,noo,transparent,nolegend,nobanner,noborder
MaxBytes[localhost.mem]: 129871873
LegendI[localhost.mem]: Used Memory
ShortLegend[localhost.mem]: Bytes
Unscaled[localhost.mem]: ymwd
YLegend[localhost.mem]: Bytes
Kilo[localhost.mem]: 1024
### Hard drive: /
Target[localhost.hd1]: hrStorageUsed.31&hrStorageUsed.31:public@localhost * 4096
Title[localhost.hd1]: Disk Usage: /
PageTop[localhost.hd1]:Disk Usage: /
Options[localhost.hd1]: growright,gauge,nopercent,noo,transparent,nolegend,nobanner,noborder
MaxBytes[localhost.hd1]: 100000000000000000
LegendI[localhost.hd1]: Storage Used
### LegendO[localhost.hd1]: Total
Kilo[localhost.hd1]: 1024
YLegend[localhost.hd1]: Used Bytes
ShortLegend[localhost.hd1]: B
There are some important notes however:
- Hard disk usage is normalized against the 1TB version of the MBWE. The corresponding MaxBytes property has to be modified if this is to be used with a hard disk of another size.
- The global RouterUpTime property is modified so that it indeed reflects the uptime and not just the uptime of the net-snmp daemon.
- Memory usage is set to plot against the maximum of 128 MB for the white light edition. For the blue ring edition 64 MB is to be used.
- If the script for temperature reading is put somewhere else, the corresponding Target line has to be altered as well.
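When adapting MaxBytes to another disk or memory size, the value is a byte count, in the same way that the Target lines multiply hrStorageUsed by 1024 or 4096. A small sketch of that arithmetic follows; storage_bytes is a hypothetical helper, and I am assuming the two inputs are taken from the hrStorageSize and hrStorageAllocationUnits entries with the same index as the hrStorageUsed entry being plotted.

```shell
#!/bin/sh
# storage_bytes is a hypothetical helper: it multiplies a size given in
# allocation units by the allocation unit size and prints the byte count.
# awk is used instead of $(( )) because the integer width of the shell
# on the device is not known, and disk sizes do not fit in 32 bits.
storage_bytes() {
    awk -v size="$1" -v unit="$2" 'BEGIN { printf "%.0f\n", size * unit }'
}
```

For example, storage_bytes 65536 1024 gives the byte count for 64 MB of memory reported in 1024-byte units.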