19.2.10

WD MyBook Hacks - MRTG

This post is about using the versatile MRTG tool to visualize usage data and the health status of the storage device.

Prerequisites
  • Web server to see the plots over HTTP. The built-in mini_httpd will do, but lighttpd is fine as well if it is already installed. I advise using the former unless there is a specific reason to use the latter, so that the server configuration does not have to be altered.
  • Ipkg packages: cron, net-snmp and mrtg (also installs libgd and perl). The first is for periodic execution of the data acquisition, while net-snmp will collect the data which will be visualized by mrtg. Luckily again, these packages work out of the box, no compilation is required. For the usage of ipkg see my post on the basic setup.
  • For cron and net-snmp to start on device startup, copy the contents of /opt/etc/init.d to /etc/init.d.
Setup MRTG

This can be done by using the built-in tool cfgmaker, for which the usage reference can be found here. The basic setup includes measuring the throughput of the ethernet interface:

#~: cfgmaker --global 'WorkDir: /proto/SxM_webui/mrtg' --global 'Options[_]: bits,growright' --output /opt/etc/mrtg.cfg public@localhost 

which sets WorkDir, the place where the graphic output and HTML files will be stored, to a directory served by the web server. The config file will be stored at /opt/etc/mrtg.cfg.
I noticed, however, that cfgmaker does not print its usage text on an input error or when called with --help. It seems that the /opt/lib/mrtg2/Pod/Usage.pm file shipped with the package is broken and has to be replaced with the one found here.
This basic config can be tested by issuing:
#~: mrtg /opt/etc/mrtg.cfg
at least twice and checking the result in the browser at http://IP_of_storage/mrtg/localhost_2.html (localhost_1 would be the loopback interface which is not plotted). The data can be collected automatically by adding the following line to /opt/etc/crontab:
*/5 * * * * root /opt/bin/mrtg /opt/etc/mrtg.cfg
and restarting the cron daemon:
#~: /etc/init.d/S10cron
This tells cron to run mrtg every 5 minutes. According to the mrtg manual it is not recommended to run it more frequently, and this value is usually a good choice anyway.
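If the crontab entry is added from a setup script rather than by hand, it is easy to end up with the same line several times. A minimal sketch of one way to avoid that is shown here; add_cron_line is a name I made up for illustration, it is not part of cron or ipkg:

```shell
#!/bin/sh
# add_cron_line is a hypothetical helper: it appends a line to a crontab
# file only if exactly that line is not in the file yet.
add_cron_line() {
    file=$1
    line=$2
    # -q: no output, -x: match whole lines only, -F: fixed string, not a regex
    if ! grep -qxF "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# On the device the call would look like this (left commented out here):
# add_cron_line /opt/etc/crontab '*/5 * * * * root /opt/bin/mrtg /opt/etc/mrtg.cfg'
```

Calling it twice with the same arguments leaves a single entry in the file.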
Eventually, graphs will be filled with data like this:

This is good, but we can go even further: CPU load, memory and hard disk usage and temperature data can be plotted as well. The first three data values can be collected by net-snmp, while for the last one we need to modify the script that I have posted already:

#!/bin/sh
Y1=`cat /sys/module/thermAndFan/parameters/current_temp `
C1='display(0);config("tilde","off");3865/(ln('
C2=')+9.1411)-273.16'
/usr/sbin/smartctl -a -d ata /dev/sda | awk '/Temperature/ {print $10}'
/opt/bin/calc $C1$Y1$C2 | awk 'NR==3 {print $1}'
echo 0
echo Temperature Data on Storage
exit 0

The output of this script will consist of four lines that are needed by mrtg:
  1. "Input bytes" data: here, the temperature of the hard disk.
  2. "Output bytes" data: here, the temperature of the system thermometer.
  3. Uptime: we don't need this, so it is set to zero.
  4. Text output: the name of the plots.
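Before pointing mrtg at such a script, it can help to check that its output really has this four-line shape. The following is only a sketch: check_mrtg_output is a hypothetical helper name, and it assumes the two readings are plain non-negative integers, as the script above produces.

```shell
#!/bin/sh
# check_mrtg_output is a hypothetical helper: it reads a script's output on
# stdin and succeeds only if there are exactly four lines and the first two
# are plain non-negative integers (an assumption made for this sketch).
check_mrtg_output() {
    awk '
        NR <= 2 && $0 !~ /^[0-9]+$/ { bad = 1 }
        END { if (NR == 4 && !bad) exit 0; exit 1 }
    '
}

# Sample input that mimics the temperature script; the values are made up.
printf '44\n37\n0\nTemperature Data on Storage\n' | check_mrtg_output && echo "format OK"
```

On the device, the output of the real script would be piped into check_mrtg_output instead of the sample text.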
I saved the temperature script to /opt/etc/mrtg/tempscript1.sh, which is referenced in the revised mrtg.cfg:

# Created by
# /opt/bin/cfgmaker --global "WorkDir: /proto/SxM_webui/mrtg" --global "Options[_]: bits,growright" --output /opt/etc/mrtg.cfg public@localhost

EnableIPv6: no
WorkDir: /proto/SxM_webui/mrtg
Options[_]: bits,growright

LoadMIBs: /opt/share/snmp/mibs/HOST-RESOURCES-MIB.txt

kilo[_]: 1024
XSize[_]: 600
RouterUpTime[_]: hrSystemUptime.0:public@localhost

### Interface 2 >> Descr: 'eth0' | Name: 'eth0' | Ip: '192.168.1.1' | Eth: '' ###

Target[localhost_2]: 2:public@localhost:
SetEnv[localhost_2]: MRTG_INT_IP="192.168.1.1" MRTG_INT_DESCR="eth0"
MaxBytes[localhost_2]: 12500000
Options[localhost_2]: nobanner,noborder,growright,transparent
Title[localhost_2]: Traffic Analysis for eth0

 
### CPU

Target[localhost.cpu]: hrProcessorLoad.768&hrProcessorLoad.768:public@localhost
Title[localhost.cpu]: CPU Load
PageTop[localhost.cpu]:Active CPU load
 
MaxBytes[localhost.cpu]: 100
Unscaled[localhost.cpu]: ymwd
Options[localhost.cpu]: growright,gauge,nopercent,noo,transparent,nolegend,nobanner,noborder
YLegend[localhost.cpu]: %
ShortLegend[localhost.cpu]: %
LegendI[localhost.cpu]: Processor Load

### Temperature

Target[localhost.temp]: `/opt/etc/mrtg/tempscript1.sh`
Title[localhost.temp]: Temperature Reading
PageTop[localhost.temp]: Temperature reading
 
MaxBytes[localhost.temp]: 75
Unscaled[localhost.temp]: ymwd
Options[localhost.temp]: growright,gauge,nopercent,transparent,nolegend,nobanner,noborder
YLegend[localhost.temp]: Celsius
ShortLegend[localhost.temp]: C
LegendI[localhost.temp]: HDD Temperature
LegendO[localhost.temp]: Sys Temperature

### Memory

Target[localhost.mem]: hrStorageUsed.1&hrStorageUsed.1:public@localhost * 1024
Title[localhost.mem]: Memory Consumption
PageTop[localhost.mem]:Memory Consumption

Options[localhost.mem]: growright,gauge,nopercent,noo,transparent,nolegend,nobanner,noborder
MaxBytes[localhost.mem]: 129871873
LegendI[localhost.mem]: Used Memory
ShortLegend[localhost.mem]: Bytes
Unscaled[localhost.mem]: ymwd
YLegend[localhost.mem]: Bytes
Kilo[localhost.mem]: 1024

### Hard drive: /
Target[localhost.hd1]: hrStorageUsed.31&hrStorageUsed.31:public@localhost * 4096
Title[localhost.hd1]: Disk Usage: / 

PageTop[localhost.hd1]:Disk Usage: /
Options[localhost.hd1]: growright,gauge,nopercent,noo,transparent,nolegend,nobanner,noborder
MaxBytes[localhost.hd1]: 100000000000000000
LegendI[localhost.hd1]: Storage Used
### LegendO[localhost.hd1]: Total
Kilo[localhost.hd1]: 1024
YLegend[localhost.hd1]: Used Bytes
ShortLegend[localhost.hd1]: B

       
There are some important notes however:
  • Hard disk usage is normalized against the 1TB version of the MBWE. The corresponding MaxBytes property has to be modified if this is to be used with a hard disk of another size.
  • The global RouterUpTime property is modified so that it indeed reflects the uptime and not just the uptime of the net-snmp daemon.
  • Memory usage is set to plot against the maximum of 128 MB for the white light edition. For the blue ring edition 64MB is to be used.
  • If the script for temperature reading is put somewhere else, the corresponding Target line has to be altered as well.
If everything is OK, which can be tested by manually running 'mrtg /opt/etc/mrtg.cfg' in the shell, the other graphs will fill with data as well:

14.2.10

WD MyBook Hacks - Temperature Readings

This is a topic that puzzled me for some time. Built-in temperature sensors are useful for monitoring the health status of the device and taking proper action if, for instance, overheating occurs. For personal computing, lm-sensors is a frequently used tool for this purpose, which is able to read the sensors via SMBus on most motherboards. However, the MyBook has a different facility for this, so we will see how to take advantage of it. In fact, not only the on-board temperature sensor can be read, but the one on the hard drive can be used as well.

Reading sensor on the hard drive

This sensor can simply be reached through reading the S.M.A.R.T. data of the drive as I posted earlier. What we need to do is to grep the useful data and get rid of the garbage by selecting the line and the field we are interested in:

#~: smartctl -a -d ata /dev/sda | awk '/Temperature/ {print $10}'
44

will do the job, and we get the temperature of the hard drive in Celsius.
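One caveat: some drives list more than one attribute with "Temperature" in its name, in which case the one-liner prints several values. A possible way around this is sketched below. It assumes the drive reports its temperature as attribute 194, which is common but not guaranteed; hdd_temp_from_smart is a hypothetical helper name and the sample numbers are made up.

```shell
#!/bin/sh
# hdd_temp_from_smart is a hypothetical helper: it reads smartctl's attribute
# table on stdin and prints the raw value (10th field) of attribute 194 only.
hdd_temp_from_smart() {
    awk '$1 == 194 { print $10; exit }'
}

# Two sample lines in the column layout of smartctl's attribute table.
sample='190 Airflow_Temperature_Cel 0x0022   059   045   045    Old_age   Always       -       41
194 Temperature_Celsius     0x0022   106   097   000    Old_age   Always       -       44'

printf '%s\n' "$sample" | awk '/Temperature/ {print $10}'   # prints two lines
printf '%s\n' "$sample" | hdd_temp_from_smart               # prints one line
```

On the device, the input would come from 'smartctl -a -d ata /dev/sda' rather than from the sample variable.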

Reading the on-board temperature sensor

This one can be interrogated through:

#~: cat /sys/module/thermAndFan/parameters/current_temp
27

which is a reasonable value. If however we look further into the details:

#~: cat /sys/module/thermAndFan/parameters/hot_limit
16
#~: cat /sys/module/thermAndFan/parameters/cold_limit
104

this should give the clue that this is not a direct temperature reading. Indeed, this sensor does not read in Celsius but rather in "Counts", which are proportional to the sensor resistance. Luckily enough, Google finds the lookup table for this platform here (see the TvsCnt[] array). As the sensor is a semiconductor thermistor, there is no simple linear equation to determine the temperature; rather, the following exponential dependence should be used: Cnt=a×exp(b/T), with Cnt being the reading above and T being the absolute temperature in Kelvin. For this specific sensor, a simple fit to the lookup table gives the parameters a=1/9331 and b=3865. So after some algebra we find that the temperature will be: t=3865/(ln(Cnt)+9.1411)-273.16 in Celsius (feel free to do it in Fahrenheit :) ).
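For reference, the algebra written out step by step. Nothing new is assumed here; it is only the rearrangement of the fit equation with the parameters quoted above, using -ln(a) = ln(9331) ≈ 9.1411:

```latex
\begin{align*}
\mathrm{Cnt} &= a\, e^{b/T} \\
\ln \mathrm{Cnt} &= \ln a + \frac{b}{T} \\
T &= \frac{b}{\ln \mathrm{Cnt} - \ln a}
   = \frac{3865}{\ln \mathrm{Cnt} + \ln 9331} \\
t\,[^\circ\mathrm{C}] &= \frac{3865}{\ln \mathrm{Cnt} + 9.1411} - 273.16
\end{align*}
```

Since Cnt falls as T rises, a lower count means a higher temperature, which is why hot_limit is the smaller of the two limits.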

In order to implement this equation, we will use the command-line calc tool:
#~: ipkg install calc

Finally using this tool a simple script can be used to determine the temperature:

#!/bin/sh
Y1=`cat /sys/module/thermAndFan/parameters/current_temp `
C1='display(0);config("tilde","off");3865/(ln('
C2=')+9.1411)-273.16'
/opt/bin/calc $C1$Y1$C2 | awk 'NR==3 {print $1}'  

which results in a single return value, e.g. 37.
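If installing calc is not an option, the same formula can probably be evaluated with awk alone, since awk's log() is the natural logarithm. This is only a sketch of that alternative, not what I run on the device; count_to_celsius is a hypothetical helper name and it prints one decimal place.

```shell
#!/bin/sh
# count_to_celsius is a hypothetical helper: it applies
# t = 3865/(ln(Cnt)+9.1411)-273.16 to the count given as its first argument.
count_to_celsius() {
    awk -v cnt="$1" 'BEGIN { printf "%.1f\n", 3865 / (log(cnt) + 9.1411) - 273.16 }'
}

count_to_celsius 27    # the current_temp reading from above -> 37.6
count_to_celsius 16    # hot_limit  -> 51.3
count_to_celsius 104   # cold_limit -> 7.2
```

Run on the two limits, it puts hot_limit at roughly 51 °C and cold_limit at roughly 7 °C.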

Finally the question remains: Why would we do this? The reason is that these readings can be plotted using for instance MRTG, so that the health status of the storage can be checked easily.