
 Detect failed drives

In our experience, when a drive is about to fail, error messages start to appear in /var/log/kern.log. The swift-drive-audit script can be run from cron to watch for bad drives. If it detects errors, it unmounts the bad drive so that Object Storage can work around it. The script takes a configuration file with the following settings:

Table 5.1. Description of configuration options for [drive-audit] in drive-audit.conf-sample
Configuration option = Default value                   Description
device_dir = /srv/node                                 Directory devices are mounted under
log_facility = LOG_LOCAL0                              Syslog log facility
log_level = INFO                                       Logging level
log_address = /dev/log                                 Location where syslog sends the logs to
minutes = 60                                           Number of minutes to look back in /var/log/kern.log
error_limit = 1                                        Number of errors to find before a device is unmounted
log_file_pattern = /var/log/kern*                      Glob pattern for the log files to check for device errors and to locate device blocks with errors
regex_pattern_1 = \berror\b.*\b(dm-[0-9]{1,2}\d?)\b    No help text available for this option.
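
As a rough illustration of how regex_pattern_1 and error_limit from the table above might interact, the following Python sketch reads a sample [drive-audit] section, counts matching log lines per device, and lists the devices that reach the limit. It is not the swift-drive-audit implementation: the helper names (load_settings, count_device_errors, devices_over_limit) and the kernel log lines are invented for the example, the comparison is assumed to be "count >= error_limit", and the real script also handles the look-back window, device-to-mount-point mapping, and unmounting, none of which is shown here.

```python
import configparser
import re
from collections import Counter

# Sample configuration using the defaults listed in the table above.
SAMPLE_CONF = r"""
[drive-audit]
device_dir = /srv/node
minutes = 60
error_limit = 1
log_file_pattern = /var/log/kern*
regex_pattern_1 = \berror\b.*\b(dm-[0-9]{1,2}\d?)\b
"""


def load_settings(text):
    """Parse the [drive-audit] section (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["drive-audit"]
    return section.getint("error_limit"), re.compile(section["regex_pattern_1"])


def count_device_errors(lines, pattern):
    """Count matching error lines per captured device name."""
    counts = Counter()
    for line in lines:
        match = pattern.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def devices_over_limit(counts, error_limit):
    """Return devices whose count has reached error_limit (assumed >=)."""
    return sorted(dev for dev, n in counts.items() if n >= error_limit)


# Invented kernel log lines, for illustration only.
sample_lines = [
    "Jan  1 00:00:01 node1 kernel: Buffer I/O error on device dm-3, logical block 0",
    "Jan  1 00:00:02 node1 kernel: EXT4-fs (sdb1): mounted filesystem",
    "Jan  1 00:00:03 node1 kernel: I/O error, dev dm-3, sector 2048",
    "Jan  1 00:00:04 node1 kernel: lost page write due to I/O error on dm-12",
]

error_limit, pattern = load_settings(SAMPLE_CONF)
counts = count_device_errors(sample_lines, pattern)
flagged = devices_over_limit(counts, error_limit)
print(dict(counts), flagged)
```

With the default error_limit of 1, a single matching line is enough to flag a device in this sketch, so raising the value is one way to make the audit less sensitive to isolated errors.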

This script has only been tested on Ubuntu 10.04. If you use a different distribution or operating system, test it carefully before using it in production.
