Scrub your zfs file systems regularly

Of the many great features introduced with Solaris 10, zfs is probably one of the greatest. Once you have learned to use it, you will wonder why it took so long for someone to figure it out.

zfs provides end-to-end checksums of all data stored in the filesystem. It also provides a command to verify that all the data still matches its checksums: run zpool with the scrub argument.
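
If you have never run a scrub by hand, a minimal sketch looks something like this. The function name scrub_and_show and the pool name tank are made up for illustration; only the scrub and status subcommands come from zpool itself.

```shell
# Path to zpool on Solaris; override ZPOOL if it lives elsewhere on your system.
ZPOOL=${ZPOOL:-/usr/sbin/zpool}

# Start a scrub on one pool, then print the pool's status.
# "zpool scrub" returns right away; the scrub itself runs in the background,
# so the status printed here normally shows a scrub that has only just started.
scrub_and_show() {
	"$ZPOOL" scrub "$1" && "$ZPOOL" status "$1"
}

# Usage (hypothetical pool name):
#   scrub_and_show tank
```

Run zpool status again later to see whether the scrub has finished and what it found.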

You may ask why you need to run a scrub at all. zfs verifies a checksum only when the corresponding block is read, so data that is rarely read can go bad without anyone noticing. If you don’t scrub regularly, you may not detect that your disks are slowly turning bad until it is too late. The problem is that, by default, nothing runs a scrub for you on a schedule.

Below is a simple script I have written that automatically runs a scrub on every pool available on the system. The great thing is that you can just drop this script on the server and put it in cron to run regularly (I run it every Monday morning at 1am); if it finds a problem, you will get an email.
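
For the Monday 1am schedule, the crontab line could look roughly like the one built below. The install path /usr/local/bin/scrub.sh is an assumption; use wherever you saved the script, and make sure the file is executable.

```shell
# Hypothetical install location of the scrub script.
SCRIPT=/usr/local/bin/scrub.sh

# Crontab fields: minute hour day-of-month month day-of-week command.
# Day-of-week 1 is Monday, so this means 01:00 every Monday.
CRON_LINE="0 1 * * 1 $SCRIPT"

# Print the line so it can be pasted into "crontab -e" (as root).
echo "$CRON_LINE"
```

Cron already mails any output a job produces to the owner of the crontab, but the script sends its own mail so the report can go to whichever address you choose.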

#!/bin/bash
#
# This script goes through all pools and scrubs them one at a time.
# It needs bash or ksh: the classic Solaris /bin/sh has neither $RANDOM nor "if !".
#
MAILRECIPIENT=nickus@aspiringsysadmin.com

ZPOOL=/usr/sbin/zpool
TMPFILE=/tmp/scrub.sh.$$.$RANDOM

# Succeeds (returns 0) while a scrub is still running on pool $1.
scrub_in_progress() {
	if $ZPOOL status "$1" | grep "scrub in progress" >/dev/null; then
		return 0
	else
		return 1
	fi
}

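# -H drops the header line and -o name prints only the pool names.
$ZPOOL list -H -o name | while read pool; do
	# zpool scrub returns immediately; the scrub itself runs in the background.
	$ZPOOL scrub "$pool"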

	while sleep 60; do
		if ! scrub_in_progress $pool; then
			break
		fi
	done

	# Collect the full status of any pool that did not finish with 0 errors.
	if ! $ZPOOL status "$pool" | grep "with 0 errors" >/dev/null; then
		$ZPOOL status "$pool" >>$TMPFILE
	fi
done

# Mail the collected status output, if there is any.
if [ -s $TMPFILE ]; then
	mailx -s "zpool scrub on `hostname` generated errors" $MAILRECIPIENT <$TMPFILE
fi

rm -f $TMPFILE

Have a look at the man page for zpool(1M) for additional information.
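
If you want to try the script's two checks by hand before trusting cron with them, they can be folded into one small helper. This is only a sketch: the name scrub_state and its three output words are made up here, and it relies on the same two substrings the script greps for, so it inherits the assumption that your version of zpool status prints them.

```shell
# Classify "zpool status" text read from stdin, using the same two
# substrings the script above searches for.
scrub_state() {
	status=$(cat)
	case $status in
		*"scrub in progress"*) echo running ;;  # a scrub is still underway
		*"with 0 errors"*)     echo clean ;;    # last scrub finished without errors
		*)                     echo check ;;    # anything else deserves a look
	esac
}

# Usage (hypothetical pool name):
#   /usr/sbin/zpool status tank | scrub_state
```

Note that a pool that has never been scrubbed also lands in the "check" bucket, since its status contains neither string.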


2 Responses to “Scrub your zfs file systems regularly”

  1. Edward O'Callaghan on July 10th, 2007 at 4:39 pm

    Thank you,
    I will use that on my Nexenta CP box ;)

    I am planning to do a OpenSolaris Magazine at moonshine.opn4.org , let me know if you would be interested to add a block each issue of things like this.. ?

    Regards,
    Edward.

  2. Your code:

    while sleep 60; do
        if ! scrub_in_progress $pool; then
            break
        fi
    done

    This might be cleaner and more readable:

    while scrub_in_progress $pool; do sleep 60; done;
