18.12 Backup Basics

The three major backup programs are dump(8), tar(1), and cpio(1).

18.12.1 Dump and Restore

The traditional UNIX® backup programs are dump and restore. They operate on the drive as a collection of disk blocks, below the abstractions of files, links and directories that are created by the file systems. Unlike other backup software, dump backs up an entire file system on a device. It is unable to backup only part of a file system or a directory tree that spans more than one file system. The dump command does not write files and directories to tape, but rather writes the raw data blocks that comprise files and directories. When being used to extract data, restore stores temporary files in /tmp/ by default -- if you are operating from a recovery disk with a small /tmp directory, you may need to set the TMPDIR environment variable to a directory with more free space for the restore to be successful.

Note: If you use dump on your root directory, you would not back up /home, /usr or many other directories since these are typically mount points for other file systems or symbolic links into those file systems.

dump has quirks that remain from its early days in Version 6 of AT&T UNIX (circa 1975). The default parameters are suitable for 9-track tapes (6250 bpi), not the high-density media available today (up to 62,182 ftpi). These defaults must be overridden on the command line to utilize the capacity of current tape drives.

It is also possible to backup data across the network to a tape drive attached to another computer with rdump and rrestore. Both programs rely upon rcmd(3) and ruserok(3) to access the remote tape drive. Therefore, the user performing the backup must be listed in the .rhosts file on the remote computer. The arguments to rdump and rrestore must be suitable to use on the remote computer. When rdumping from a FreeBSD computer to an Exabyte tape drive connected to a Sun called komodo, use:

# /sbin/rdump 0dsbfu 54000 13000 126 komodo:/dev/nsa8 /dev/da0a 2>&1

Beware: there are security implications to allowing .rhosts authentication. Evaluate your situation carefully.

It is also possible to use dump and restore in a more secure fashion over ssh.

Example 18-1. Using dump over ssh

# /sbin/dump -0uan -f - /usr | gzip -2 | ssh -c blowfish \
          [email protected] dd of=/mybigfiles/dump-usr-l0.gz

Or using dump's built-in method, setting the environment variable RSH:

Example 18-2. Using dump over ssh with RSH set

# RSH=/usr/bin/ssh /sbin/dump -0uan -f [email protected]:/dev/sa0 /usr

18.12.2 tar

tar(1) also dates back to Version 6 of AT&T UNIX (circa 1975). tar operates in cooperation with the file system; it writes files and directories to tape. tar does not support the full range of options that are available from cpio(1), but it does not require the unusual command pipeline that cpio uses.

To tar to an Exabyte tape drive connected to a Sun called komodo, use:

# tar cf - . | rsh komodo dd of=tape-device obs=20b

If you are worried about the security of backing up over a network you should use the ssh command instead of rsh.

18.12.3 cpio

cpio(1) is the original UNIX file interchange tape program for magnetic media. cpio has options (among many others) to perform byte-swapping, write a number of different archive formats, and pipe the data to other programs. This last feature makes cpio an excellent choice for installation media. cpio does not know how to walk the directory tree and a list of files must be provided through stdin.

cpio does not support backups across the network. You can use a pipeline and rsh to send the data to a remote tape drive.

# for f in directory_list; do
find $f >> backup.list
done
# cpio -v -o --format=newc < backup.list | ssh user@host "cat > backup_device"

Where directory_list is the list of directories you want to back up, user@host is the user/hostname combination that will be performing the backups, and backup_device is where the backups should be written to (e.g., /dev/nsa0).

18.12.4 pax

pax(1) is IEEE/POSIX®'s answer to tar and cpio. Over the years the various versions of tar and cpio have gotten slightly incompatible. So rather than fight it out to fully standardize them, POSIX created a new archive utility. pax attempts to read and write many of the various cpio and tar formats, plus new formats of its own. Its command set more resembles cpio than tar.

18.12.5 Amanda

Amanda (Advanced Maryland Network Disk Archiver) is a client/server backup system, rather than a single program. An Amanda server will backup to a single tape drive any number of computers that have Amanda clients and a network connection to the Amanda server. A common problem at sites with a number of large disks is that the length of time required to backup to data directly to tape exceeds the amount of time available for the task. Amanda solves this problem. Amanda can use a “holding disk” to backup several file systems at the same time. Amanda creates “archive sets”: a group of tapes used over a period of time to create full backups of all the file systems listed in Amanda's configuration file. The “archive set” also contains nightly incremental (or differential) backups of all the file systems. Restoring a damaged file system requires the most recent full backup and the incremental backups.

The configuration file provides fine control of backups and the network traffic that Amanda generates. Amanda will use any of the above backup programs to write the data to tape. Amanda is available as either a port or a package, it is not installed by default.

18.12.6 Do Nothing

“Do nothing” is not a computer program, but it is the most widely used backup strategy. There are no initial costs. There is no backup schedule to follow. Just say no. If something happens to your data, grin and bear it!

If your time and your data is worth little to nothing, then “Do nothing” is the most suitable backup program for your computer. But beware, UNIX is a useful tool, you may find that within six months you have a collection of files that are valuable to you.

“Do nothing” is the correct backup method for /usr/obj and other directory trees that can be exactly recreated by your computer. An example is the files that comprise the HTML or PostScript® version of this Handbook. These document formats have been created from SGML input files. Creating backups of the HTML or PostScript files is not necessary. The SGML files are backed up regularly.

18.12.7 Which Backup Program Is Best?

dump(8) Period. Elizabeth D. Zwicky torture tested all the backup programs discussed here. The clear choice for preserving all your data and all the peculiarities of UNIX file systems is dump. Elizabeth created file systems containing a large variety of unusual conditions (and some not so unusual ones) and tested each program by doing a backup and restore of those file systems. The peculiarities included: files with holes, files with holes and a block of nulls, files with funny characters in their names, unreadable and unwritable files, devices, files that change size during the backup, files that are created/deleted during the backup and more. She presented the results at LISA V in Oct. 1991. See torture-testing Backup and Archive Programs.

18.12.8 Emergency Restore Procedure

18.12.8.1 Before the Disaster

There are only four steps that you need to perform in preparation for any disaster that may occur.

First, print the bsdlabel from each of your disks (e.g. bsdlabel da0 | lpr), your file system table (/etc/fstab) and all boot messages, two copies of each.

Second, burn a “livefs” CDROM. This CDROM contains support for booting into a FreeBSD “livefs” rescue mode allowing the user to perform many tasks like running dump(8), restore(8), fdisk(8), bsdlabel(8), newfs(8), mount(8), and more. Livefs CD image for FreeBSD/i386 8.1-RELEASE is available from ftp://ftp.FreeBSD.org/pub/FreeBSD/releases/i386/ISO-IMAGES/8.1/FreeBSD-8.1-RELEASE-i386-livefs.iso.

Third, create backup tapes regularly. Any changes that you make after your last backup may be irretrievably lost. Write-protect the backup tapes.

Fourth, test the “livefs” CDROM you made in step two and backup tapes. Make notes of the procedure. Store these notes with the CDROM, the printouts and the backup tapes. You will be so distraught when restoring that the notes may prevent you from destroying your backup tapes (How? In place of tar xvf /dev/sa0, you might accidentally type tar cvf /dev/sa0 and over-write your backup tape).

For an added measure of security, make “livefs” CDROM and two backup tapes each time. Store one of each at a remote location. A remote location is NOT the basement of the same office building. A number of firms in the World Trade Center learned this lesson the hard way. A remote location should be physically separated from your computers and disk drives by a significant distance.

18.12.8.2 After the Disaster

The key question is: did your hardware survive? You have been doing regular backups so there is no need to worry about the software.

If the hardware has been damaged, the parts should be replaced before attempting to use the computer.

If your hardware is okay, insert the “livefs” CDROM in the CDROM drive and boot the computer. The original install menu will be displayed on the screen. Select the correct country, then choose Fixit -- Repair mode with CDROM/DVD/floppy or start a shell. option and select the CDROM/DVD -- Use the live filesystem CDROM/DVD item. The tool restore and the other programs that you need are located in /mnt2/rescue.

Recover each file system separately.

Try to mount (e.g. mount /dev/da0a /mnt) the root partition of your first disk. If the bsdlabel was damaged, use bsdlabel to re-partition and label the disk to match the label that you printed and saved. Use newfs to re-create the file systems. Re-mount the root partition of the disk read-write (mount -u -o rw /mnt). Use your backup program and backup tapes to recover the data for this file system (e.g. restore vrf /dev/sa0). Unmount the file system (e.g. umount /mnt). Repeat for each file system that was damaged.

Once your system is running, backup your data onto new tapes. Whatever caused the crash or data loss may strike again. Another hour spent now may save you from further distress later.