Table of Contents
disklist
This document contains notes on new features in Amanda 2.3 that may not yet be fully documented elsewhere.
Read more about this in the file named Indexing with Amanda.
Read more about this in the file named Backup PC hosts using Samba.
Amanda now supports dumps made via GnuTAR. To use this, set your dumptypes set the program name to "GNUTAR":
dumptype tar-client { .... program "GNUTAR" }
Since Gnu TAR does not maintain a dumpdates file itself, nor give an
estimate of backup size, those need to be done within Amanda. Amanda
maintains an /etc/amandates
file to track the backup dates analogously to
how dump does it.
NOTE: if your /etc directory is not writable by your dumpuser, you'll have
to create the empty file initially by hand, and make it writable by your
dumpuser ala /etc/dumpdates
.
NOTE: Since tar traverses the directory hierarchy and reads files as a regular user would, it must run as root. The two new Amanda programs calcsize and runtar therefore must be installed setuid root. I've made them as simple as possible to to avoid potential security holes.
A new "maxdumps" parameter for the conf file gives the default value for the amount of parallelism per client:
maxdumps 2 # default max num. dumps to do in parallel per client
If this default parameter is not specified, the default for the default :-0 is 1. Then, you can override the parameter per client through the dumptype, eg:
dumptype fast-client { .... maxdumps 4 }
If the "maxdumps" parameter isn't given in the dumptypes, the default is used. The idea is that maxdumps is set roughly proportional to the speed of the client host. You probably wont get much benefit from setting it very high, but all but the slowest hosts should be able to handle a maxdumps of at least 2.
Amanda doesn't really have any per-host parameters, just per-disk, so the per-client-host maxdumps is taken from the last disk listed for that host.
Just to make things more complicated, I've added the ability to specify a
"spindle number" for each filesystem in the disklist
file. For example:
wiggum / fast-comp-user 0 wiggum /usr fast-comp-user 0 wiggum /larry fast-comp-user 1 wiggum /curly fast-comp-user 1 wiggum /moe fast-comp-user 1 wiggum /itchy fast-comp-user 2 wiggum /scratchy fast-comp-user 3
The spindle number represents the disk number, eg every filesystem on sd0 can get a spindle number of 0, everything on sd1 gets spindle 1, etc (but there's no enforced requirement that there be a match with the underlying hardware situation). Now, even with a high maxdumps, Amanda will refrain from scheduling two disks on the same spindle at the same time, which would just slow them both down by adding a lot of seeks.
The default spindle if you don't specify one is -1, which is defined to be a spindle that doesn't interfere with itself. That is if you don't specify any spindle numbers, any and all filesystems on the host can be scheduled concurrently up to the maxdumps.
Just to be clear, there's no relation between spindle numbers and maxdumps: number the spindles by the disks that you have, even if that's more than maxdumps.
Also, I'm not sure that putting spindle numbers everywhere is of much value: their purpose is to prevent multiple big dumps from being run at the same time on two partitions on the same disk, on the theory that the extra seeking between the partitions would cause the dumps to run slower than they would if they ran sequentially. But, given the client-side compression and network output that must occur between blocks read from the disk, there may be enough slack time at the disk to support the seeks and have a little parallelism left over to do some good.
I've rewritten the taper - it now supports one run spanning multiple tapes if you have a tape-changer. The necessary changes in support of this have also been made to driver and reporter - planner already had support. There are a couple other places that should probably be updated, like amcheck. Dumps are not split across tapes - when taper runs into the end of a tape, it loads the next tape and tells driver to try sending the dump again.
If you are feeling brave, set "runtapes" to something other than 1.
The new taper also keeps the tape open the entire time it is writing the files out - no more having amchecks or other accesses/rewinds in the middle of the run screw you royally if they hit when the tape is closed for writing a filemark.
I've made some experimental changes to driver to determine what the bottleneck is at any time. Since Amanda tries to do many things at once, it's hard to pinpoint a single bottleneck, but I "think" I've got it down well enough to say something useful. For now it just outputs the current bottleneck as part of its "driver: state" line in the debug output, but once I'm comfortable with its conclusions, I'll output it to the log file and have the reporter generate a nice table. The current choices are:
I've fixed the 2-gig limits by representing sizes in Kbytes instead of bytes everywhere. This gives us a new 2 TB dump-file size limit (on 32bit machines), which should last us a couple more years. This seemed preferable to me than going to long-long or some other non-portable type for the size.
This document contains notes on new features in Amanda 2.2 that may not yet be fully documented elsewhere.
The new /etc/services
lines are:
amanda 10080/udp # bsd security Amanda daemon kamanda 10081/udp # krb4 security Amanda daemon
The new /etc/inetd.conf
lines are:
amanda dgram udp wait /usr/local/libexec/amanda/amandad amandad kamanda dgram udp wait /usr/local/libexec/amanda/amandad amandad -krb4
(you don't need the vanilla Amanda lines if you are using kerberos for everything, and vice-versa)
The new USE_VERSION_SUFFIXES define in options.h controls whether to install the Amanda executables with the version number attached to the name, eg "amdump-2.2.1". I recommend that you leave this defined, since this allows multiple versions to co-exist - particularly important while Amanda 2.2 is under development. You can always symlink the names without the version suffix to the version you want to be your "production" version.
Read the comments in Using Kerberos with Amanda for how to configure the kerberos version. With KRB4_SECURITY defined, there are two new dumptype options:
krb4-auth # use krb4 auth for this host # (you can mingle krb hosts & bsd .rhosts in one conf)
kencrypt # encrypt this filesystem over the net using the krb4 # session key. About 2x slower. Good for those root # partitions containing your keyfiles. Don't want to # give away the keys to an ethernet sniffer!
You can have more than one holding disk for those really big installations.
Just add extra diskdir and disksize lines to your amanda.conf
:
diskdir "/Amanda2/Amanda/work" # where the holding disk is disksize 880 MB # how much space can we use on it diskdir "/dumps/Amanda/work" # a second holding disk! disksize 1500 MB
Amanda will load-balance between the two disks as long as there is space. Amanda now also actually stats files to get a more accurate view of available and used disk space while running.
amcheck will now cause self-checks to run on the client hosts, quickly detecting which hosts are up and communicating, which have permissions problems, etc. This is amazingly fast for what it does: here it checks more than 130 hosts in less than a minute. My favorite gee-whiz new feature! The new -s and -c options control whether server-only or client-only checks are done.
System V shared memory primitives are no longer required on the server side, if your system has a version of mmap() that will allocate anonymous memory. BSD 4.4 systems (and OSF/1) have an explicitly anonymous mmap() type, but others (like SunOS) support the trick of mmap'ing /dev/zero for the same effect. Amanda should work with both varieties.
Defined HAVE_SYSVSHM or HAVE_MMAP (or both) in config.h. If you have both, SYSVSHM is selected (simply because this code in Amanda is more mature, not because the sysv stuff is better).
This was most requested feature #1; I've finally slipped it in. Define
HAVE_GZIP in options.h. See options.h-vanilla for details. There are two
new amanda.conf
dumptype options "compress-fast" and "compress-best".
The default is "compress-fast". With gzip, compress-fast seems to always do
better than the old lzw compress (in particular it will never expand the
file), and runs faster too. Gzip's compress-best does very good
compression, but is about twice as slow as the old lzw compress, so you
don't want to use it for filesystems that take a long time to full-dump
anyway.
Most requested feature #2: You can specify mount names in the disklist
instead of dev names. The rule is, if the filesystem name starts with a
slash, it is a mount point name, if it doesn't, it is a dev name, and has
DEVDIR prepended. For example:
obelix sd0a # dev-name: /dev/sd0a obelix /obelix # mount name: /obelix, aka /dev/sd0g
A new amanda.conf
parameter, tpchanger, controls whether Amanda
communicates with a tape changer program to load tapes rather than
just opening the tapedev itself. The tpchanger parameter is a string
which specifies the name of a program that follows the API specified
in Amanda Tape Changer Support.
Read that for more information.
An initial tape-changer glue script, chg-generic.sh, implements the Amanda changer API using an array of tape devices to simulate a tape changer, with the device names specified via a conf file. This script can be quickly customized by inserting calls tape-changer-specific programs at appropriate places, making support for new changers painless. If you know what command to execute to get your changer to put a particular tape in the drive, you can get Amanda to support your changer.
The generic script works as-is for sites that want to cascade between two or more tape drives hooked directly up to the tape server host. It also should work as-is with tape-changer drivers that use separate device names to specify the slot to be loaded, wheres simply opening the slot device causes the tape from that slot to be loaded.
chg-generic has its own small conf file. See example/chg-generic.conf for a documented sample.
amtape is the user front-end to the Amanda tape changer support facilities. The operators can use amtape to load tapes for restores, position the changer, see what Amanda tapes are loaded in the tape rack, and see which tape would be picked by taper for the next amdump run.
No man page yet, but running amtape with no arguments gives a detailed usage statement. See Amanda Tape Changer Support for more info.
The amlabel command now takes an optional slot argument for labeling particular tapes in the tape rack. See Amanda Tape Changer Support for more info.
The specs in Amanda Tape Changer Support have been updated, and the code changed to match. The major difference is that Amanda no longer assumes slots in the tape rack are numbered from 0 to N-1. They can be numbered or labeled in any manner that suits your tape-changer, Amanda doesn't care what the actual slot names are. Also added "first" and "last" slot specifiers, and an -eject command.
The chg-generic.sh tape changer script now has new "firstslot", "lastslot", and "needeject" parameters for the chg-generic.conf file. It now keeps track of whether the current slot is loaded into the drive, so that it can issue an explicit eject command for those tape changers that need one. See example/chg-generic.conf for more info.
I'm still holding back on support for multiple tapes in one run. I'm not yet completely happy with how Amanda should handle splitting dumps across tapes (eg when end-of-tape is encountered in the middle of a long dump). For example, this creates issues for amrestore, which currently doesn't know about configurations or tape changers --- on purpose, so that you can do restores on any machine with a tape drive, not just the server, and so that you can recover with no online databases present.
However, because the current snapshot DOES support tape changers, and multiple runs in one day, some of the benefit of multi-tape runs can be had by simply running Amanda several times in a row. Eg, to fill three tapes per night, you can put
amdump <conf>; amdump <conf>; amdump <conf>
in you crontab. On the down side, this will generate three reports instead of one, will do more incremental dumps than necessary, and will run slower. Not very satisfying, but if you need to fill more than one tape per day NOW, it should work.
The support for writing to multiple tapes in one run is almost finished now. See Multitape support in Amanda 2.2 for an outline of the design. The planner support for this is included in this snapshot, but the taper part is not.
There is a new amanda.conf
variable "runtapes" which specifies the number
of tapes to use on each amdump run. For now this should stay at 1, the
default. Also, the old "mincycle" and "maxcycle" amanda.conf
variables are deprecated, but still work for now.
"maxcycle" was never used, and
"mincycle" is now called "dumpcycle".
There are two visible differences in the new planner: First, planner now thinks in real-time, rather than by the number of tapes as before. That is, a filesystem is due for a full backup once every <dumpcycle> days, regardless of how many times Amanda is run in that interval. As a consequence, you need to make sure the dumpcycle variable marks real time instead of the number of days. For example, previously "mincycle 10" worked for a two week cycle if you ran amdump only on weekdays (for 10 runs in a cycle). Now a two week cycle must be specified as "dumpcycle 14" or "dumpcycle 2 weeks". The "2 weeks" specifier works with both the old and new versions of planner, because previously "weeks" multiplied by 5, and now it multiplies by 7.
Second, planner now warns about impending overwrites of full backups. If a filesystem's last full backup is on a tape that is due to be overwritten in the next 5 runs, planner will give you a heads-up about it, so that you can restore the filesystem somewhere, or switch that tape out of rotation (substitute a new tape with the same label). This situation often occurs after a hardware failure brings a machine or disk down for some days.
If there is no tape present (or the tape drive fails during dumping), Amanda switches to degraded mode. In degraded mode, level-0 dumps are not allowed. This can be a pain for unattended sites over the weekend (especially when there is a large holding disk that can hold any necessary dumps). Amanda now supports a new configuration file directive, "reserve". This tells Amanda to reserve that percentage of total holding disk space for degraded mode dumps. Example: your total holding disk space adds up to 8.4GB. If you specify a reserve of 50, 4.2GB (50%) of the holding disk space will be allowed to be used for regular dumps, but if that limit is hit, Amanda will switch to degraded dumps. For backward compatibility, if no 'reserve' keyword is present, 100 will be assumed (e.g. never do full dumps if degraded mode is in effect).