Chapter 31. What once was new

Amanda Core Team

Original text
AMANDA Core Team

Stefan G. Weichinger

XML-conversion, Updates
AMANDA Core Team

Table of Contents

What's new in Amanda 2.3
Indexing backups for easier restore
Samba Support
GnuTar Support
Multiple backups in parallel from one client host
Multiple tapes in one run
Bottleneck determination
2 Gb limit removed
amadmin import/export
What's new in Amanda 2.2
Client side setup has changed
Version suffixes on executables
Kerberos
Multiple holding disks
Remote self-checks
mmap support
gzip support
Mount point names in disklist
Initial tape-changer support included
Generic tape changer wrapper script
New command amtape
Changer support added to command amlabel
Tape changer support improved
A few words about multi-tape runs
Big planner changes
Level-0 dumps allowed with no tape

What's new in Amanda 2.3

This document contains notes on new features in Amanda 2.3 that may not yet be fully documented elsewhere.

Indexing backups for easier restore

Read more about this in the file named Indexing with Amanda.

Samba Support

Read more about this in the file named Backup PC hosts using Samba.

GnuTar Support

Amanda now supports dumps made via GnuTAR. To use this, set your dumptypes set the program name to "GNUTAR":

dumptype tar-client {
....
program "GNUTAR"
}
	

Since Gnu TAR does not maintain a dumpdates file itself, nor give an estimate of backup size, those need to be done within Amanda. Amanda maintains an /etc/amandates file to track the backup dates analogously to how dump does it.

NOTE: if your /etc directory is not writable by your dumpuser, you'll have to create the empty file initially by hand, and make it writable by your dumpuser ala /etc/dumpdates.

NOTE: Since tar traverses the directory hierarchy and reads files as a regular user would, it must run as root. The two new Amanda programs calcsize and runtar therefore must be installed setuid root. I've made them as simple as possible to to avoid potential security holes.

Multiple backups in parallel from one client host

A new "maxdumps" parameter for the conf file gives the default value for the amount of parallelism per client:

maxdumps 2 # default max num. dumps to do in parallel per client
	

If this default parameter is not specified, the default for the default :-0 is 1. Then, you can override the parameter per client through the dumptype, eg:

dumptype fast-client {
....
maxdumps 4
}
	

If the "maxdumps" parameter isn't given in the dumptypes, the default is used. The idea is that maxdumps is set roughly proportional to the speed of the client host. You probably wont get much benefit from setting it very high, but all but the slowest hosts should be able to handle a maxdumps of at least 2.

Amanda doesn't really have any per-host parameters, just per-disk, so the per-client-host maxdumps is taken from the last disk listed for that host.

Just to make things more complicated, I've added the ability to specify a "spindle number" for each filesystem in the disklist file. For example:

        wiggum /        fast-comp-user  0
        wiggum /usr     fast-comp-user  0
        wiggum /larry   fast-comp-user  1
        wiggum /curly   fast-comp-user  1
        wiggum /moe     fast-comp-user  1
        wiggum /itchy   fast-comp-user  2
        wiggum /scratchy fast-comp-user 3
	

The spindle number represents the disk number, eg every filesystem on sd0 can get a spindle number of 0, everything on sd1 gets spindle 1, etc (but there's no enforced requirement that there be a match with the underlying hardware situation). Now, even with a high maxdumps, Amanda will refrain from scheduling two disks on the same spindle at the same time, which would just slow them both down by adding a lot of seeks.

The default spindle if you don't specify one is -1, which is defined to be a spindle that doesn't interfere with itself. That is if you don't specify any spindle numbers, any and all filesystems on the host can be scheduled concurrently up to the maxdumps.

Just to be clear, there's no relation between spindle numbers and maxdumps: number the spindles by the disks that you have, even if that's more than maxdumps.

Also, I'm not sure that putting spindle numbers everywhere is of much value: their purpose is to prevent multiple big dumps from being run at the same time on two partitions on the same disk, on the theory that the extra seeking between the partitions would cause the dumps to run slower than they would if they ran sequentially. But, given the client-side compression and network output that must occur between blocks read from the disk, there may be enough slack time at the disk to support the seeks and have a little parallelism left over to do some good.

Multiple tapes in one run

I've rewritten the taper - it now supports one run spanning multiple tapes if you have a tape-changer. The necessary changes in support of this have also been made to driver and reporter - planner already had support. There are a couple other places that should probably be updated, like amcheck. Dumps are not split across tapes - when taper runs into the end of a tape, it loads the next tape and tells driver to try sending the dump again.

If you are feeling brave, set "runtapes" to something other than 1.

The new taper also keeps the tape open the entire time it is writing the files out - no more having amchecks or other accesses/rewinds in the middle of the run screw you royally if they hit when the tape is closed for writing a filemark.

Bottleneck determination

I've made some experimental changes to driver to determine what the bottleneck is at any time. Since Amanda tries to do many things at once, it's hard to pinpoint a single bottleneck, but I "think" I've got it down well enough to say something useful. For now it just outputs the current bottleneck as part of its "driver: state" line in the debug output, but once I'm comfortable with its conclusions, I'll output it to the log file and have the reporter generate a nice table. The current choices are:

  • not-idle - if there were dumps to do, they got done
  • no-dumpers - there were dumps to do but no dumpers free
  • no-hold - there were dumps to do and dumpers free but the dumps
  • couldn't go to the holding disks (no-hold conf flag)
  • no-diskspace - no diskspace on holding disks
  • no-bandwidth - ran out of bandwidth
  • client-constrained - couldn't start any dumps because the clients were busy

2 Gb limit removed

I've fixed the 2-gig limits by representing sizes in Kbytes instead of bytes everywhere. This gives us a new 2 TB dump-file size limit (on 32bit machines), which should last us a couple more years. This seemed preferable to me than going to long-long or some other non-portable type for the size.

amadmin import/export

amadmin now has "import" and "export" commands, to convert the curinfo database to/from text format, for: moving an Amanda server to a different arch, compressing the database after deleting lots of hosts, or editing one or all entries in batch form or via a script.

What's new in Amanda 2.2

Note

Here's the old 2.2.x stuff from this file. I'm pretty sure most of this is in the mainline documentation already.

This document contains notes on new features in Amanda 2.2 that may not yet be fully documented elsewhere.

Client side setup has changed

The new /etc/services lines are:

amanda       10080/udp		# bsd security Amanda daemon
kamanda      10081/udp		# krb4 security Amanda daemon

The new /etc/inetd.conf lines are:

amanda  dgram udp wait /usr/local/libexec/amanda/amandad amandad
kamanda dgram udp wait /usr/local/libexec/amanda/amandad amandad -krb4

(you don't need the vanilla Amanda lines if you are using kerberos for everything, and vice-versa)

Version suffixes on executables

The new USE_VERSION_SUFFIXES define in options.h controls whether to install the Amanda executables with the version number attached to the name, eg "amdump-2.2.1". I recommend that you leave this defined, since this allows multiple versions to co-exist - particularly important while Amanda 2.2 is under development. You can always symlink the names without the version suffix to the version you want to be your "production" version.

Kerberos

Read the comments in Using Kerberos with Amanda for how to configure the kerberos version. With KRB4_SECURITY defined, there are two new dumptype options:

  • krb4-auth	# use krb4 auth for this host 
    		# (you can mingle krb hosts & bsd .rhosts in one conf)
    	
  • kencrypt	# encrypt this filesystem over the net using the krb4
    		# session key.  About 2x slower.  Good for those root
    		# partitions containing your keyfiles.  Don't want to
    		# give away the keys to an ethernet sniffer!
    	

Multiple holding disks

You can have more than one holding disk for those really big installations. Just add extra diskdir and disksize lines to your amanda.conf:

Note

sgw: This is OLD syntax now ...
	diskdir "/Amanda2/Amanda/work"  # where the holding disk is
	disksize 880 MB                 # how much space can we use on it

	diskdir "/dumps/Amanda/work"    # a second holding disk!
	disksize 1500 MB
	

Amanda will load-balance between the two disks as long as there is space. Amanda now also actually stats files to get a more accurate view of available and used disk space while running.

Remote self-checks

amcheck will now cause self-checks to run on the client hosts, quickly detecting which hosts are up and communicating, which have permissions problems, etc. This is amazingly fast for what it does: here it checks more than 130 hosts in less than a minute. My favorite gee-whiz new feature! The new -s and -c options control whether server-only or client-only checks are done.

mmap support

System V shared memory primitives are no longer required on the server side, if your system has a version of mmap() that will allocate anonymous memory. BSD 4.4 systems (and OSF/1) have an explicitly anonymous mmap() type, but others (like SunOS) support the trick of mmap'ing /dev/zero for the same effect. Amanda should work with both varieties.

Defined HAVE_SYSVSHM or HAVE_MMAP (or both) in config.h. If you have both, SYSVSHM is selected (simply because this code in Amanda is more mature, not because the sysv stuff is better).

gzip support

This was most requested feature #1; I've finally slipped it in. Define HAVE_GZIP in options.h. See options.h-vanilla for details. There are two new amanda.conf dumptype options "compress-fast" and "compress-best". The default is "compress-fast". With gzip, compress-fast seems to always do better than the old lzw compress (in particular it will never expand the file), and runs faster too. Gzip's compress-best does very good compression, but is about twice as slow as the old lzw compress, so you don't want to use it for filesystems that take a long time to full-dump anyway.

Mount point names in disklist

Most requested feature #2: You can specify mount names in the disklist instead of dev names. The rule is, if the filesystem name starts with a slash, it is a mount point name, if it doesn't, it is a dev name, and has DEVDIR prepended. For example:

	obelix	sd0a		# dev-name: /dev/sd0a
	obelix	/obelix		# mount name: /obelix, aka /dev/sd0g
	

Initial tape-changer support included

A new amanda.conf parameter, tpchanger, controls whether Amanda communicates with a tape changer program to load tapes rather than just opening the tapedev itself. The tpchanger parameter is a string which specifies the name of a program that follows the API specified in Amanda Tape Changer Support. Read that for more information.

Generic tape changer wrapper script

An initial tape-changer glue script, chg-generic.sh, implements the Amanda changer API using an array of tape devices to simulate a tape changer, with the device names specified via a conf file. This script can be quickly customized by inserting calls tape-changer-specific programs at appropriate places, making support for new changers painless. If you know what command to execute to get your changer to put a particular tape in the drive, you can get Amanda to support your changer.

The generic script works as-is for sites that want to cascade between two or more tape drives hooked directly up to the tape server host. It also should work as-is with tape-changer drivers that use separate device names to specify the slot to be loaded, wheres simply opening the slot device causes the tape from that slot to be loaded.

chg-generic has its own small conf file. See example/chg-generic.conf for a documented sample.

New command amtape

amtape is the user front-end to the Amanda tape changer support facilities. The operators can use amtape to load tapes for restores, position the changer, see what Amanda tapes are loaded in the tape rack, and see which tape would be picked by taper for the next amdump run.

No man page yet, but running amtape with no arguments gives a detailed usage statement. See Amanda Tape Changer Support for more info.

Note

sgw: The manpage exists now.

Changer support added to command amlabel

The amlabel command now takes an optional slot argument for labeling particular tapes in the tape rack. See Amanda Tape Changer Support for more info.

Tape changer support improved

The specs in Amanda Tape Changer Support have been updated, and the code changed to match. The major difference is that Amanda no longer assumes slots in the tape rack are numbered from 0 to N-1. They can be numbered or labeled in any manner that suits your tape-changer, Amanda doesn't care what the actual slot names are. Also added "first" and "last" slot specifiers, and an -eject command.

The chg-generic.sh tape changer script now has new "firstslot", "lastslot", and "needeject" parameters for the chg-generic.conf file. It now keeps track of whether the current slot is loaded into the drive, so that it can issue an explicit eject command for those tape changers that need one. See example/chg-generic.conf for more info.

A few words about multi-tape runs

I'm still holding back on support for multiple tapes in one run. I'm not yet completely happy with how Amanda should handle splitting dumps across tapes (eg when end-of-tape is encountered in the middle of a long dump). For example, this creates issues for amrestore, which currently doesn't know about configurations or tape changers --- on purpose, so that you can do restores on any machine with a tape drive, not just the server, and so that you can recover with no online databases present.

However, because the current snapshot DOES support tape changers, and multiple runs in one day, some of the benefit of multi-tape runs can be had by simply running Amanda several times in a row. Eg, to fill three tapes per night, you can put

amdump <conf>; amdump <conf>; amdump <conf>

in you crontab. On the down side, this will generate three reports instead of one, will do more incremental dumps than necessary, and will run slower. Not very satisfying, but if you need to fill more than one tape per day NOW, it should work.

Big planner changes

The support for writing to multiple tapes in one run is almost finished now. See Multitape support in Amanda 2.2 for an outline of the design. The planner support for this is included in this snapshot, but the taper part is not.

There is a new amanda.conf variable "runtapes" which specifies the number of tapes to use on each amdump run. For now this should stay at 1, the default. Also, the old "mincycle" and "maxcycle" amanda.conf variables are deprecated, but still work for now. "maxcycle" was never used, and "mincycle" is now called "dumpcycle".

There are two visible differences in the new planner: First, planner now thinks in real-time, rather than by the number of tapes as before. That is, a filesystem is due for a full backup once every <dumpcycle> days, regardless of how many times Amanda is run in that interval. As a consequence, you need to make sure the dumpcycle variable marks real time instead of the number of days. For example, previously "mincycle 10" worked for a two week cycle if you ran amdump only on weekdays (for 10 runs in a cycle). Now a two week cycle must be specified as "dumpcycle 14" or "dumpcycle 2 weeks". The "2 weeks" specifier works with both the old and new versions of planner, because previously "weeks" multiplied by 5, and now it multiplies by 7.

Second, planner now warns about impending overwrites of full backups. If a filesystem's last full backup is on a tape that is due to be overwritten in the next 5 runs, planner will give you a heads-up about it, so that you can restore the filesystem somewhere, or switch that tape out of rotation (substitute a new tape with the same label). This situation often occurs after a hardware failure brings a machine or disk down for some days.

Level-0 dumps allowed with no tape

If there is no tape present (or the tape drive fails during dumping), Amanda switches to degraded mode. In degraded mode, level-0 dumps are not allowed. This can be a pain for unattended sites over the weekend (especially when there is a large holding disk that can hold any necessary dumps). Amanda now supports a new configuration file directive, "reserve". This tells Amanda to reserve that percentage of total holding disk space for degraded mode dumps. Example: your total holding disk space adds up to 8.4GB. If you specify a reserve of 50, 4.2GB (50%) of the holding disk space will be allowed to be used for regular dumps, but if that limit is hit, Amanda will switch to degraded dumps. For backward compatibility, if no 'reserve' keyword is present, 100 will be assumed (e.g. never do full dumps if degraded mode is in effect).

Note

This percentage applies from run to run, so, as in the previous example, when Amanda runs the next day, if there is 3.8GB left on the holding disk, 1.9GB will be reserved for degraded mode dumps (e.g. the percentage keeps sliding).

Note

Refer to http://www.amanda.org/docs/whatwasnew.html for the current version of this document.