This chapter presents most of the features needed for Volume management. Most of the concepts apply equally well to tape and disk Volumes. The chapter was originally written to explain backing up to disk, so it is slanted in that direction, but all the directives presented here apply whether your Volume is disk or tape.
If you have a lot of hard disk storage or you absolutely must have your backups run within a small time window, you may want to direct Bacula to back up to disk Volumes rather than tape Volumes. This chapter is intended to give you some of the options that are available to you so that you can manage either disk or tape Volumes.
Getting Bacula to write to disk rather than tape in the simplest case is rather easy. In the Storage daemon's configuration file, you simply define an Archive Device to be a directory. For example, if you want your disk backups to go into the directory /home/bacula/backups, you could use the following:
Device {
  Name = FileBackup
  Media Type = File
  Archive Device = /home/bacula/backups
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}
Assuming you have the appropriate Storage resource in your Director's configuration file that references the above Device resource,
Storage {
  Name = FileStorage
  Address = ...
  Password = ...
  Device = FileBackup
  Media Type = File
}
Bacula will then write the archive to the file /home/bacula/backups/<volume-name> where <volume-name> is the volume name of a Volume defined in the Pool. For example, if you have labeled a Volume named Vol001, Bacula will write to the file /home/bacula/backups/Vol001. Although you can later move the archive file to another directory, you should not rename it or it will become unreadable by Bacula. This is because each archive has the filename as part of the internal label, and the internal label must agree with the system filename before Bacula will use it.
Although this is quite simple, there are a number of problems. The first is that unless you specify otherwise, Bacula will always write to the same volume until you run out of disk space. This problem is addressed below.
In addition, if you want to use concurrent jobs that write to several different volumes at the same time, you will need to understand a number of other details. An example of such a configuration is given at the end of this chapter under Concurrent Disk Jobs.
Some of the options you have, all of which are specified in the Pool record, are:
UseVolumeOnce = yes.
Maximum Volume Jobs = nnn.
Maximum Volume Bytes = mmmm.
Note, if you use disk volumes, with all versions up to and including 1.39.28, you should probably limit the Volume size to some reasonable value such as say 5GB. This is because during a restore, Bacula is currently unable to seek to the proper place in a disk volume to restore a file, which means that it must read all records up to where the restore begins. If your Volumes are 50GB, reading half or more of the volume could take quite a bit of time. Also, if you ever have a partial hard disk failure, you are more likely to be able to recover more data if they are in smaller Volumes.
Volume Use Duration = ttt.
Note that although you probably would not want to limit the number of bytes on a tape as you would on a disk Volume, the other options can be very useful in limiting the time Bacula will use a particular Volume (be it tape or disk). For example, the above directives can allow you to ensure that you rotate through a set of daily Volumes if you wish.
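As a rough sketch of how these options might be combined, the following Pool resource limits both the size and the lifetime of each Volume. The Pool name and the 23-hour duration are illustrative assumptions; the 5GB limit is the size suggested in the note above.

```conf
Pool {
  Name = File
  Pool Type = Backup
  # Stop writing to a Volume once it holds about 5 GB
  Maximum Volume Bytes = 5G
  # Also stop writing to a Volume 23 hours after its first write
  Volume Use Duration = 23h
}
```

Whichever limit is reached first ends the use of the Volume, so a small daily backup would still get a fresh Volume each day.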
As mentioned above, each of those directives is specified in the Pool or Pools that you use for your Volumes. In the case of Maximum Volume Jobs, Maximum Volume Bytes, and Volume Use Duration, you can actually specify the desired value on a Volume by Volume basis. The value specified in the Pool record becomes the default when labeling new Volumes. Once a Volume has been created, it gets its own copy of the Pool defaults, and subsequently changing the Pool will have no effect on existing Volumes. You can either manually change the Volume values, or refresh them from the Pool defaults using the update volume command in the Console. As an example of the use of one of the above, suppose your Pool resource contains:
Pool {
  Name = File
  Pool Type = Backup
  Volume Use Duration = 23h
}
then if you run a backup once a day (every 24 hours), Bacula will use a new Volume for each backup, because each Volume it writes can only be used for 23 hours after the first write. Note, setting the use duration to 23 hours is not a very good solution for tapes unless you have someone on-site during the weekends, because Bacula will want a new Volume and no one will be present to mount it, so no weekend backups will be done until Monday morning.
Use of the above records brings up another problem: labeling your Volumes. For automated disk backup, you can either manually label each of your Volumes, or you can have Bacula automatically label new Volumes when they are needed. The automatic Volume labeling in version 1.30 and prior is a bit simplistic, although it does allow for automation. The features added in version 1.31 permit automatic creation of a wide variety of labels, including information from environment variables and special Bacula Counter variables. In version 1.37 and later, it is probably much better to use Python scripting and the NewVolume event, since generating Volume labels in a Python script is much easier than trying to figure out Counter variables. See the Python Scripting chapter of this manual for more details.
Please note that automatic Volume labeling can also be used with tapes, but it is not nearly so practical since the tapes must be pre-mounted. This requires some user interaction. Automatic labeling from templates does NOT work with autochangers since Bacula will not access unknown slots. There are several methods of labeling all volumes in an autochanger magazine. For more information on this, please see the Autochanger chapter of this manual.
Automatic Volume labeling is enabled by making a change to both the Pool resource (Director) and to the Device resource (Storage daemon) shown above. In the case of the Pool resource, you must provide Bacula with a label format that it will use to create new names. In the simplest form, the label format is simply the Volume name, to which Bacula will append a four digit number. This number starts at 0001 and is incremented for each Volume the catalog contains. Thus if you modify your Pool resource to be:
Pool {
  Name = File
  Pool Type = Backup
  Volume Use Duration = 23h
  LabelFormat = "Vol"
}
Bacula will create Volume names Vol0001, Vol0002, and so on when new Volumes are needed. Much more complex and elaborate labels can be created using variable expansion defined in the Variable Expansion chapter of this manual.
The second change that is necessary to make automatic labeling work is to give the Storage daemon permission to automatically label Volumes. Do so by adding LabelMedia = yes to the Device resource as follows:
Device {
  Name = File
  Media Type = File
  Archive Device = /home/bacula/backups
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
  LabelMedia = yes
}
You can find more details of the Label Format Pool record in the Label Format description of the Pool resource records.
Automatic labeling discussed above brings up the problem of Volume management. With the above scheme, a new Volume will be created every day. If you have not specified Retention periods, your Catalog will continue to fill keeping track of all the files Bacula has backed up, and this procedure will create one new archive file (Volume) every day.
The tools Bacula gives you to help automatically manage these problems are the following:
The first three records (File Retention, Job Retention, and AutoPrune) determine the amount of time that Job and File records will remain in your Catalog, and they are discussed in detail in the Automatic Volume Recycling chapter of this manual.
Volume Retention, AutoPrune, and Recycle determine how long Bacula will keep your Volumes before reusing them, and they are also discussed in detail in the Automatic Volume Recycling chapter of this manual.
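As an illustration of the catalog retention records, a Client resource in the Director's configuration file might look like the sketch below. The client name, the catalog name, and the retention values are assumptions chosen for the example; the Address and Password are left elided.

```conf
Client {
  Name = client1
  Address = ...
  Catalog = BackupDB
  Password = ...
  # Keep per-file catalog records for 30 days
  File Retention = 30 days
  # Keep the Job records themselves for 6 months
  Job Retention = 6 months
  # Prune expired File and Job records automatically
  AutoPrune = yes
}
```

A File Retention shorter than the Job Retention keeps the catalog small while still leaving a record that each Job ran.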
The Maximum Volumes record can also be used in conjunction with the Volume Retention period to limit the total number of archive Volumes (files) that Bacula will create. By setting an appropriate Volume Retention period, a Volume will be purged just before it is needed and thus Bacula can cycle through a fixed set of Volumes. Cycling through a fixed set of Volumes can also be done by setting Recycle Oldest Volume = yes or Recycle Current Volume = yes. In this case, when Bacula needs a new Volume, it will prune the specified volume.
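A minimal sketch of such a fixed cycle, assuming one backup per day and a Pool limited to seven Volumes, is shown below. The Pool name, label prefix, and values are illustrative assumptions and should be adjusted to your own schedule.

```conf
Pool {
  Name = File
  Pool Type = Backup
  LabelFormat = "Vol"
  # One Volume per daily backup
  Volume Use Duration = 23h
  # Never create more than 7 Volumes in this Pool
  Maximum Volumes = 7
  # A Volume may be pruned 6 days after it was last written
  Volume Retention = 6d
  AutoPrune = yes
  Recycle = yes
}
```

With these values, when the eighth backup needs a Volume and no new one may be created, the oldest Volume has passed its retention period and can be pruned and reused.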
Now suppose you want to use multiple Pools, which means multiple Volumes, or suppose you want each client to have its own Volume and perhaps its own directory such as /home/bacula/client1 and /home/bacula/client2 ... With the single Storage and Device definition above, neither of these two is possible. Why? Because Bacula disk storage follows the same rules as tape devices. Only one Volume can be mounted on any Device at any time. If you want to simultaneously write multiple Volumes, you will need multiple Device resources in your bacula-sd.conf file, and thus multiple Storage resources in your bacula-dir.conf.
OK, so now you should understand that you need multiple Device definitions in the case of different directories or different Pools, but you also need to know that the catalog data that Bacula keeps contains only the Media Type and not the specific storage device. This permits a tape, for example, to be re-read on any compatible tape drive, with compatibility being determined by the Media Type. The same applies to disk storage. Since a Volume that is written by a Device in, say, directory /home/bacula/backups cannot be read by a Device with an Archive Device definition of /home/bacula/client1, you will not be able to restore all your files if you give both those devices Media Type = File. During the restore, Bacula will simply choose the first available device, which may not be the correct one. If this is confusing, just remember that the Director has only the Media Type and the Volume name. It does not know the Archive Device (or the full path) that is specified in the Storage daemon. Thus you must explicitly tie your Volumes to the correct Device by using the Media Type.
The example shown below shows a case where there are two clients, each using its own Pool and storing their Volumes in different directories.
The following example is not very practical, but can be used to demonstrate the proof of concept in a relatively short period of time. The example consists of two clients that are backed up to a set of 12 archive files (Volumes) for each client into different directories on the Storage machine. Each Volume is used (written) only once, and there are four Full saves done every hour (so the whole thing cycles around after three hours).
What is key here is that each physical device on the Storage daemon has a different Media Type. This allows the Director to choose the correct device for restores ...
The Director's configuration file is as follows:
Director {
  Name = my-dir
  QueryFile = "~/bacula/bin/query.sql"
  PidDirectory = "~/bacula/working"
  WorkingDirectory = "~/bacula/working"
  Password = dir_password
}

Schedule {
  Name = "FourPerHour"
  Run = Level=Full hourly at 0:05
  Run = Level=Full hourly at 0:20
  Run = Level=Full hourly at 0:35
  Run = Level=Full hourly at 0:50
}

Job {
  Name = "RecycleExample"
  Type = Backup
  Level = Full
  Client = Rufus
  FileSet= "Example FileSet"
  Messages = Standard
  Storage = FileStorage
  Pool = Recycle
  Schedule = FourPerHour
}

Job {
  Name = "RecycleExample2"
  Type = Backup
  Level = Full
  Client = Roxie
  FileSet= "Example FileSet"
  Messages = Standard
  Storage = FileStorage1
  Pool = Recycle1
  Schedule = FourPerHour
}

FileSet {
  Name = "Example FileSet"
  Include {
    Options {
      compression=GZIP
      signature=SHA1
    }
    File = /home/kern/bacula/bin
  }
}

Client {
  Name = Rufus
  Address = rufus
  Catalog = BackupDB
  Password = client_password
}

Client {
  Name = Roxie
  Address = roxie
  Catalog = BackupDB
  Password = client1_password
}

Storage {
  Name = FileStorage
  Address = rufus
  Password = local_storage_password
  Device = RecycleDir
  Media Type = File
}

Storage {
  Name = FileStorage1
  Address = rufus
  Password = local_storage_password
  Device = RecycleDir1
  Media Type = File1
}

Catalog {
  Name = BackupDB
  dbname = bacula; user = bacula; password = ""
}

Messages {
  Name = Standard
  ...
}

Pool {
  Name = Recycle
  Use Volume Once = yes
  Pool Type = Backup
  LabelFormat = "Recycle-"
  AutoPrune = yes
  VolumeRetention = 2h
  Maximum Volumes = 12
  Recycle = yes
}

Pool {
  Name = Recycle1
  Use Volume Once = yes
  Pool Type = Backup
  LabelFormat = "Recycle1-"
  AutoPrune = yes
  VolumeRetention = 2h
  Maximum Volumes = 12
  Recycle = yes
}
and the Storage daemon's configuration file is:
Storage {
  Name = my-sd
  WorkingDirectory = "~/bacula/working"
  Pid Directory = "~/bacula/working"
  MaximumConcurrentJobs = 10
}

Director {
  Name = my-dir
  Password = local_storage_password
}

Device {
  Name = RecycleDir
  Media Type = File
  Archive Device = /home/bacula/backups
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}

Device {
  Name = RecycleDir1
  Media Type = File1
  Archive Device = /home/bacula/backups1
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}

Messages {
  Name = Standard
  director = my-dir = all
}
With a little bit of work, you can change the above example into a weekly or monthly cycle (take care about the amount of archive disk space used).
Bacula can, of course, use multiple disks, but in general, each disk must be a separate Device specification in the Storage daemon's conf file, and you must then select which clients to back up to each disk. You will also want to give each Device specification a different Media Type so that during a restore, Bacula will be able to find the appropriate drive.
The situation is a bit more complicated if you want to treat two different physical disk drives (or partitions) logically as a single drive, which Bacula does not directly support. However, it is possible to back up your data to multiple disks as if they were a single drive by linking the Volumes from the first disk to the second disk.
For example, assume that you have two disks named /disk1 and /disk2. If you then create a standard Storage daemon Device resource for backing up to the first disk, it will look like the following:
Device {
  Name = client1
  Media Type = File
  Archive Device = /disk1
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}
Since there is no way to get the above Device resource to reference both /disk1 and /disk2 we do it by pre-creating Volumes on /disk2 with the following:
ln -s /disk2/Disk2-vol001 /disk1/Disk2-vol001
ln -s /disk2/Disk2-vol002 /disk1/Disk2-vol002
ln -s /disk2/Disk2-vol003 /disk1/Disk2-vol003
...
At this point, you can label the Volumes as Volume Disk2-vol001, Disk2-vol002, ... and Bacula will use them as if they were on /disk1 but actually write the data to /disk2. The only minor inconvenience with this method is that you must explicitly name the Volumes and cannot use automatic labeling unless you arrange to have the labels exactly match the links you have created.
An important thing to know is that Bacula treats disks like tape drives as much as it can. This means that you can only have a single Volume mounted at one time on a disk as defined in your Device resource in the Storage daemon's conf file. You can have multiple concurrent jobs running that all write to the one Volume that is being used, but if you want to have multiple concurrent jobs that are writing to separate disk drives (or partitions), you will need to define separate Device resources for each one, exactly as you would do for two different tape drives. There is one fundamental difference, however. The Volumes that you create on the two drives cannot be easily exchanged as they can for a tape drive, because they are physically resident (already mounted in a sense) on the particular drive. As a consequence, you will probably want to give them different Media Types so that Bacula can distinguish which Device resource to use during a restore. An example would be the following:
Device {
  Name = Disk1
  Media Type = File1
  Archive Device = /disk1
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}

Device {
  Name = Disk2
  Media Type = File2
  Archive Device = /disk2
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}
With the above device definitions, you can run two concurrent jobs each writing at the same time, one to /disk1 and the other to /disk2. The fact that you have given them different Media Types will allow Bacula to quickly choose the correct Storage resource in the Director when doing a restore.
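For completeness, the Director side of this arrangement could look something like the following sketch, with one Storage resource per Device. The Storage names Disk1Storage and Disk2Storage are hypothetical, and the Address and Password are left elided; only the Device and Media Type values are taken from the Device definitions above.

```conf
# Storage resource for the Device writing to /disk1
Storage {
  Name = Disk1Storage
  Address = ...
  Password = ...
  Device = Disk1
  Media Type = File1
}

# Storage resource for the Device writing to /disk2
Storage {
  Name = Disk2Storage
  Address = ...
  Password = ...
  Device = Disk2
  Media Type = File2
}
```

Each Job would then name one of these Storage resources, and the Media Type in each must match the Media Type of the Device it references.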
If we take the above example and add a second Client, here are a few considerations:
In this example, we have two clients, each with a different Pool and a different number of archive files retained. They also write to different directories with different Volume labeling.
The Director's configuration file is as follows:
Director {
  Name = my-dir
  QueryFile = "~/bacula/bin/query.sql"
  PidDirectory = "~/bacula/working"
  WorkingDirectory = "~/bacula/working"
  Password = dir_password
}

# Basic weekly schedule
Schedule {
  Name = "WeeklySchedule"
  Run = Level=Full fri at 1:30
  Run = Level=Incremental sat-thu at 1:30
}

FileSet {
  Name = "Example FileSet"
  Include {
    Options {
      compression=GZIP
      signature=SHA1
    }
    File = /home/kern/bacula/bin
  }
}

Job {
  Name = "Backup-client1"
  Type = Backup
  Level = Full
  Client = client1
  FileSet= "Example FileSet"
  Messages = Standard
  Storage = File1
  Pool = client1
  Schedule = "WeeklySchedule"
}

Job {
  Name = "Backup-client2"
  Type = Backup
  Level = Full
  Client = client2
  FileSet= "Example FileSet"
  Messages = Standard
  Storage = File2
  Pool = client2
  Schedule = "WeeklySchedule"
}

Client {
  Name = client1
  Address = client1
  Catalog = BackupDB
  Password = client1_password
  File Retention = 7d
}

Client {
  Name = client2
  Address = client2
  Catalog = BackupDB
  Password = client2_password
}

# Two Storage definitions with different Media Types
# permits different directories
Storage {
  Name = File1
  Address = rufus
  Password = local_storage_password
  Device = client1
  Media Type = File1
}

Storage {
  Name = File2
  Address = rufus
  Password = local_storage_password
  Device = client2
  Media Type = File2
}

Catalog {
  Name = BackupDB
  dbname = bacula; user = bacula; password = ""
}

Messages {
  Name = Standard
  ...
}

# Two pools permits different cycling periods and Volume names
# Cycle through 15 Volumes (two weeks)
Pool {
  Name = client1
  Use Volume Once = yes
  Pool Type = Backup
  LabelFormat = "Client1-"
  AutoPrune = yes
  VolumeRetention = 13d
  Maximum Volumes = 15
  Recycle = yes
}

# Cycle through 8 Volumes (1 week)
Pool {
  Name = client2
  Use Volume Once = yes
  Pool Type = Backup
  LabelFormat = "Client2-"
  AutoPrune = yes
  VolumeRetention = 6d
  Maximum Volumes = 8
  Recycle = yes
}
and the Storage daemon's configuration file is:
Storage {
  Name = my-sd
  WorkingDirectory = "~/bacula/working"
  Pid Directory = "~/bacula/working"
  MaximumConcurrentJobs = 10
}

Director {
  Name = my-dir
  Password = local_storage_password
}

# Archive directory for Client1
Device {
  Name = client1
  Media Type = File1
  Archive Device = /home/bacula/client1
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}

# Archive directory for Client2
Device {
  Name = client2
  Media Type = File2
  Archive Device = /home/bacula/client2
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}

Messages {
  Name = Standard
  director = my-dir = all
}
Kern Sibbald 2009-08-09