In the section called “Strategies for Repository Deployment”, we looked at some of the important decisions that should be made before creating and configuring your Subversion repository. Now, we finally get our hands dirty! In this section, we'll see how to actually create a Subversion repository and configure it to perform custom actions when special repository events occur.
Subversion repository creation is an incredibly simple task. The svnadmin utility that comes with Subversion provides a subcommand (create) for doing just that.
$ svnadmin create /path/to/repos
This creates a new repository in the directory /path/to/repos, with the default filesystem data store. Prior to Subversion 1.2, the default was to use Berkeley DB; the default is now FSFS. You can explicitly choose the filesystem type using the --fs-type argument, which accepts as a parameter either fsfs or bdb.
$ # Create an FSFS-backed repository
$ svnadmin create --fs-type fsfs /path/to/repos
$
$ # Create a Berkeley-DB-backed repository
$ svnadmin create --fs-type bdb /path/to/repos
$
After running this simple command, you have a Subversion repository.
The path argument to svnadmin is just a regular filesystem path and not a URL like the svn client program uses when referring to repositories. Both svnadmin and svnlook are considered server-side utilities: they are used on the machine where the repository resides to examine or modify aspects of the repository, and are in fact unable to perform tasks across a network. A common mistake made by Subversion newcomers is trying to pass URLs (even “local” file:// ones) to these two programs.
Present in the db/ subdirectory of your repository is the implementation of the versioned filesystem. Your new repository's versioned filesystem begins life at revision 0, which is defined to consist of nothing but the top-level root (/) directory. Initially, revision 0 also has a single revision property, svn:date, set to the time at which the repository was created.
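One way to check these claims on a fresh repository is to ask svnlook for the youngest revision and for the revision properties of revision 0. The following is a minimal sketch, not part of the original text: it creates a throwaway repository under a temporary directory (the `scratch` location is an assumption made for illustration) and simply skips the check when svnadmin or svnlook is not installed.

```shell
# Create a throwaway repository and inspect revision 0.
# The temporary location is used only for illustration.
if command -v svnadmin >/dev/null 2>&1 && command -v svnlook >/dev/null 2>&1; then
    scratch=$(mktemp -d)
    svnadmin create "$scratch/repos"

    # Ask for the youngest (most recent) revision number.
    youngest=$(svnlook youngest "$scratch/repos")
    echo "youngest revision: $youngest"

    # List the names of the revision properties set on revision 0.
    revprops=$(svnlook proplist --revprop -r 0 "$scratch/repos")
    echo "revision 0 properties: $revprops"

    rm -rf "$scratch"
else
    echo "svnadmin/svnlook not found; skipping" >&2
fi
```

On a newly created repository, the youngest revision should be reported as 0 and the property list should include svn:date.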
Now that you have a repository, it's time to customize it.
While some parts of a Subversion repository—such as the configuration files and hook scripts—are meant to be examined and modified manually, you shouldn't (and shouldn't need to) tamper with the other parts of the repository “by hand”. The svnadmin tool should be sufficient for any changes necessary to your repository, or you can look to third-party tools (such as Berkeley DB's tool suite) for tweaking relevant subsections of the repository. Do not attempt manual manipulation of your version control history by poking and prodding around in your repository's data store files!
A hook is a program triggered by some repository event, such as the creation of a new revision or the modification of an unversioned property. Some hooks (the so-called “pre hooks”) run in advance of a repository operation and provide a means by which to both report what is about to happen and to prevent it from happening at all. Other hooks (the “post hooks”) run after the completion of a repository event, and are useful for reporting purposes only. Each hook is handed enough information to tell what that event is (or was), the specific repository changes proposed (or completed), and the username of the person who triggered the event.
The hooks subdirectory is, by default, filled with templates for various repository hooks.
$ ls repos/hooks/
post-commit.tmpl          post-unlock.tmpl          pre-revprop-change.tmpl
post-lock.tmpl            pre-commit.tmpl           pre-unlock.tmpl
post-revprop-change.tmpl  pre-lock.tmpl             start-commit.tmpl
There is one template for each hook that the Subversion repository supports, and by examining the contents of those template scripts, you can see what triggers each script to run and what data is passed to that script. Also present in many of these templates are examples of how one might use that script, in conjunction with other Subversion-supplied programs, to perform common useful tasks. To actually install a working hook, you need only place an executable program or script in the repos/hooks directory under the name of the hook (such as start-commit or post-commit).
On Unix platforms, this means supplying a script or program (which could be a shell script, a Python program, a compiled C binary, or any number of other things) named exactly like the name of the hook. Of course, the template files are present for more than just informational purposes: the easiest way to install a hook on Unix platforms is to simply copy the appropriate template file to a new file that lacks the .tmpl extension, customize the hook's contents, and ensure that the script is executable. Windows, however, uses file extensions to determine whether or not a program is executable, so you would need to supply a program whose basename is the name of the hook, and whose extension is one of the special extensions recognized by Windows for executable programs, such as .exe or .com for programs, and .bat for batch files.
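The copy-and-enable procedure for Unix platforms can be sketched as a small shell function. This is only an illustration under stated assumptions: `install_hook` is a hypothetical helper name, not something Subversion provides, and customizing the copied script is still left to you.

```shell
# install_hook REPOS HOOK: hypothetical helper that activates a hook
# by copying its template to a file without the .tmpl extension and
# marking that file executable (Unix platforms only).
install_hook() {
    repos=$1
    hook=$2
    template="$repos/hooks/$hook.tmpl"

    if [ ! -f "$template" ]; then
        echo "no template for hook '$hook' in $repos/hooks" >&2
        return 1
    fi
    # Refuse to overwrite a hook that is already installed.
    if [ -e "$repos/hooks/$hook" ]; then
        echo "hook '$hook' is already installed" >&2
        return 1
    fi

    cp "$template" "$repos/hooks/$hook" &&
    chmod +x "$repos/hooks/$hook"
}

# Example: install_hook /path/to/repos pre-commit
# Then edit /path/to/repos/hooks/pre-commit to customize its contents.
```

Refusing to overwrite an existing hook is a design choice of this sketch; it keeps a customized hook from being silently replaced by the stock template.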
For security reasons, the Subversion repository executes hook programs with an empty environment; that is, no environment variables are set at all, not even $PATH (or %PATH%, under Windows). Because of this, many administrators are baffled when their hook program runs fine by hand, but doesn't work when run by Subversion. Be sure to explicitly set any necessary environment variables in your hook program and/or use absolute paths to programs.
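When a hook works by hand but fails under Subversion, it can help to rerun it with an empty environment and compare the results. The sketch below uses `env -i` to approximate what Subversion does; the stand-in hook script and the `MY_SETTING` variable are made up purely for illustration.

```shell
# Build a tiny stand-in hook in a temporary directory (illustrative only).
scratch=$(mktemp -d)
hook="$scratch/post-commit"
cat > "$hook" <<'EOF'
#!/bin/sh
# Report whether a variable from the caller's environment is visible.
echo "MY_SETTING=${MY_SETTING:-<unset>}"
EOF

MY_SETTING=hello
export MY_SETTING

# Run by hand: the hook inherits the caller's environment.
by_hand=$(/bin/sh "$hook")

# Run with an empty environment, roughly as Subversion would run it.
empty_env=$(env -i /bin/sh "$hook")

echo "by hand:   $by_hand"
echo "empty env: $empty_env"
rm -rf "$scratch"
```

The first run sees the exported variable and the second does not, which is the same difference that makes a hook behave differently under Subversion. In a real hook, the usual remedy is to set the needed variables near the top of the script or to call programs by absolute path.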
Subversion will attempt to execute hooks as the same user who owns the process that is accessing the Subversion repository. In most cases, the repository is being accessed via a Subversion server, so this is the user that the server runs as. The hooks themselves will need to be configured with OS-level permissions that allow that user to execute them. This also means that any files or programs (including the Subversion repository itself) accessed directly or indirectly by the hook will be accessed as that same user. In other words, be alert to potential permission-related problems that could prevent the hook from performing the tasks it is designed to perform.
There are nine hooks implemented by the Subversion repository, and you can get details about each of them in the section called “Repository Hooks”. As a repository administrator, you'll need to decide which hooks you wish to implement (by way of providing an appropriately named and permissioned hook program), and how. This decision needs to be made with the bigger picture of how your repository is deployed in mind. For example, if you are using server configuration to determine which usernames are permitted to commit changes to your repository, then you don't need to do this sort of access control via the hook system.
There is no shortage of Subversion hook programs and scripts freely available either from the Subversion community itself or elsewhere. These scripts cover a wide range of utility—basic access control, policy adherence checking, issue tracker integration, email- or syndication-based commit notification, and beyond. See Appendix D, Third Party Tools for discussion of some of the most commonly used hook programs. Or, if you wish to write your own, see Chapter 8, Embedding Subversion.
While hook scripts can be leveraged to do almost anything, there is one dimension in which hook script authors should show restraint: do not modify a commit transaction using hook scripts. While it might be tempting to use hook scripts to automatically correct errors or shortcomings or policy violations present in the files being committed, doing so can cause problems. Subversion keeps client-side caches of certain bits of repository data, and if you change a commit transaction in this way, those caches become undetectably stale. This inconsistency can lead to surprising and unexpected behavior. Instead of modifying the transaction, you should simply validate the transaction in the pre-commit hook and reject the commit if it does not meet the desired requirements. As an added bonus, your users will learn the value of careful, compliance-minded work habits.
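A validate-and-reject pre-commit hook might look like the following minimal sketch, which refuses commits whose log message contains no letters or digits. The policy itself, the `has_log_message` helper, and the `/usr/bin/svnlook` location are assumptions chosen for illustration; the script only reads the transaction and never changes it.

```shell
#!/bin/sh
# pre-commit hook sketch: validate the transaction, never modify it.
# Subversion passes the repository path and the transaction name.

# Assumed install location; adjust for your system. Hooks run with an
# empty environment, so the program is named by absolute path.
SVNLOOK=/usr/bin/svnlook

# Succeeds if the given text contains at least one letter or digit.
has_log_message() {
    case $1 in
        *[a-zA-Z0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only act when invoked with the two arguments Subversion supplies.
if [ "$#" -ge 2 ]; then
    REPOS=$1
    TXN=$2
    LOGMSG=$("$SVNLOOK" log -t "$TXN" "$REPOS")
    if ! has_log_message "$LOGMSG"; then
        # Text written to stderr is reported back to the committing client.
        echo "Commit rejected: please supply a log message." >&2
        exit 1
    fi
    exit 0
fi
```

A nonzero exit status from pre-commit aborts the commit, so the user can fix the problem and try again with the working copy still consistent.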
A Berkeley DB environment is an encapsulation of one or more databases, log files, region files and configuration files. The Berkeley DB environment has its own set of default configuration values for things like the number of database locks allowed to be taken out at any given time, or the maximum size of the journaling log files, etc. Subversion's filesystem logic additionally chooses default values for some of the Berkeley DB configuration options. However, sometimes your particular repository, with its unique collection of data and access patterns, might require a different set of configuration option values.
The producers of Berkeley DB understand that different applications and database environments have different requirements, and so they have provided a mechanism for overriding at runtime many of the configuration values for the Berkeley DB environment. Berkeley DB checks for the presence of a file named DB_CONFIG in the environment directory, and parses the options found in that file for use with that particular Berkeley DB environment.
The Berkeley DB configuration file for a BDB-backed repository is located in the repository's db subdirectory, at db/DB_CONFIG. Subversion itself creates this file when it creates the rest of the repository. The file initially contains some default options, as well as pointers to the Berkeley DB online documentation so you can read about what those options do. Of course, you are free to add any of the supported Berkeley DB options to your DB_CONFIG file. Just be aware that while Subversion never attempts to read or interpret the contents of the file, and makes no direct use of the option settings in it, you'll want to avoid any configuration changes that may cause Berkeley DB to behave in a fashion that is at odds with what Subversion might expect. Also, changes made to DB_CONFIG won't take effect until you recover the database environment (using svnadmin recover).
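As a rough illustration of that edit-then-recover workflow, the sketch below defines a hypothetical `set_db_config` helper (not a Subversion or Berkeley DB tool) that replaces or appends a single "name value" line in a DB_CONFIG file. The directive shown in the example, `set_lg_max`, is Berkeley DB's setting for the maximum log file size; the value is illustrative, not a recommendation.

```shell
# set_db_config FILE NAME VALUE: hypothetical helper that replaces any
# existing "NAME ..." line in a DB_CONFIG file, or appends one if absent.
set_db_config() {
    file=$1
    name=$2
    value=$3

    [ -f "$file" ] || return 1
    tmp=$(mktemp) || return 1

    # Keep every line except an existing setting of the same name.
    # (grep exits nonzero when nothing is left, which is fine here.)
    grep -v "^$name[[:space:]]" "$file" > "$tmp" || true
    printf '%s %s\n' "$name" "$value" >> "$tmp"

    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (value is illustrative only):
#   set_db_config /path/to/repos/db/DB_CONFIG set_lg_max 1048576
#   svnadmin recover /path/to/repos
```

The final `svnadmin recover` step in the example matters: until the environment is recovered, the edited value has no effect.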