We’ve covered most of the authentication and authorization options for Apache and mod_dav_svn. But Apache provides a few other nice features.
One of the most useful benefits of an Apache/WebDAV
configuration for your Subversion repository is that the youngest
revisions of your versioned files and directories are immediately
available for viewing via a regular web browser. Since Subversion uses
URLs to identify versioned resources, those URLs used for HTTP-based
repository access can be typed directly into a web browser. Your
browser will issue an HTTP GET
request for that URL; based on whether that URL represents a versioned
directory or file, mod_dav_svn will
respond with a directory listing or with file contents.
Since the URLs do not contain any information about which version of the resource you wish to see, mod_dav_svn will always answer with the youngest version. This functionality has the wonderful side effect that you can pass around Subversion URLs to your peers as references to documents, and those URLs will always point at the latest manifestation of that document. Of course, you can even use the URLs as hyperlinks from other web sites, too.
When browsing a Subversion repository, the web browser gets a
clue about how to render a file’s contents by looking at
the Content-Type:
header
returned in Apache’s response to the HTTP GET
request. The value of this header is some sort of MIME type. By
default, Apache will tell the web browsers that all repository files
are of the “default” MIME type, typically text/plain. This can be
frustrating, however, if a user wants repository files to render as
something more meaningful; for example, it might be nice to have a
foo.html file in the repository actually render as HTML when browsing.
To make this happen, you need only make sure that your files
have the proper svn:mime-type property set. We discuss this in more
detail in File Content Type,
and you can even configure your client to automatically attach
proper svn:mime-type
properties
to files entering the repository for the first time; see Automatic Property Setting.
So, in our example, if one were to set the svn:mime-type
property to text/html
on file foo.html, Apache would properly tell your
web browser to render the file as HTML. One could also attach proper
image/*
MIME-type properties to
image files and ultimately get an entire web site to be viewable
directly from a repository! There’s generally no problem with this,
as long as the web site doesn’t contain any dynamically generated
content.
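As a rough illustration of the property setting described above, the following shell sketch prints the svn propset commands that would attach a MIME type to a handful of files. The function names guess_mime_type and print_propset_commands are hypothetical, and the extension table is a small assumed sample, not a complete mapping. The commands are printed instead of executed so they can be reviewed first.

```shell
#!/bin/sh
# Map a filename to a MIME type based on its extension.
# ASSUMPTION: this short table is illustrative only; extend it as needed.
guess_mime_type() {
    case "$1" in
        *.html|*.htm) echo "text/html" ;;
        *.css)        echo "text/css" ;;
        *.png)        echo "image/png" ;;
        *.jpg|*.jpeg) echo "image/jpeg" ;;
        *.gif)        echo "image/gif" ;;
        *)            echo "" ;;
    esac
}

# Print (do not run) one "svn propset" command per recognized file.
# Files with an unrecognized extension are skipped silently.
print_propset_commands() {
    for file in "$@"; do
        mime=$(guess_mime_type "$file")
        if [ -n "$mime" ]; then
            echo "svn propset svn:mime-type $mime $file"
        fi
    done
}
```

Running print_propset_commands foo.html logo.png inside a working copy lists the commands to apply. The property changes still need to be committed before mod_dav_svn will serve the new Content-Type.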
You generally will get more use out of URLs to versioned
files—after all, that’s where the interesting content tends to lie.
But you might have occasion to browse a Subversion directory
listing, where you’ll quickly note that the generated HTML used to
display that listing is very basic, and certainly not intended to be aesthetically pleasing (or even
interesting). To enable customization of these directory displays,
Subversion provides an XML index feature. A single SVNIndexXSLT
directive in your repository’s Location
block of httpd.conf will instruct mod_dav_svn to generate XML output when
displaying a directory listing, and to reference the XSLT stylesheet
of your choice:
<Location /svn>
  DAV svn
  SVNParentPath /var/svn
  SVNIndexXSLT "/svnindex.xsl"
  ...
</Location>
Using the SVNIndexXSLT
directive and a creative XSLT stylesheet, you can make your
directory listings match the color schemes and imagery used in other
parts of your web site. Or, if you’d prefer, you can use the sample
stylesheets provided in the Subversion source distribution’s
tools/xslt/ directory. Keep in
mind that the path provided to the SVNIndexXSLT
directive is actually a URL
path; browsers need to be able to read your stylesheets to make use
of them!
If you’re serving a collection of repositories from a single
URL via the SVNParentPath
directive, then it’s also possible to have Apache display all
available repositories to a web browser. Just activate the SVNListParentPath
directive:
<Location /svn>
  DAV svn
  SVNParentPath /var/svn
  SVNListParentPath on
  ...
</Location>
If a user now points her web browser to the URL http://host.example.com/svn/, she’ll see a
list of all Subversion repositories sitting in /var/svn. Obviously, this can be a
security problem, so this
feature is turned off by default.
Because Apache is an HTTP server at heart, it contains fantastically flexible logging features. It’s beyond the scope of this book to discuss all of the ways logging can be configured, but we should point out that even the most generic httpd.conf file will cause Apache to produce two logs: error_log and access_log. These logs may appear in different places, but they are typically created in the logging area of your Apache installation. (On Unix, they often live in /usr/local/apache2/logs/.)
The error_log describes any internal errors that Apache runs into as it works. The access_log file records every incoming HTTP request received by Apache. This makes it easy to see, for example, which IP addresses Subversion clients are coming from, how often particular clients use the server, which users are authenticating properly, and which requests succeed or fail.
Unfortunately, because HTTP is a stateless protocol, even the
simplest Subversion client operation generates multiple network
requests. It’s very difficult to look at the access_log and deduce what the client was
doing; most operations look like a series of cryptic PROPPATCH, GET, PUT,
and REPORT requests. To make things
worse, many client operations send nearly identical series of
requests, so it’s even harder to tell them apart.
mod_dav_svn, however, can come to your aid. By activating an “operational logging” feature, you can ask mod_dav_svn to create a separate logfile describing what sort of high-level operations your clients are performing.
To do this, you need to make use of Apache’s CustomLog
directive (which is explained in
more detail in Apache’s own documentation). Be sure to invoke this
directive outside your Subversion Location
block:
<Location /svn>
  DAV svn
  ...
</Location>

CustomLog logs/svn_logfile "%t %u %{SVN-ACTION}e" env=SVN-ACTION
In this example, we’re asking Apache to create a special
logfile, svn_logfile, in the
standard Apache logs directory.
The %t
and %u
variables are replaced by the time and
username of the request, respectively. The really important parts are
the two instances of SVN-ACTION.
When Apache sees that variable, it substitutes the value of the
SVN-ACTION
environment variable,
which is automatically set by mod_dav_svn whenever it detects a high-level
client action.
So, instead of having to interpret a traditional access_log like this:
[26/Jan/2007:22:25:29 -0600] "PROPFIND /svn/calc/!svn/vcc/default HTTP/1.1" 207 398
[26/Jan/2007:22:25:29 -0600] "PROPFIND /svn/calc/!svn/bln/59 HTTP/1.1" 207 449
[26/Jan/2007:22:25:29 -0600] "PROPFIND /svn/calc HTTP/1.1" 207 647
[26/Jan/2007:22:25:29 -0600] "REPORT /svn/calc/!svn/vcc/default HTTP/1.1" 200 607
[26/Jan/2007:22:25:31 -0600] "OPTIONS /svn/calc HTTP/1.1" 200 188
[26/Jan/2007:22:25:31 -0600] "MKACTIVITY /svn/calc/!svn/act/e6035ef7-5df0-4ac0-b811-4be7c823f998 HTTP/1.1" 201 227
...
you can peruse a much more intelligible svn_logfile like this:
[26/Jan/2007:22:24:20 -0600] - get-dir /tags r1729 props
[26/Jan/2007:22:24:27 -0600] - update /trunk r1729 depth=infinity send-copyfrom-args
[26/Jan/2007:22:25:29 -0600] - status /trunk/foo r1729 depth=infinity
[26/Jan/2007:22:25:31 -0600] sally commit r1730
For an exhaustive list of all actions logged, see High-Level Logging.
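Because each line of the operational log names a single high-level action, the log lends itself to simple summaries. The sketch below counts how often each action appears. It assumes the "%t %u %{SVN-ACTION}e" format shown above, so that the timestamp occupies the first two whitespace-separated fields, the username the third, and the action name the fourth. The function name count_svn_actions is hypothetical.

```shell
#!/bin/sh
# Count high-level Subversion actions in an operational log read from
# standard input and print "action count" pairs sorted by action name.
# ASSUMPTION: log lines use the "%t %u %{SVN-ACTION}e" format, so the
# action name is the fourth whitespace-separated field.
count_svn_actions() {
    awk 'NF >= 4 { count[$4]++ }
         END { for (a in count) print a, count[a] }' | sort
}
```

It could be invoked as count_svn_actions < logs/svn_logfile. A different CustomLog format string would shift the field positions, so the field number in the awk program would need adjusting to match.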
One of the nice advantages of using Apache as a Subversion server is that it can be set up for simple replication. For example, suppose that your team is distributed across four offices around the globe. The Subversion repository can exist only in one of those offices, which means the other three offices will not enjoy accessing it—they’re likely to experience significantly slower traffic and response times when updating and committing code. A powerful solution is to set up a system consisting of one master Apache server and several slave Apache servers. If you place a slave server in each office, users can check out a working copy from whichever slave is closest to them. All read requests go to their local slave. Write requests get automatically routed to the single master server. When the commit completes, the master then automatically “pushes” the new revision to each slave server using the svnsync replication tool.
This configuration creates a huge perceptual speed increase for your users, because Subversion client traffic is typically 80–90% read requests. And if those requests are coming from a local server, it’s a huge win.
In this section, we’ll walk you through a standard setup of this single-master/multiple-slave system. However, keep in mind that your servers must be running at least Apache 2.2.0 (with mod_proxy loaded) and Subversion 1.5 (mod_dav_svn).
First, configure your master server’s httpd.conf file in the usual way. Make the repository available at a
certain URI location, and configure authentication and authorization
however you’d like. After that’s done, configure each of your
“slave” servers in the exact same way, but add the
special SVNMasterURI directive to
the Location block:
<Location /svn>
  DAV svn
  SVNPath /var/svn/repos
  SVNMasterURI http://master.example.com/svn
  ...
</Location>
This new directive tells a slave server to redirect all write requests to the master. (This is done automatically via Apache’s mod_proxy module.) Ordinary read requests, however, are still serviced by the slaves. Be sure that your master and slave servers all have matching authentication and authorization configurations; if they fall out of sync, it can lead to big headaches.
Next, we need to deal with the problem of infinite recursion. With the current configuration, imagine what will happen when a Subversion client performs a commit to the master server. After the commit completes, the server uses svnsync to replicate the new revision to each slave. But because svnsync appears to be just another Subversion client performing a commit, the slave will immediately attempt to proxy the incoming write request back to the master! Hilarity ensues.
The solution to this problem is to have the master push
revisions to a different Location
on the slaves. This location
is configured to not proxy write requests at
all, but to accept normal commits from (and only from) the master’s
IP address:
<Location /svn-proxy-sync>
  DAV svn
  SVNPath /var/svn/repos
  Order deny,allow
  Deny from all
  # Only let the master server's IP address access this Location:
  Allow from 10.20.30.40
  ...
</Location>
Now that you’ve configured your Location
blocks on master and slaves, you
need to configure the master to replicate to the slaves. This is
done the usual way—using
svnsync. If you’re not familiar
with this tool, see Repository Replication for details.
First, make sure that each slave repository has a
pre-revprop-change
hook script that allows remote
revision property changes. (This is standard procedure for being on
the receiving end of svnsync.)
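For reference, a minimal pre-revprop-change hook of the kind described above might look like the following sketch. It permits every revision property change, on the assumption that the Location used for syncing already restricts access to the master's IP address. The function wrapper is only there to document the arguments Subversion passes to the hook.

```shell
#!/bin/sh
# pre-revprop-change hook sketch for a slave repository.
# Subversion invokes the hook as:
#   pre-revprop-change REPOS-PATH REV USER PROPNAME ACTION
# An exit status of 0 allows the change; any other status rejects it.
# ASSUMPTION: access control is handled by the web server configuration,
# so this hook allows all changes unconditionally.
pre_revprop_change() {
    return 0
}

# The script's exit status is that of this last command.
pre_revprop_change "$@"
```

The script would be saved as hooks/pre-revprop-change inside each slave repository and made executable. A stricter variant could inspect the USER argument and reject anyone but a dedicated sync account.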
Then, log into the master server and configure each of the slave
repository URIs to receive data from the master repository on the
local disk:
$ svnsync init http://slave1.example.com/svn-proxy-sync file:///var/svn/repos
Copied properties for revision 0.
$ svnsync init http://slave2.example.com/svn-proxy-sync file:///var/svn/repos
Copied properties for revision 0.
$ svnsync init http://slave3.example.com/svn-proxy-sync file:///var/svn/repos
Copied properties for revision 0.

# Perform the initial replication
$ svnsync sync http://slave1.example.com/svn-proxy-sync
Transmitting file data ....
Committed revision 1.
Copied properties for revision 1.
Transmitting file data .......
Committed revision 2.
Copied properties for revision 2.
...
$ svnsync sync http://slave2.example.com/svn-proxy-sync
Transmitting file data ....
Committed revision 1.
Copied properties for revision 1.
Transmitting file data .......
Committed revision 2.
Copied properties for revision 2.
...
$ svnsync sync http://slave3.example.com/svn-proxy-sync
Transmitting file data ....
Committed revision 1.
Copied properties for revision 1.
Transmitting file data .......
Committed revision 2.
Copied properties for revision 2.
...
After this is done, configure the master server’s post-commit
hook script to invoke svnsync on each slave server:
#!/bin/sh
# Post-commit script to replicate newly committed revision to slaves

svnsync sync http://slave1.example.com/svn-proxy-sync > /dev/null 2>&1 &
svnsync sync http://slave2.example.com/svn-proxy-sync > /dev/null 2>&1 &
svnsync sync http://slave3.example.com/svn-proxy-sync > /dev/null 2>&1 &
The extra bits on the end of each line aren’t necessary, but
they’re a sneaky way to allow the sync commands to run in the
background so that the Subversion client isn’t left waiting forever
for the commit to finish. In addition to this post-commit
hook, you’ll need a post-revprop-change
hook as well so that
when a user, say, modifies a log message, the slave servers get that
change also:
#!/bin/sh
# Post-revprop-change script to replicate revprop-changes to slaves

REV=${2}
svnsync copy-revprops http://slave1.example.com/svn-proxy-sync ${REV} > /dev/null 2>&1 &
svnsync copy-revprops http://slave2.example.com/svn-proxy-sync ${REV} > /dev/null 2>&1 &
svnsync copy-revprops http://slave3.example.com/svn-proxy-sync ${REV} > /dev/null 2>&1 &
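The hook scripts above repeat one line per slave. As the number of offices grows, a loop over a list of slave URLs may be easier to maintain. The following is only a sketch: the SLAVES variable, the sync_all_slaves function, and the SVNSYNC override (handy for a dry run with echo) are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# ASSUMPTION: one sync URL per line, taken from the examples above.
SLAVES="http://slave1.example.com/svn-proxy-sync
http://slave2.example.com/svn-proxy-sync
http://slave3.example.com/svn-proxy-sync"

# Run "svnsync sync" against every slave in turn.
# Setting SVNSYNC=echo prints the commands instead of running them.
sync_all_slaves() {
    for url in $SLAVES; do
        ${SVNSYNC:-svnsync} sync "$url"
    done
}
```

A post-commit hook built on this could end with the line sync_all_slaves > /dev/null 2>&1 & so that the client is not kept waiting. Note that the slaves are then synced one after another in a single background job, which trades a little latency for a shorter script.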
The only thing we’ve left out here is what to do about locks.
Because locks are strictly enforced by the master server (the only
place where commits happen), we don’t technically need to do
anything. Many teams don’t use Subversion’s locking features at all,
so it may be a nonissue for you. However, if lock changes aren’t
replicated from master to slaves, it means that clients won’t be
able to query the status of locks (e.g., svn status -u
will show no information about repository locks). If
this bothers you, you can write post-lock
and post-unlock
hook scripts that run svn lock and svn
unlock on each slave machine, presumably through a remote
shell method such as SSH. That’s left as an exercise for the
reader!
Your master/slave replication system should now be ready to use. However, a couple of words of warning are in order. Remember that this replication isn’t entirely robust in the face of computer or network crashes. For example, if one of the automated svnsync commands fails to complete for some reason, the slaves will begin to fall behind. Your remote users might see that they’ve committed revision 100, but then when they run svn update, their local server will tell them that revision 100 doesn’t yet exist! Of course, the problem will be automatically fixed the next time another commit happens and the subsequent svnsync is successful, because the sync will replicate all waiting revisions. But still, you may want to set up some sort of out-of-band monitoring to notice synchronization failures and force svnsync to run when things go wrong.
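One possible shape for such out-of-band monitoring is to compare the youngest revision on the master with the revision each slave reports. The sketch below is an assumption-laden starting point, not a complete monitor: the function names are hypothetical, and slave_revision relies on svn info printing a "Revision:" line, which holds for English-language output.

```shell
#!/bin/sh
# Print the youngest revision of the master repository on local disk.
master_revision() {
    svnlook youngest "$1"
}

# Print the latest revision visible at a slave URL.
# ASSUMPTION: "svn info" output is in English and has a "Revision:" line.
slave_revision() {
    svn info "$1" | sed -n 's/^Revision: //p'
}

# Compare a master and a slave revision number and describe the lag.
describe_lag() {
    master=$1
    slave=$2
    if [ "$slave" -ge "$master" ]; then
        echo "in sync"
    else
        echo "behind by $((master - slave))"
    fi
}
```

Run on the master from a scheduled job, describe_lag "$(master_revision /var/svn/repos)" "$(slave_revision http://slave1.example.com/svn)" would report whether that slave has caught up. A lagging slave can then be brought up to date by rerunning svnsync sync against its sync URL.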
Several of the features already provided by Apache in its role
as a robust web server can be leveraged for increased functionality or
security in Subversion as well. The Subversion client is able to use
SSL (the Secure Sockets Layer, discussed earlier). If your Subversion
client is built to support SSL, it can access your Apache server
using https://
and
enjoy a high-quality encrypted network session.
Equally useful are other features of the Apache and Subversion relationship, such as the ability to specify a custom port (instead of the default HTTP port 80) or a virtual domain name by which the Subversion repository should be accessed, or the ability to access the repository through an HTTP proxy.
Finally, because mod_dav_svn is speaking a subset of the WebDAV/DeltaV protocol, it’s possible to access the repository via third-party DAV clients. Most modern operating systems (Win32, OS X, and Linux) have the built-in ability to mount a DAV server as a standard network “shared folder.” This is a complicated topic, but also wondrous when implemented. For details, read Appendix C.
Note that there are a number of other small tweaks one can make to mod_dav_svn that are too obscure to mention in this chapter. For a complete list of all httpd.conf directives that mod_dav_svn responds to, see Directives.