Svn – Subversion cluster / replication / HA

git, replication, svn, version control

What's a good way of making SVN highly available?

We take backups of our Subversion server, but this is not enough, as svn is very important to us.
We would like to implement some HA mechanism for Subversion.

There is a commercial solution (http://www.wandisco.com/subversion/clustering/),
but maybe I can use another mechanism, or even git with svn compatibility?

Best Answer

You're going to want to use svnsync, which replays commits from a master repository into a read-only mirror.

The Subversion book is a good reference for this.

Also, you may be interested in what the Apache Software Foundation is doing.
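
As a rough illustration of the svnsync approach, the sketch below creates a mirror repository and keeps it up to date. The function names, the master URL, and the mirror path are hypothetical placeholders, not values from the question; the mirror path is assumed to be absolute so it can be turned into a file:// URL.

```shell
#!/bin/bash
# Sketch only: init_mirror and sync_mirror are illustrative helper names.

# Create an empty mirror repository and register the master as its sync source.
init_mirror() {
    local master_url="$1"
    local mirror_dir="$2"    # must be an absolute path

    svnadmin create "${mirror_dir}" || return 1

    # svnsync keeps its bookkeeping in revision properties, so the mirror
    # needs a pre-revprop-change hook that allows revprop changes.
    printf '#!/bin/sh\nexit 0\n' > "${mirror_dir}/hooks/pre-revprop-change"
    chmod +x "${mirror_dir}/hooks/pre-revprop-change"

    svnsync initialize "file://${mirror_dir}" "${master_url}"
}

# Copy any revisions the mirror does not have yet; safe to run repeatedly.
sync_mirror() {
    local mirror_dir="$1"
    svnsync synchronize "file://${mirror_dir}"
}
```

You would call something like `init_mirror http://svn.example.com/repos/project /var/svn-mirror/project` once, then run `sync_mirror /var/svn-mirror/project` from cron or from the master's post-commit hook. Note that this gives you a read-only replica to fail over to, not automatic failover; nothing other than svnsync should commit to the mirror.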

Related Solutions

Subversion Backup – Best Practices for SVN Backups on Debian

Look for the svn-hot-backup script. It should ship with Subversion, and it contains all the logic to do what you want, plus automatic rotation of old backups. I wrote the following wrapper script, which runs svn-hot-backup from a nightly cron job to back up a single server with multiple repositories; it is slightly modified here to be more general.

#!/bin/bash
#
# Hot-copies each Subversion repository into a compressed archive
# in a local backup directory.

# Keeps the last 10 backups of each repository
REPODIR="/var/repos"
BAKDIR="/data/backup/svn"
PROG="/usr/local/sbin/svn-hot-backup"
REPOLIST='repo1 repo2 repo3'

if [ ! -x "${PROG}" ]
then
        echo "svnbak: Could not execute \`${PROG}\`"
        exit 1
fi

# Private temp file for the tool's output (safer than a predictable /tmp name)
LOG="$(mktemp /tmp/svnbak.XXXXXX)" || exit 1
trap 'rm -f "${LOG}"' EXIT
STATUS=0

for repo in ${REPOLIST}
do
    # Back up the repository to a compressed archive
    echo "svnbak: Backing up subversion repository:  ${repo}"

    # Treat any non-zero exit status as a failure, not just status 1
    if ! SVN_HOTBACKUP_NUM_BACKUPS=10 nice "${PROG}" --archive-type=gz "${REPODIR}/${repo}" "${BAKDIR}/${repo}" &> "${LOG}"
    then
        echo "svnbak: Hot backup on '${repo}' failed with message:"
        /bin/cat "${LOG}"
        STATUS=1
    fi
done

# Exit non-zero if any repository failed, so cron can report it
exit "${STATUS}"
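
A backup is only useful if it restores, so it may be worth checking the newest archive now and then. The sketch below is one possible way to do that; `verify_latest_backup` is a hypothetical helper, and it assumes the archives end in `.tar.gz` (as with `--archive-type=gz`) and that each archive holds a single top-level repository directory.

```shell
#!/bin/bash
# Sketch only: unpack the newest archive in a backup directory and
# check the repository inside it with `svnadmin verify`.
verify_latest_backup() {
    local bakdir="$1"
    local latest workdir status

    # Newest archive by modification time
    latest="$(ls -t "${bakdir}"/*.tar.gz 2>/dev/null | head -n 1)"
    if [ -z "${latest}" ]
    then
        echo "svnbak: no archives found in ${bakdir}" >&2
        return 1
    fi

    workdir="$(mktemp -d)" || return 1
    tar -xzf "${latest}" -C "${workdir}" || { rm -rf "${workdir}"; return 1; }

    # Assumption: the archive contains exactly one repository directory
    svnadmin verify --quiet "${workdir}"/*
    status=$?

    rm -rf "${workdir}"
    return "${status}"
}
```

For example, `verify_latest_backup /data/backup/svn/repo1` returns non-zero if no archive is found or if the unpacked repository fails verification.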

Linux – Use git for multiple server configuration files

I've used something like this before; this is how it worked.

Repo Setup

Create a git repo, "etc_files".
Create a branch for each machine type, e.g., "server/www", "server/dev", etc.
- git supports slashes in branch names. This helps me keep the branches straight in my head.
- If you have few enough machines, you could have a branch for each individual machine instead.
Create a branch for each piece of shared infrastructure, e.g. "modules/apache", "modules/cups", etc.
- These branches are for holding files that are the same between all machines, like /etc/resolv.conf. These would be the files you keep in "svn:externals" repos now.
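
The repo setup above could be sketched roughly as follows. `setup_etc_repo` is a hypothetical helper name; the branch names are the examples from the list. Starting every branch from one shared empty commit is my own assumption, made so that module branches can later be merged into machine branches without unrelated-history errors.

```shell
#!/bin/bash
# Sketch only: create the repo with machine-type and module branches.
setup_etc_repo() {
    local dir="$1"
    local branch

    git init --quiet "${dir}" || return 1

    # A branch needs at least one commit to exist, so start with an empty one.
    git -C "${dir}" commit --quiet --allow-empty -m "Initial empty commit" || return 1

    # One branch per machine type and one per piece of shared infrastructure
    for branch in server/www server/dev modules/apache modules/cups
    do
        git -C "${dir}" branch "${branch}" || return 1
    done
}
```

Running `setup_etc_repo etc_files` leaves you with a repo whose branches all share a common root.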

Building a New Machine

On a new machine, clone the git repo and check out the branch for that machine type.
- I make this a read-only clone to prevent people from committing changes from production machines without testing.
Set up a cron job to automatically git pull the repo every day.
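
A minimal sketch of those two steps, assuming hypothetical helper names and arguments. It does not by itself make the clone read-only; that would come from cloning over a URL that gives the machine no push access.

```shell
#!/bin/bash
# Sketch only: check out one machine-type branch on a new host.
bootstrap_machine() {
    local repo_url="$1"
    local branch="$2"
    local dest="$3"

    # --single-branch fetches only the history this machine type needs.
    git clone --quiet --branch "${branch}" --single-branch "${repo_url}" "${dest}"
}

# Meant to be run from a daily cron job. --ff-only makes the pull fail
# instead of merging if someone has committed locally on the machine.
update_machine() {
    local dest="$1"
    git -C "${dest}" pull --quiet --ff-only
}
```

A small wrapper script that calls `update_machine` on the checkout directory is what the cron job would run.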

Changing Machine Branches

Changing the code in a single machine branch is simple; just git checkout the appropriate branch in your development environment, make the changes, and commit them back to the central repo. All machines in that branch will automatically get the changes the next time the cron job runs.

Changing Module Branches

Changing the code for a module branch is only slightly more tricky, as it involves three steps:

1. git checkout the appropriate module branch.
2. Make your changes and commit them to the centralized server.
3. git checkout each machine branch which uses that module branch, and then merge the module branch into it. git will figure out that you've merged that module branch before and only pick up the changes that have happened since the last common parent.

This method has both benefits and drawbacks. One benefit is that I can make a change to a module branch and apply it to the machine branches that need it, while letting the machine branches that don't need it stay with the older version until they're ready. The drawback, then, is that you have to remember to merge your module branch into each machine branch that might be using it. I use a script that traverses the commit tree and automatically does this merging for me, but it can still be a pain.
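
The merge step can be sketched as a simple loop. This is not the commit-tree-traversing script mentioned above; it is a simpler stand-in that takes an explicit list of machine branches, and `merge_module` is a hypothetical name.

```shell
#!/bin/bash
# Sketch only: merge one module branch into each listed machine branch.
merge_module() {
    local repo="$1"
    local module="$2"
    shift 2
    local branch

    for branch in "$@"
    do
        git -C "${repo}" checkout --quiet "${branch}" || return 1
        # Stop at the first conflict so it can be resolved by hand.
        git -C "${repo}" merge --quiet --no-edit "${module}" || return 1
    done
}
```

For example, `merge_module etc_files modules/apache server/www server/dev` updates only those two machine branches and leaves any other branch on the older module version.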

As an alternative, newer versions of git support something called "submodules":

Submodules allow foreign repositories to be embedded within a dedicated subdirectory of the source tree, always pointed at a particular commit.

This would allow you to build something a little bit like "svn:externals" trees, which you could then update in much the same way as you do now.
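
A rough sketch of the submodule variant, assuming each shared module lives in its own repository. The helper names, URL, and path arguments are hypothetical.

```shell
#!/bin/bash
# Sketch only: embed a shared module repository in a subdirectory.
add_module_submodule() {
    local repo="$1"
    local module_url="$2"
    local path="$3"

    git -C "${repo}" submodule --quiet add "${module_url}" "${path}" || return 1
    git -C "${repo}" commit --quiet -m "Add ${path} submodule"
}

# Move the submodule to the newest commit of its upstream branch and
# record the new pinned commit in the parent repository.
bump_module_submodule() {
    local repo="$1"
    local path="$2"

    git -C "${repo}" submodule --quiet update --remote -- "${path}" || return 1
    git -C "${repo}" add -- "${path}" || return 1

    # Nothing staged means the submodule was already up to date.
    if git -C "${repo}" diff --cached --quiet -- "${path}"
    then
        return 0
    fi
    git -C "${repo}" commit --quiet -m "Update ${path} submodule"
}
```

Because the parent repo pins an exact commit, a machine only picks up a module change after someone runs the bump step and commits it, which is similar in spirit to merging a module branch on demand.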