Linux – Use git for multiple server configuration files

Tags: configuration, git, linux, svn

We have migrated a lot of source code over to git and are very happy with our current solution. We would like to have our server configuration files versioned on the same system, but a few things don't work the way we would like them to, and I hope someone can share their experience here.

This question is similar to Using revision control for server configuration files?, but we have some special requirements that do not work with the suggestions on that question.

The current setup uses subversion for configuration files. The corresponding repository looks something like this

 / # root of repository
 +--www.domain.com/     # configuration for www
 |  \--etc/
 |     \--apache2/
 +--dev.domain.com/     # configuration for dev
 |  +--etc/
 |  \--opt/
 |     \--app1/         
 |         \--conf/     # configuration for app1 on dev
 \--staging.domain.com/ # configuration for staging

With subversion this works just fine, because it's possible to check out just a sub-directory of a repository. In addition, you can use svn:externals to point to one common structure for several different configuration setups. We only had to deal with the .svn files in all versioned directories. Git, on the other hand, does not have svn:externals, and sparse checkouts always require the path from the root of the repository to the actual directory to stay the same.
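The sparse-checkout limitation can be reproduced in a throwaway repository. This is only a sketch: the host directories come from the layout above, while the temporary directory, file contents, and commit identity are placeholders.

```shell
#!/bin/sh
# Sketch: a sparse checkout hides other hosts' files, but the remaining
# files keep their full path relative to the repository root.
set -eu

work=$(mktemp -d)
git init -q "$work/config"
cd "$work/config"
git config user.name "Example Admin"        # placeholder identity
git config user.email "admin@example.com"

mkdir -p www.domain.com/etc/apache2 dev.domain.com/etc
echo "# www" > www.domain.com/etc/apache2/apache2.conf
echo "# dev" > dev.domain.com/etc/hosts
git add .
git commit -q -m "Import configuration for www and dev"

# Restrict the working copy to the www.domain.com subtree.
git config core.sparseCheckout true
mkdir -p .git/info
echo "/www.domain.com/" > .git/info/sparse-checkout
git read-tree -mu HEAD

# Only files under www.domain.com/ remain, and they still sit below
# www.domain.com/etc/ rather than directly under etc/.
find . -path ./.git -prune -o -type f -print
```

The working copy therefore cannot simply be placed at /etc, which is the problem the question describes.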

When discussing the migration to git, I tried to write down the main requirements for the server configuration versioning:

- we only want a single repository
- it should be possible to easily push changes to the central remote
- changesets should contain the real author

Is there a nice way to have all the configuration in one repository and only have a sub-path as working copy? Currently I am considering two approaches, but I wanted to ask here first:

- If the .git repository is at a fixed location, e.g. somewhere in /var, we could link to the sub-path from the "target" working directory. The main problem: I do not know of a way to "link" from /etc to another directory in order to only import its contents, other than symlinking single files.
- I found another alternative on this SO question, suggesting to have multiple branches in one repository. This would certainly increase complexity, but I could see us trying this way.

Using git on a single machine for configuration file management works fine, but I believe there must be someone who is using it the way we would like to use it.

Thank you
Kariem

Best Answer

I've used something like this before; this is how it worked.

Repo Setup

1. Create a git repo, "etc_files".
2. Create a branch for each machine type, e.g., "server/www", "server/dev", etc.
   - git supports slashes in branch names. This helps me keep the branches straight in my head.
   - If you have few enough machines, you could have a branch for each individual machine instead.
3. Create a branch for each piece of shared infrastructure, e.g., "modules/apache", "modules/cups", etc.
   - These branches are for holding files that are the same between all machines, like /etc/resolv.conf. These would be the files you keep in "svn:externals" repos now.
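A minimal sketch of this layout, run in a temporary directory. The repository and branch names follow the examples above; the "base" branch, the file contents, and the commit identity are assumptions made for illustration and are not part of the original setup.

```shell
#!/bin/sh
# Sketch: one repository, one branch per machine type, one per shared module.
set -eu

work=$(mktemp -d)
git init -q "$work/etc_files"
cd "$work/etc_files"
git config user.name "Example Admin"        # placeholder identity
git config user.email "admin@example.com"

# Hypothetical empty root commit so that all branches share an ancestor.
git symbolic-ref HEAD refs/heads/base
git commit -q --allow-empty -m "Empty root commit"

# A branch for a piece of shared infrastructure.
git checkout -q -b modules/apache base
mkdir -p apache2
echo "ServerTokens Prod" > apache2/security.conf
git add apache2
git commit -q -m "modules/apache: shared Apache settings"

# A branch per machine type; server/www pulls in the Apache module.
git checkout -q -b server/www base
git merge -q --no-ff --no-edit modules/apache
git checkout -q -b server/dev base

git branch --list
```

Recording the first merge with --no-ff is a design choice in this sketch: it keeps a visible merge commit on the machine branch even when a fast-forward would be possible.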

Building a New Machine

1. On a new machine, clone the git repo and check out the branch for that machine type.
   - I make this a read-only clone to prevent people from committing changes from production machines without testing.
2. Set up a cron job to automatically git pull the repo every day.
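The steps above might look roughly like this. The sketch uses a local bare repository as a stand-in for the central one, and all paths are placeholders. Disabling the push URL is just one possible way to discourage pushes from a machine; it is an assumption here, not necessarily how the read-only clone described above was made.

```shell
#!/bin/sh
# Sketch: clone one machine branch and prepare a daily pull.
set -eu

work=$(mktemp -d)

# Stand-in for the central repository (in practice an SSH or HTTPS URL).
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/server/www
git -C "$work/seed" -c user.name="Example Admin" -c user.email="admin@example.com" \
    commit -q --allow-empty -m "server/www: initial configuration"
git clone -q --bare "$work/seed" "$work/central.git"

# On the new machine: clone only the branch for this machine type.
git clone -q --branch server/www --single-branch "$work/central.git" "$work/machine"
cd "$work/machine"

# Assumption: replace the push URL with a dummy value so pushes fail.
git remote set-url --push origin DISABLED

# Crontab line for a daily fast-forward-only pull at 04:00.
# It is only printed here, not installed.
echo "0 4 * * * cd $work/machine && git pull -q --ff-only"
```

Using --ff-only means the cron job refuses to create merge commits on the machine, so local edits show up as a failed pull instead of being merged silently.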

Changing Machine Branches

Changing the code in a single machine branch is simple; just git checkout the appropriate branch in your development environment, make the changes, commit them, and push them to the central repo. All machines on that branch will automatically get the changes the next time the cron job runs.
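As a rough illustration of that workflow, the following sketch seeds a machine branch and then changes it. The central repository is a local bare repository here, and the file name, its contents, and the commit identity are made up for the example.

```shell
#!/bin/sh
# Sketch: edit a machine branch in a development checkout and push it.
set -eu

work=$(mktemp -d)
git init -q --bare "$work/central.git"      # stand-in for the central repo

git clone -q "$work/central.git" "$work/dev" 2>/dev/null
cd "$work/dev"
git config user.name "Example Admin"        # placeholder identity
git config user.email "admin@example.com"

# Seed the machine branch so there is something to change.
git symbolic-ref HEAD refs/heads/server/www
mkdir -p apache2
echo "MaxRequestWorkers 150" > apache2/mpm.conf
git add apache2
git commit -q -m "server/www: initial Apache tuning"
git push -q -u origin server/www

# The workflow itself: check out the branch, edit, commit, push.
git checkout -q server/www
echo "MaxRequestWorkers 300" > apache2/mpm.conf
git commit -q -am "server/www: raise MaxRequestWorkers"
git push -q origin server/www
```

Because the commit is made in a personal development checkout, it carries the identity of the person who made the change, which matches the "real author" requirement from the question.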

Changing Module Branches

Changing the code for a module branch is only slightly more tricky, as it involves three steps:

1. git checkout the appropriate module branch.
2. Make your changes and commit them to the centralized server.
3. git checkout each machine branch which uses that module branch, and then merge the module branch into it. git will figure out that you've merged that module branch before and only notice the changes that have happened since that last common parent.
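The steps above can be sketched in a throwaway repository as follows. The branch names match the earlier examples; the file names, contents, and commit identity are placeholders, and the list of machine branches is written out by hand.

```shell
#!/bin/sh
# Sketch: change a module branch, then merge it into the machine branches.
set -eu

work=$(mktemp -d)
git init -q "$work/etc_files"
cd "$work/etc_files"
git config user.name "Example Admin"        # placeholder identity
git config user.email "admin@example.com"

# Module branch with one shared file.
git symbolic-ref HEAD refs/heads/modules/apache
echo "ServerTokens OS" > security.conf
git add security.conf
git commit -q -m "modules/apache: initial settings"

# Two machine branches that start from the module and add their own file.
for host in www dev; do
    git checkout -q -b "server/$host" modules/apache
    echo "ServerName $host.domain.com" > site.conf
    git add site.conf
    git commit -q -m "server/$host: site config"
done

# Steps 1 and 2: change the module branch and commit.
git checkout -q modules/apache
echo "ServerTokens Prod" > security.conf
git commit -q -am "modules/apache: hide version details"

# Step 3: merge the module into every machine branch that uses it.
for branch in server/www server/dev; do
    git checkout -q "$branch"
    git merge -q --no-edit modules/apache
done
```

Each merge only brings in the module commits made since the previous merge, so the machine-specific site.conf files are left untouched.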

This method has both benefits and drawbacks. One benefit is that I can make a change to a module branch and apply it to the machine branches that need it, while letting the machine branches that don't need it stay with the older version until they're ready. The drawback, then, is that you have to remember to merge your module branch into each machine branch that might be using it. I use a script that traverses the commit tree and automatically does this merging for me, but it can still be a pain.
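The script mentioned above is not shown, so the following is only one possible way to automate the merging, not the author's script. It assumes that every module branch and machine branch starts from its own root commit, so that a machine branch shares history with a module branch only if it has merged that module before. All names and file contents are illustrative.

```shell
#!/bin/sh
# Sketch: merge a module into every server/* branch that already uses it.
set -eu

work=$(mktemp -d)
git init -q "$work/etc_files"
cd "$work/etc_files"
git config user.name "Example Admin"        # placeholder identity
git config user.email "admin@example.com"

# Module branch with its own root commit.
git symbolic-ref HEAD refs/heads/modules/apache
echo "ServerTokens OS" > security.conf
git add security.conf
git commit -q -m "modules/apache: initial settings"

# server/www has its own root and merges the module once.
git checkout -q --orphan server/www
git rm -rqf .
echo "ServerName www.domain.com" > site.conf
git add site.conf
git commit -q -m "server/www: site config"
git merge -q --no-edit --allow-unrelated-histories modules/apache

# server/dev has its own root and never merges the module.
git checkout -q --orphan server/dev
git rm -rqf .
echo "ServerName dev.domain.com" > site.conf
git add site.conf
git commit -q -m "server/dev: site config"

# A new change lands on the module branch.
git checkout -q modules/apache
echo "ServerTokens Prod" > security.conf
git commit -q -am "modules/apache: hide version details"

# Merge into each machine branch that shares history with the module.
module=modules/apache
for branch in $(git for-each-ref --format='%(refname:short)' refs/heads/server); do
    if git merge-base "$branch" "$module" >/dev/null 2>&1; then
        git checkout -q "$branch"
        git merge -q --no-edit "$module"
        echo "merged $module into $branch"
    else
        echo "skipped $branch (no shared history with $module)"
    fi
done
```

The check relies on git merge-base exiting with a non-zero status when two branches have no common ancestor. If all branches were created from a shared starting commit instead, this test would match every branch and a different rule would be needed.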

As an alternative, newer versions of git support something called "submodules":

Submodules allow foreign repositories to be embedded within a dedicated subdirectory of the source tree, always pointed at a particular commit.

This would allow you to build something a little bit like "svn:externals" trees, which you could then update in much the same way as you do now.
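A small sketch of the submodule approach, using two local repositories. The repository names "apache-common" and "www-config", the submodule path, and the file contents are hypothetical. The protocol.file.allow setting is only there because the sketch uses a local path as the submodule URL, which recent git versions refuse by default; with an SSH or HTTPS URL it would not be needed.

```shell
#!/bin/sh
# Sketch: embed a shared repository as a submodule and pin it to a commit.
set -eu

work=$(mktemp -d)

# Hypothetical shared repository holding common Apache settings.
git init -q "$work/apache-common"
cd "$work/apache-common"
git config user.name "Example Admin"        # placeholder identity
git config user.email "admin@example.com"
echo "ServerTokens OS" > security.conf
git add security.conf
git commit -q -m "Initial shared settings"

# Hypothetical per-machine repository that embeds the shared one.
git init -q "$work/www-config"
cd "$work/www-config"
git config user.name "Example Admin"
git config user.email "admin@example.com"
git -c protocol.file.allow=always submodule --quiet add "$work/apache-common" apache2/common
git commit -q -m "Embed apache-common as a submodule"

# A later change in the shared repository ...
cd "$work/apache-common"
echo "ServerTokens Prod" > security.conf
git commit -q -am "Hide version details"

# ... only reaches www-config once the pinned commit is moved and committed.
cd "$work/www-config"
git -C apache2/common pull -q --ff-only
git add apache2/common
git commit -q -m "Update apache-common to the latest commit"
git submodule status
```

As with the module branches, each machine repository decides when to pick up a shared change, because the submodule stays at the recorded commit until it is updated explicitly.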

Related Solutions

Git Daemon and Access Control for Multiple Repos

Check out gitosis which is a git repository hosting application. Quoting the description from the Debian package of gitosis:

This package aims to make hosting git repositories easier and safer.
It manages multiple repositories under one user account, using SSH
keys to identify users. End users do not need shell accounts on the
server; they will talk to one shared account that will not let them
run arbitrary commands.

You can find the gitosis source at http://eagain.net/gitweb/?p=gitosis.git

Documentation on how to set it up: http://scie.nti.st/2007/11/14/hosting-git-repositories-the-easy-and-secure-way

I'm very happy with gitosis; we're using it at the grml project (http://grml.org/) with more than 100 repositories, and it works without any problems.

Linux – Version-control of configuration files with different home names

Arguably, you're trying to use a revision control system to do configuration management. Those are two very different kinds of activities. I think the approach you're taking won't scale well at all, and will lead to significant amounts of frustration. You would be better served by deciding on a configuration management solution (there are quite a few mature options out there: cfengine, puppet, chef, etc.), and then having that solution pull its recipes and/or configuration files from the revision control system. It will give you the same net effect that you seem to be looking for, but will have the right tools working on the appropriate parts of the problem.