Git – Why does Git have tags

branching, git, tagging

I've read Git branching and tagging best practices and git tagging comments – best practices, but I don't see a direct answer to something I've wondered for a long time:

Why does Git have tags? (instead of just branches)

They seem to be second-class citizens, or at least "different." They aren't pushed unless you specify that explicitly. Deleting a tag on the remote doesn't delete it in downstream repos.

This last point was a problem recently, as someone pushed a bunch of garbage tags with tons of commits from another repo. We could delete them upstream and gc the commits, but that wouldn't propagate, and the next time someone pushed a tag with git push --tags, they'd re-push those garbage tags and commits. So we had to make sure everyone deleted them.

When and why would I use a tag instead of a branch?

Best Answer

This is done on purpose.

They seem to be second-class citizens, or at least "different." They aren't pushed unless you specify that explicitly. Deleting a tag on the remote doesn't delete it in downstream repos.

I don't agree with lxrec's answer about git having bad defaults. If you follow the mailing list, you can see that git developers actually care about having sensible defaults. Would it make sense to have --ff-only as a default? I don't think so.

Tags make it possible to have annotations for your own, local development copy. I would not like to see my why_does_it_break_here and todo_fix_formatting tags being pushed without my consent (those are not actual tag names). Tagging a release, on the other hand, is something that occurs less often, and it makes sense to require an explicit push (or use an alias).
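To make the "explicit push" point concrete, here is a minimal sketch. It assumes a throwaway bare repository standing in for the shared remote; the directory names, the identity, and the v1.0 tag are made up for illustration.

```shell
set -e
work=$(mktemp -d)                       # scratch area; all names here are illustrative
git init -q --bare "$work/remote.git"   # stands in for the shared upstream
git init -q "$work/local"
cd "$work/local"
git config user.name "Example"
git config user.email "example@example.com"
git remote add origin "$work/remote.git"

git commit -q --allow-empty -m "first commit"
git tag todo_fix_formatting             # private, lightweight note-to-self tag
git tag -a v1.0 -m "Release 1.0"        # annotated release tag

git push -q origin HEAD                 # pushes the current branch; tags stay local
tags_after_branch_push=$(git ls-remote --tags origin)

git push -q origin v1.0                 # publish exactly one tag, by name
tags_after_tag_push=$(git ls-remote --tags origin)
echo "$tags_after_tag_push"
```

Pushing the tag by name keeps the private tag local, whereas git push --tags would have published both.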

I don't see a major difference between tags and branches w.r.t. how push/fetch behaves. In your example, if the garbage tags had been branches, would the deletion propagate as you intended?
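One way to explore that question is the sketch below. It assumes two throwaway clones (called alice and bob here, both hypothetical) of a local bare repository, and a branch named garbage that gets deleted upstream.

```shell
set -e
work=$(mktemp -d)                         # scratch area; all names here are illustrative
git init -q --bare "$work/upstream.git"
git init -q "$work/alice"
cd "$work/alice"
git config user.name "Example"
git config user.email "example@example.com"
git remote add origin "$work/upstream.git"
git commit -q --allow-empty -m "first commit"
git push -q origin HEAD
git branch garbage
git push -q origin garbage                # the unwanted branch reaches upstream

git clone -q "$work/upstream.git" "$work/bob"   # bob now has origin/garbage too

git push -q origin --delete garbage       # alice deletes the branch upstream

git -C "$work/bob" fetch -q               # a plain fetch does not remove stale refs
before_prune=$(git -C "$work/bob" branch -r --list origin/garbage)

git -C "$work/bob" fetch -q --prune       # pruning has to be requested
after_prune=$(git -C "$work/bob" branch -r --list origin/garbage)

echo "before prune: ${before_prune:-<none>}"
echo "after prune:  ${after_prune:-<none>}"
```

In this sketch the deleted branch lingers in the second clone as origin/garbage until a fetch with --prune is run, so branch deletions don't propagate on their own either.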

When and why would I use a tag instead of a branch?

Generally speaking:

branches are for trees: they point to different commits over time
tags are for individual commits and are immutable (this includes frozen trees such as releases)
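The difference can be seen in a small sketch, assuming a throwaway repository and a hypothetical v1.0 tag. Note that Git can move a tag if forced (git tag -f), so the immutability is a convention rather than something enforced.

```shell
set -e
repo=$(mktemp -d)                        # throwaway repository for the demonstration
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

git commit -q --allow-empty -m "release commit"
git tag v1.0                             # mark this exact commit
branch=$(git symbolic-ref --short HEAD)  # whatever the default branch is called
tag_before=$(git rev-parse v1.0)

git commit -q --allow-empty -m "work after the release"
tag_after=$(git rev-parse v1.0)          # the tag still names the release commit
branch_tip=$(git rev-parse "$branch")    # the branch followed the new commit

echo "tag v1.0:       $tag_after"
echo "branch $branch: $branch_tip"
```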

How are branches implemented in git?

The git repository, i.e. the .git directory, mostly consists of directories and files containing metadata that git uses. Whenever you create a branch in git, e.g. with git branch {name_of_branch}, a few things happen:

A reference is created to the local branch at: .git/refs/heads/{name_of_branch}
A history log is created for the local branch at: .git/logs/refs/heads/{name_of_branch}

That's basically it: a couple of text files are created. If you open the reference as a text file, its contents will be the SHA-1 ID of the commit the branch is pointing at. Note that branching does not require you to make any commits, as commits are stored separately as objects. Both branches and commits are "first-class citizens" in git, and one way to think about the branch-to-commit relationship is as an aggregation rather than a composition. If you remove a branch, its commits will still exist as "dangling" commits. If you accidentally removed a branch, you can try to find the commit with git fsck --lost-found (older git versions also shipped a git lost-found command) and create a branch on the SHA-1 ID you find left hanging, as long as git hasn't garbage-collected it yet.
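As a rough illustration of the reference file and of recovering a deleted branch, consider the sketch below. It assumes a throwaway repository using git's default loose-file ref storage; the branch names feature and recovered are made up. The --no-reflogs flag is passed because the commit is otherwise still reachable from HEAD's reflog and would not be reported as dangling.

```shell
set -e
repo=$(mktemp -d)                        # throwaway repository for the demonstration
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "base"
main_branch=$(git symbolic-ref --short HEAD)

git branch feature                       # no new commit is needed for this
ref_content=$(cat .git/refs/heads/feature)
head_sha=$(git rev-parse HEAD)
echo "refs/heads/feature contains: $ref_content"

git checkout -q feature
git commit -q --allow-empty -m "only on feature"
lost_sha=$(git rev-parse HEAD)
git checkout -q "$main_branch"
git branch -q -D feature                 # the branch is gone, the commit object is not

# Ask fsck for commits that no branch or tag reaches any more.
found_sha=$(git fsck --no-reflogs --lost-found 2>/dev/null |
            awk '/dangling commit/ {print $3}')

git branch recovered "$found_sha"        # re-attach a name to the dangling commit
recovered_sha=$(git rev-parse recovered)
echo "recovered branch points at: $recovered_sha"
```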

So how does git keep track of which branch you're working on? The answer is with the .git/HEAD file, that looks sort of like this if you're on the master branch.

ref: refs/heads/master

Switching branches simply changes the reference in the .git/HEAD file, and then proceeds to change the contents of your workspace with the ones defined in the commit.
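A small sketch of this, assuming a throwaway repository and a hypothetical branch named topic:

```shell
set -e
repo=$(mktemp -d)                        # throwaway repository for the demonstration
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "base"
git branch topic

head_before=$(cat .git/HEAD)             # symbolic ref to the initial branch
git checkout -q topic
head_after=$(cat .git/HEAD)              # now a symbolic ref to topic

echo "before: $head_before"
echo "after:  $head_after"
```

The second line printed is "after:  ref: refs/heads/topic"; the first depends on what the default branch is called in your git configuration.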

How does this compare in other version control systems?

In Subversion, branches are virtual directories in the repository. So the easiest way to branch is to do it remotely, with a one-liner svn copy {trunk-url} {branch-url} -m "Branched it!". What SVN will do is the following:

Copy the source directory, e.g. trunk, to a target directory,
Commit the changes to finalize the copy action.

You will want to do this action remotely on the server, because making that copy locally is a linear-time operation, with files being copied and symlinked. This is a very slow operation, whereas doing it on the server is a constant-time operation. Note that even when creating the branch on the server, Subversion requires a commit when branching while git does not, which is a key difference. That is one kind of overhead that makes branching in SVN marginally less cheap than in Git.

The command for switching branches in SVN, i.e. svn switch, is really svn update in disguise. Thanks to the virtual directory concept, the command is a bit more flexible in SVN than in git: subdirectories in your workspace can be switched out to mirror another repository URL. The closest thing in git would be git submodule, but that is semantically quite different from branching. Unfortunately, this is also a design decision that makes switching a bit slower in SVN than in Git, as SVN has to check which remote URL every workspace directory is mirroring. In my experience, Git is quicker to switch branches than SVN.

SVN's branching comes with a cost, as it copies files and always needs to be made publicly available. In git, as explained above, branches are "just references" and can be kept in your local repository and published at your discretion. In my experience, however, SVN is still remarkably cheaper and more performant than e.g. ClearCase.

It's only a bummer that SVN is not decentralized. You can have multiple repositories mirroring some source repo, but syncing differing changes between multiple SVN repositories is not possible, as SVN does not have unique identifiers for commits (git has hashed identifiers that are based on the contents of the commit). The reason I personally started using git over SVN, though, is that initiating a repository is remarkably easier and cheaper in git. Conceptually, in terms of software configuration management, each divergent copy of a project (clone, fork, workspace or whatever) is a "branch", and given this terminology, creating a new copy in SVN is not as cheap as in Git, which has branches "built-in".

As another example, in Mercurial, branching started out a bit differently for a DVCS: creating and destroying named branches required separate commits. Later in development, the Mercurial developers implemented bookmarks to mimic git's branching model, though in Mercurial terminology heads are called tips and branches are called bookmarks.
