It depends on how cleanly all these elements are separated.
To clarify how I see things, I am trying to put this in general terms.
As you highlighted, you can deal with this in two ways:
A. Single Product Approach
The entire space is one project, and while the parts are fairly independent, the repo doesn't recognize that.
In this case, a few things can be observed:
The same release number is applied to every build artifact of any RC. So if the repo moves from 1.2 to 1.3 via 1.2.1 and 1.2.3, there is a corresponding revision of each element, i.e. project1.1.2.1.so as well as project2.1.2.1.so.
All projects and the application share the same tag and version number.
Between two versions of the single repo, the tag keeps advancing with overall progress, whether or not projectA or projectB has any changes.
In such a case, when you take a branch from any point in trunk, the entire trunk must be mirrored.
Assuming that testing has been good, all projects and the application with the same version/tag are always compatible with each other.
B. Independent Product Approach
Here each project is an independent product:
They must have separate repos (or at least separate trunks for themselves).
Tagging is limited to the individual projects. Each part of the repo produces its own end product, like projectA.1.2.1.so or projectB.1.5.lib.
In such cases, when you branch, the branch is only for the given project, and any tags before/after the merge apply only to that project.
Each project can have its own number of tags, depending on the actual number of bugs within it.
In this case each project advances at a different pace, so you always need to find which sets of tags work compatibly with the other projects. For example, projectA-1.2 works with projectB-1.2 all the way up to projectB-2.3 but breaks with projectB-2.5 or higher. Such nested dependencies are what you now need to manage through explicit integration testing.
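One way to keep track of such tested combinations is a small compatibility table that the release process can query. The sketch below is only an illustration of the projectA-1.2 example above: the table name COMPATIBLE_B_RANGE and the helper functions are hypothetical, and versions are assumed to be plain dotted numbers.

```python
# Hypothetical table: for each projectA version, the inclusive range of
# projectB versions that passed integration testing.
COMPATIBLE_B_RANGE = {
    (1, 2): ((1, 2), (2, 3)),  # projectA-1.2 works with projectB-1.2 .. 2.3
}


def parse_version(text):
    """Turn a dotted version string such as '2.3' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(a_version, b_version):
    """Return True only if this pair falls inside a tested range."""
    a = parse_version(a_version)
    b = parse_version(b_version)
    if a not in COMPATIBLE_B_RANGE:
        return False  # never integration-tested, so treat as incompatible
    lowest, highest = COMPATIBLE_B_RANGE[a]
    return lowest <= b <= highest
```

Every new tag of either project adds rows to such a table, which is exactly the bookkeeping the single product approach avoids.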
So what are the pros and cons?
In simple terms, the single product approach is clumsy because all elements are copied when a branch is created. If someone changes something in another person's module before that person commits, it becomes a conflict, and one that surfaces only much later, when the other person decides to commit or merge. This problem grows with the number of people on the team.
One of the primary reasons the single product approach helps is its simplicity. Ideally, if you have a single authority for release management, every full release is tested to see not only that the pieces work individually but also that all versions work together. Integration testing is always more painful when independent products grow: you need to keep track of which versions of each product work together, and that is a job by itself.
Support cost increases in the multiple product situation. Suppose projectB-1.2 and projectB-1.7 depend on different compatible versions of projectA, namely projectA-0.8 and projectA-0.9. If projectA has a bug that affects all its clients (irrespective of compatibility), it has to be fixed twice, in both projectA-0.8 and projectA-0.9. This doubles the bug support effort for projectA.
Further, suppose some behaviour in projectA-0.8 is found to be a bug (say, already solved in projectA-0.9), but you realize that fixing it in 0.8 will help some of its clients and break others. In this case projectA-0.8 needs to be branched again so that both groups are supported.
Last but most important: when people work in independent systems, the communication between them reduces. They base their work on their individual tests. Later, when the application folks need to put all the modules together, they have to put in effort to bring the pieces together and discover a lot of integration issues. Further, if responsibility for bugs is not assigned based on root cause, the system will tend to accumulate tech debt over time.
In a nutshell, the more you break things up into projects/modules, the more nested dependency issues arise.
However, this complexity is worth it if the repo is too large. In a single large repo, people may end up stuck dealing with conflicts or merges every day.
One practical solution
In our place this has been a long-running debate among all of us.
A simple approach we try to follow is this. Initially we all start with a single product approach. As APIs become more stable and certain key objects become fairly stable, we slowly move them to another repo so that they become a black box for everyone. This approach balances the initial need to remain in close communication and integration with the longer-term need to keep the repo manageable.
Sorry, did I write too much?
NOTE:
From your question, I felt that your current practice and the proposed one are both slightly different from what I described; however, I have used the above scenarios since they are well-outlined methods and so easier to follow.
The claim that "branching is free in git" is a simplification, because it isn't "free" per se. Looking under the hood, a more correct claim would be that branching is redonkulously cheap, because branches are basically references to commits. I define "cheapness" here as: the less overhead, the cheaper.
Let's dig into why Git is so "cheap" by examining what kinds of overhead it has:
How are branches implemented in git?
The git repository, .git, mostly consists of directories with files that contain metadata that git uses. Whenever you create a branch in git, with e.g. git branch {name_of_branch}, a few things happen:
- A reference is created to the local branch at:
.git/refs/heads/{name_of_branch}
- A history log is created for the local branch at:
.git/logs/refs/heads/{name_of_branch}
That's basically it: a couple of text files are created. If you open the reference as a text file, its contents will be the SHA id of the commit the branch is pointing at. Note that branching does not require you to make any commits, as they're another kind of object. Both branches and commits are "first-class citizens" in git, and one way to think about the branch-to-commit relationship is as an aggregation rather than a composition. If you remove a branch, the commits will still exist as "dangling". If you accidentally removed a branch, you can always try to find the commit with git fsck --lost-found (or git lost-found in older versions of git) and create a branch on the SHA id you find left hanging, as long as git hasn't done any garbage collection yet.
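To make the "couple of text files" point concrete, here is a toy model in Python. It does not invoke git and is not how git is implemented internally; it only mimics the loose reference file described above. The function names and the commit id in demo() are made up for illustration, and real repositories may also keep references in a packed-refs file.

```python
import os
import tempfile


def create_branch(git_dir, name, commit_sha):
    """Toy model: a branch is a small text file holding one commit id."""
    ref_path = os.path.join(git_dir, "refs", "heads", name)
    os.makedirs(os.path.dirname(ref_path), exist_ok=True)
    # newline="\n" keeps the file byte-identical across platforms
    with open(ref_path, "w", newline="\n") as ref_file:
        ref_file.write(commit_sha + "\n")
    return ref_path


def read_branch(git_dir, name):
    """Return the commit id a branch file points at."""
    with open(os.path.join(git_dir, "refs", "heads", name)) as ref_file:
        return ref_file.read().strip()


def demo():
    """Create one branch in a throwaway directory and report its cost."""
    with tempfile.TemporaryDirectory() as git_dir:
        sha = "0123456789abcdef0123456789abcdef01234567"  # made-up commit id
        path = create_branch(git_dir, "feature", sha)
        return os.path.getsize(path), read_branch(git_dir, "feature")
```

In this model the whole cost of a branch is a 41-byte file (40 hex characters plus a newline), regardless of how large the project is.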
So how does git keep track of which branch you're working on? The answer is the .git/HEAD file, which looks something like this if you're on the master branch:
ref: refs/heads/master
Switching branches simply changes the reference in the .git/HEAD file, and then replaces the contents of your workspace with the ones defined in the commit.
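The HEAD lookup can be sketched the same way. This is a simplified illustration that only follows loose reference files and does not touch the workspace; resolve_head and switch_branch are hypothetical helper names, not git commands or APIs.

```python
import os


def resolve_head(git_dir):
    """Follow HEAD to (branch_name, commit_sha); toy version, loose refs only."""
    with open(os.path.join(git_dir, "HEAD")) as head_file:
        content = head_file.read().strip()
    if not content.startswith("ref: "):
        # Detached HEAD: the file holds a commit id directly.
        return None, content
    ref = content[len("ref: "):]
    with open(os.path.join(git_dir, *ref.split("/"))) as ref_file:
        sha = ref_file.read().strip()
    prefix = "refs/heads/"
    branch = ref[len(prefix):] if ref.startswith(prefix) else ref
    return branch, sha


def switch_branch(git_dir, name):
    """Point HEAD at another branch (the workspace update is left out)."""
    with open(os.path.join(git_dir, "HEAD"), "w", newline="\n") as head_file:
        head_file.write("ref: refs/heads/" + name + "\n")
```

The sketch leaves out the expensive half of a real switch, which is rewriting the files in the workspace; the bookkeeping itself is a one-line file.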
How does this compare in other version control systems?
In Subversion, branches are virtual directories in the repository. So the easiest way to branch is to do it remotely, with a one-liner: svn copy {trunk-url} {branch-url} -m "Branched it!". What SVN will do is the following:
- Copy the source directory, e.g. trunk, to a target directory,
- Commit the changes to finalize the copy action.
You will want to do this remotely on the server, because making that copy locally is a linear-time operation, with files being copied and symlinked. This is a very slow operation, whereas doing it on the server is a constant-time operation. Note that even when performing the branch on the server, subversion requires a commit when branching while git does not, which is a key difference. That is one kind of overhead that makes SVN marginally less cheap than Git.
The command for switching branches in SVN, i.e. svn switch, is really svn update in disguise. Thanks to the virtual directory concept, the command is a bit more flexible in svn than in git: subdirectories in your workspace can be switched out to mirror another repository URL. The closest thing in git would be git submodule, but using that is semantically quite different from branching. Unfortunately this is also a design decision that makes switching a bit slower in SVN than in Git, as it has to check for every workspace directory which remote URL it is mirroring. In my experience, Git is quicker to switch branches than SVN.
SVN's branching comes with a cost, as it copies files and always needs to be made publicly available. In git, as explained above, branches are "just references" and can be kept in your local repository and published at your discretion. In my experience, however, SVN is still remarkably cheaper and more performant than e.g. ClearCase.
It's only a bummer that SVN is not decentralized. You can have multiple repositories mirroring some source repo, but syncing diverging changes between multiple SVN repositories is not possible, as SVN does not have unique identifiers for commits (git has hashed identifiers that are based on the contents of the commit). The reason I personally started using git over SVN, though, is that initiating a repository is remarkably easier and cheaper in git. Conceptually, in terms of software configuration management, each divergent copy of a project (clone, fork, workspace or whatever) is a "branch", and given this terminology creating a new copy in SVN is not as cheap as in Git, where the latter has branches "built-in".
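As an illustration of what "identifiers based on the contents" means, the sketch below computes the id git assigns to a file's contents (a blob), assuming the default SHA-1 object format. Commits are hashed by the same scheme, with a "commit" header instead of "blob". The function name git_blob_id is made up for this example.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Compute the SHA-1 object id git would assign to this blob content."""
    # Git hashes a small header ("<type> <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()
```

Two repositories that have never talked to each other still derive the same id for the same content, which is what makes syncing between clones possible. For empty input the function returns e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, git's well-known empty blob id.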
As another example, in Mercurial, branching started out a bit different as a DVCS, and creating/destroying named branches required separate commits. Mercurial developers later implemented bookmarks to mimic git's branching model, though in mercurial terminology heads are called tips and branches are bookmarks instead.
This is done on purpose.
I don't agree with lxrec's answer about git having bad defaults. If you follow the mailing list, you can see that git developers actually care about having sensible defaults. Would it make sense to have --ff-only as a default? I don't think so.
Tags make it possible to have annotations for your own, local development copy. I would not like to see my why_does_it_break_here and todo_fix_formatting tags being pushed without my consent (those are not actual tag names). Tagging a release, on the other hand, is something that occurs less often, and it makes sense to require an explicit push (or use an alias).
I don't see a major difference between tags and branches w.r.t. how push/fetch behaves. In your example, if the garbage tags had been branches, would the deletion propagate as you intended?
Generally speaking: