ClearCase advantages/disadvantages

clearcaseversion control

Because I'm currently struggling to learn IBM Rational ClearCase, I'd like to hear your professional opinion.

I'm particularly interested in advantages/disadvantages compared to other version-control-systems like Subversion or Git.

Best Answer

You can find a good comparison between ClearCase and Git in my SO answer:
"What are the basic ClearCase concepts every developer should know?", illustrating some major differences (and some shortcomings of ClearCase)

File-centric operations

The most single important shortcoming of ClearCase is its old "file-centric" approach (as opposed to "repository-centric" like in SVN or Git or Perforce...)
That means each checkout or check-in is done file per file. The atomicity of operation is at file levels.
Combine that with a very verbose protocol and a network with potentially several nodes between the developer workstation and the VOB server, and you can end up with a fairly slow and inefficient file server (which ClearCase is at its core).

File-per-file operations means: slow recursive operations (like recursive checkout or recursive "add to source control", even by clearfsimport).
A fast LAN is mandatory to mitigate the side-effects of that chatty protocol.

Centralized VCS

The other aspect to take into account is its centralized aspect (even though it can be "distributed" with its multi-site replicated VOB feature)
If the network does not allow access to the VOBs, the developers can:

  • still work within snapshot views (but with hijacked files only)
  • wait for the restoration of the network if they are using dynamic views

Expensive Distributed VCS option

You can have some distributed VCS feature by replicating a Vob.

  • you need a special kind of license to access it.
  • that license is expensive and add to the cost of the regular license
  • any vob that uses the replicated vob (admin vob, admin pvob, ...) must be replicated as well (meaning some projects not directly concerned with a distributed development will still have to pay multi-site license...)

Old and not user-friendly GUI

  • the GUI is very old school and impractical (mid-90's MFC look, completely synchronous GUI, meaning you have to wait for a refresh before clicking elsewhere): when browsing baselines, you cannot quickly look for one in particular.
  • the GUI on Unix is not exactly the same than on Windows (the latest 7.1 version is better but not there yet)
  • the installation process is quite complicated (although the latest Installer Manager introduced by CC7.1 is now a coherent GUI on Windows or Unix and does simplify the procedure)
  • the only real rich application has only been developed for CCRC (the Remote Client)

UCM inconsistencies and in coherencies

As mentioned in "How to Leverage ClearCase’s features", dynamic views are great (a way to see data through the network without having to copy them to the disk), but the main feature remain UCM: it can be a real asset if you have big project with complex workflow.

Some shortcomings on that front:

Limited policies with Base ClearCase

Using ClearCase without using UCM means having to define a policy to:

  • create branch (otherwise anyone can create any branch, and you end up with a gazillon of them, with merge workflow nightmare)
  • put labels (otherwise you forget to label some files, or you put a label where you were not supposed to, or you "move" (gasp) a label from one version to another: at least UCM baselines cannot be moved)
  • define changeset. ChangeSets only exist with UCM activities. With Base ClearCase, you are reduced to clever "cleartool find" requests...

No application rights

ClearCase right management is entirely built on system rights.
That means you need to register your user to the correct system group, which is not always easy to do when you have to enter a ticket to your IT service in order for them to make the proper registration.
Add to that an heterogeneous environment (users on Windows, and server on Unix), and you need to register your user on Unix as well as Windows! (with the same login/group name). Unless you put some sort of LDAP correspondence between the two world (like Centrify)

No advanced API

  • only CLI is complete ("cleartool" is the ClearCase Command Line Interface), meaning that any script (in Perl or other language) consists in parsing the output of those cleartool commands)
  • ClearCase Automation Library (CAL) exists, but is quite limited
  • Java API exists, but only for web views for the CCRC client.

View Storages not easily centralized/backed up

The View storages are the equivalent of the ".svn" of SubVersion, exept there is only one "view storage" per view instead of many .svn in all the directories of a SubVersion workspace. That is good.

What is bad is that each operations within a view (a simple "ls", checkout, checking, ...) will trigger a network request to the view_server process that manages your view server.
2 options:

  • declare your view storage on your workstation: great for scalability, you can have as many view as you want without taxing the LAN: all communications are directly done on your workstation. BUT if that machine dies on you, you loose your views.
  • declare your view storage on a centralized server: that means all view_server process will be created there and that all operations on a view by any user will have to communicate with that server. It can be done if the infrastructure is "right" (special high-speed LAN, dedicated server, constant monitoring), but in practice, your LAN will not support this mode.

The first mode means: you have to backup yourself your work in progress (private files or checked-out files)
The second mode means: your workstation can be unavailable, you can just log on another a get back your views (execpt for the private files of a snapshot view)

Side discussion about dynamic views:

To add to the "dynamic view" aspect, it has one advantage (it's dynamic) and one shortcoming (it's dynamic).
Dynamic views are great for setting a simple environment to quickly share a small development between a small team: for a small development effort, a dynamic view can help 2 or 3 developers to constantly stay in touch one with another, seeing instantly when one's commit breaks something in the other views.
For more complex development effort, the artificial "isolation" provided by snapshot view is preferable (you see changes only when you refresh - or "update" - your snapshot view)
For real divergent development effort or course, a branch is still required to achieve true code isolation (merges will be required at some point, which ClearCase handles very well, albeit slowly, file-by-file)

The point is, you can use both, for the right reasons.

Note: by small team I do not mean "small project". ClearCase is best used for large project, but if you want to use dynamic views, you need to setup up "task branches in order to isolate a small development effort per branch: that way a "small team" (a subset of your large team) can work efficiently, sharing quickly its work between its members.
If you use dynamic views on a "main" branch where everyone is doing anything, then any check-in would "kill you" as it could introduced some "build breaks" unrelated with your current development effort.
That would then be a poor usage of dynamic views, and that would forget its other usages:

  • additional way of accessing data, in addition of snapshot views, meaning it is a great tool to just "see" the files (you can for example use a dynamic view to tweak its config spec until you see what you want and then copy those select rules into your usual snapshot view)
  • a side view to make merges: you work with your snapshot view, but for merges you can use your dynamic "sister-view" ("sister" as in "same config spec"), in order to avoid having a failed merge because of checked-out files (on which you would be currently working on your snapshot view), or because of a snapshot view not completely up-to-date. Once the merge is complete, you update your regular snapshot view and resume your work.

Developing directly in a dynamic view is not always the best option since all (non-checked-out) files are read over the network.
That means the dll or jar or exe needed by your IDE would be accessed over the network, which can slow down considerably the compilation process.
Possible solutions:

  • one snapshot view with all in it
  • a snapshot view with dll or jar or exe in it (files which do not changes every five minutes: one update per day), and dynamic view with only the sources visible.