Microsoft Research has done some work in this area. Check out this page: http://research.microsoft.com/en-us/people/nachin/. Though not specifically based on Halstead, Nachi and his team have investigated using Halstead, cyclomatic complexity, code churn, and other measures to assess the relative risk and fragility of making changes in areas of code. There's also an interesting paper about how organizational effectiveness plays a big role, but that's off topic. :)
We used to do manual code reviews (i.e. no special tools), and this was the best method we found. Our development department doesn't have Team Foundation Server, so I can't comment on that.
- All work is tied to a bug or an agile story that has a unique ID. When the work is checked in (or shelved, which is submission without an actual check-in), the description always includes this identifier.
- In a stand-up or over e-mail, the reviewer is notified that bug ### or story ### is ready for review.
- The reviewer does a source control diff between the new versions of the files and the previous versions and copies the diff output into a Word document. We have a code review Word document template preset to have two levels of headings: level 1 is the module, level 2 is the file name. It also defines a formatting style for actual code (8pt Consolas). This step was obviously the biggest pain, but with the Word template and a Ctrl-A, Ctrl-C, Ctrl-V sequence, it's not that many clicks. (Collecting the diffs can also be scripted; see the sketch after this list.)
- Now that the document is in Word, the reviewer can highlight the code in question and add comments using Word's review system.
- All documents are stored in a document repository and labeled with the corresponding bug or story number.
- Once the document is done, a link to the repository is shared with the original developer. Then we either have a code review meeting, if the changes were significant enough to need discussion, or just leave it to the original developer to go through the comments on their own time.
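We did the diff collection by hand, but if your source control happens to be git, a few lines of Python could do the copy/paste legwork. This is just a rough sketch under that assumption; the bug-ID-in-commit-message convention mirrors the check-in descriptions above, and the repository path and ID are hypothetical:

```python
import subprocess

def collect_diffs(repo_path, work_item_id):
    """Gather diffs for all commits whose message mentions a bug/story ID.

    Assumes a git repository; treat this as illustrative only, since our
    actual setup used a different source control system.
    """
    # Find commits that reference the work item in their message.
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%H", f"--grep={work_item_id}"],
        capture_output=True, text=True, check=True,
    )
    chunks = []
    for commit in log.stdout.split():
        # Diff each matching commit against its parent.
        diff = subprocess.run(
            ["git", "-C", repo_path, "show", commit],
            capture_output=True, text=True, check=True,
        )
        chunks.append(diff.stdout)
    return "\n".join(chunks)

if __name__ == "__main__":
    # Hypothetical usage: dump everything for bug 1234 into one stream
    # that can be pasted into the review template.
    print(collect_diffs(".", 1234))
```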
After trying numerous manual methods, this was the one we found to have the least overhead while still allowing us to actually review every change.
However, our engineering team just rolled out Review Board, and although I haven't used it much yet, so far I'm loving what I've seen. It has all the flexibility of our Word docs, but no more copying/pasting and manually fixing formatting. As an added bonus, it keeps a permanent archive of all comments, so you can go back years if you ever need to. It also lets you diff two diffs, which is great when you want to review a code review. We found that part very difficult with manual procedures, because you can't see what was changed in response to the first review; instead you end up redoing the entire thing.
Although you did say you don't want to use tools, I'd strongly urge you to consider Review Board. It is open source and completely free, so you can roll it out for yourself and the five people you may be working with. The rest of your company doesn't have to use the tool if they don't want to, and you don't need to worry about getting any purchase approvals.
== Update in response to comments on the question ==
On my team, we have people in NY, CT, TX, Poland, and India. What makes things even more interesting is that an extremely high percentage of the team doesn't know the product or technology all that well, so very few of us do most of the reviewing. So yes, senior devs are definitely busy. In the process I outlined, the primary reviewer does the initial walkthrough of the code independently, on his own schedule. Afterwards, we schedule a meeting in which the primary reviewer walks the coder through his comments. The meeting can include other reviewers, but they are considered secondary and are not obligated (though not discouraged) to review every file or make comments.
I agree with others' comments that having the final meeting in real time, even if you have to use web conferencing, is much better for knowledge transfer and for helping your new guys understand the code so that they'll start producing things that make your senior devs even busier. But again, that depends on the volume and type of comments; sometimes (very infrequently) the comments are so minor that the meeting is skipped.
I can also relate to it being hard to roll out new tools. I work for a very large corporation, and typically there are so many people involved that, even setting aside the fact that no purchases are ever approved, there are so many interests and agendas that nothing is ever agreed on. What's nice about Review Board is that you can skip all that and just start using it with your small team, and if you have to (if it really comes to that), you can host the web services on your own dev machine.
== Best Answer ==
There are a number of metrics that can be gathered from code reviews, some even extending throughout the lifecycle of the project.
The first metric that I would recommend gathering is defect removal effectiveness (DRE). For every defect, you identify the phase in which it was introduced and the phase in which it was removed. The various defect detection techniques that you use are all assessed simultaneously, so this applies equally to requirements reviews, design reviews, code reviews, unit tests, and so on. You would be particularly interested in the number of defects caught in the code phase, since this would probably encompass your unit tests as well as your code reviews. If many defects from the code phase are making it through to the integration test phase or even the field, you know that your post-coding practices should be evaluated.
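To make that concrete, here is a minimal sketch of computing a per-phase DRE from a defect log. The phase names and the (introduced, removed) record layout are my own assumptions for illustration, not a standard:

```python
# Ordered lifecycle phases; the names are illustrative assumptions.
PHASES = ["requirements", "design", "code", "integration", "field"]

def phase_dre(defects, phase):
    """DRE for one phase: defects removed in that phase divided by all
    defects present while it ran (introduced in or before the phase,
    and not removed in an earlier one).

    `defects` is a list of (introduced_phase, removed_phase) tuples.
    """
    idx = PHASES.index(phase)
    present = [
        d for d in defects
        if PHASES.index(d[0]) <= idx and PHASES.index(d[1]) >= idx
    ]
    removed_here = [d for d in present if d[1] == phase]
    return len(removed_here) / len(present) if present else None

defects = [
    ("requirements", "design"),
    ("design", "code"),
    ("code", "code"),         # caught by review/unit test in the code phase
    ("code", "integration"),  # escaped the code phase
    ("code", "field"),        # escaped all the way to the field
]

for p in PHASES:
    print(p, phase_dre(defects, p))
```

A low number for the code phase, combined with escapes showing up in integration or the field, is exactly the signal described above.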
Various meeting metrics would also be relevant. These include time to prepare, time in the meeting, lines of code read, defects found in the review, and so on. Some observations can be made from this data. For example, suppose your reviewers are spending a large amount of time reading the code in preparation for the review but finding very few problems. Coupled with the DRE data, if defects are then being found in integration testing or the field, you can conclude that your team needs to focus on its review techniques in order to find problems. Another interesting comparison is the lines of code (or some other size measurement) read in a meeting versus the length of the meeting. Research has found that the speed of a typical code review is 150 lines of code per hour.
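As a small sketch of how you might use those meeting metrics, the following flags reviews that read code much faster than the roughly 150 lines per hour mentioned above; the review records and the 2x threshold are assumptions for illustration:

```python
def review_rate(loc_read, meeting_minutes):
    """Lines of code reviewed per hour of meeting time."""
    return loc_read / (meeting_minutes / 60.0)

# Hypothetical review records: (id, LOC read, meeting minutes, defects found)
reviews = [
    ("bug-101", 300, 60, 4),
    ("bug-102", 900, 45, 1),  # suspiciously fast, and few defects found
]

for rid, loc, minutes, defects in reviews:
    rate = review_rate(loc, minutes)
    flag = " <- check review technique" if rate > 2 * 150 else ""
    print(f"{rid}: {rate:.0f} LOC/hour, {defects} defects{flag}")
```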
With any of these metrics, it's important to understand their impact on the process. Root cause analysis, using techniques such as why-because analysis, Five Whys, or Ishikawa (fishbone) diagrams, can be used to identify the reasons why code reviews (or any other quality improvement technique) are or aren't effective.
You might also be interested in this article about inspections from The Ganssle Group and an article by Capers Jones in Crosstalk about Defect Potentials and DRE.