Any work into the application of the Halstead complexity measures to determine software quality

measurement metrics quality

In 1977, Maurice Howard Halstead introduced his complexity measures for software systems, which included measurements of the program vocabulary, program length, volume, difficulty, effort, and an estimated number of bugs in a module. According to Wikipedia, difficulty relates to the difficulty of understanding the program when reading or writing it and effort can be translated into the time it takes to code an application where Time = (Effort / 18) seconds.
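For reference, the measures listed above can be sketched in a few lines of Python. This is a minimal sketch using the commonly cited definitions, which the question itself does not spell out: `n1` and `n2` are the counts of distinct operators and operands, and `N1` and `N2` are the total occurrences of each. The names `HalsteadMeasures` and `halstead` are hypothetical, and how a tool classifies tokens as operators or operands is left out entirely.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class HalsteadMeasures:
    vocabulary: int
    length: int
    volume: float
    difficulty: float
    effort: float
    time_seconds: float


def halstead(n1: int, n2: int, N1: int, N2: int) -> HalsteadMeasures:
    """Compute the basic Halstead measures from token counts.

    n1, n2: distinct operators / distinct operands
    N1, N2: total operators / total operands
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("need at least one distinct operator and operand")

    vocabulary = n1 + n2                      # n = n1 + n2
    length = N1 + N2                          # N = N1 + N2
    volume = length * math.log2(vocabulary)   # V = N * log2(n)
    difficulty = (n1 / 2) * (N2 / n2)         # D = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume              # E = D * V
    time_seconds = effort / 18                # T = E / 18 seconds

    return HalsteadMeasures(
        vocabulary, length, volume, difficulty, effort, time_seconds
    )


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(halstead(n1=4, n2=4, N1=10, N2=6))
```

Note that every value downstream of the four counts is derived arithmetic, so any validity question about Difficulty, Effort, or Time is really a question about whether those counts carry signal.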

A measurement is useless unless its data and calculations relate to some aspect of software development. However, I haven't found any work showing that a difficulty at or above a certain value is associated with a statistically significant increase in defects, or that relates difficulty to the time needed to read code (a difficulty of N yields an average of M hours spent understanding the code base). Nor have I found any analysis of whether computing Time after the fact is useful in determining quality, especially since the time to write the code should already have been recorded as a measurement. I'm especially interested in Halstead's bug estimation (which is not mentioned on Wikipedia): the number of bugs in an application can be estimated by Volume/3000 or Effort^(2/3)/3000.
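The two bug estimates mentioned above are simple enough to write down directly. The sketch below assumes Volume and Effort have already been computed by some other tool; the function names are hypothetical, and the sample inputs are invented only to show that the two formulas need not agree for the same module.

```python
def bugs_from_volume(volume: float) -> float:
    """Estimated bugs as Volume / 3000."""
    return volume / 3000


def bugs_from_effort(effort: float) -> float:
    """Estimated bugs as Effort^(2/3) / 3000."""
    return effort ** (2 / 3) / 3000


if __name__ == "__main__":
    # Invented values for a single module, for illustration only.
    volume, effort = 6000.0, 27000.0
    print("from volume:", bugs_from_volume(volume))
    print("from effort:", bugs_from_effort(effort))
```

Because Effort is Difficulty times Volume, the two estimates diverge as Difficulty changes, which is one reason an empirical comparison against recorded defect counts would be needed before relying on either.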

I'm looking for three things:

  • Has anyone used Halstead's software complexity measures in a real-world application to assess software quality? If so, how did you apply them and did they turn out to be a useful, valid, and/or reliable measurement?
  • Is there any academic research in the form of surveys, analyses, or case studies that discuss the validity (or invalidity) of Halstead complexity measures when applied to software quality?
  • Is there any academic research in the form of surveys, analyses, or case studies that demonstrate the use of Source Lines of Code (SLOC) to compute something similar to the Halstead metrics of Volume, Difficulty, Effort, Time, and Bugs? I would suspect that Volume might just correspond to a SLOC count and Difficulty might correspond to cyclomatic complexity (and possibly other measures). I'm also well aware that measuring effort, productivity, or time in SLOC is potentially misleading.

Best Answer

Microsoft Research has done some work in this area. Check out this page: http://research.microsoft.com/en-us/people/nachin/. Though their work is not based specifically on Halstead, Nachi and his team have investigated using Halstead, cyclomatic complexity, code churn, and other measures to assess relative risk and fragility when making changes in areas of code. There's also an interesting paper about how organizational effectiveness plays a big role, but that's off topic. :)
