From http://en.wikipedia.org/wiki/Cyclomatic_complexity
Les Hatton claimed recently (Keynote at TAIC-PART 2008, Windsor, UK, Sept 2008) that McCabe Cyclomatic Complexity has the same prediction ability as lines of code.[11]
The ratio has about the same prediction ability as either used separately.
Regarding the formula: nodes represent states, edges represent state changes. In every program, statements bring changes in the program state. Each consecutive statement is represented by an edge, and the state of the program after (or before...) the execution of the statement is the node.
If you have a branching statement (an if, for example), then two edges come out of that node, each leading to a different node, because the state can change in two ways.
Another way to calculate the Cyclomatic Complexity Number (CCN) is to count the "regions" in the execution graph (where an "independent region" is a closed loop that doesn't contain other loops). In this case the CCN is the number of independent regions plus 1 (exactly the same number the previous formula gives you).
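To make the two counting methods concrete, here is a minimal sketch. It assumes "the formula" is the usual M = E - N + 2P from the Wikipedia article (E edges, N nodes, P connected components), and it assumes the graph is connected and planar so that Euler's formula can be used for the region count. The graphs and node names are made up for illustration.

```python
def cyclomatic_complexity(edges, components=1):
    """M = E - N + 2P for a control-flow graph given as (src, dst) pairs."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


def enclosed_regions(edges):
    """Enclosed regions of a connected planar graph.

    Euler's formula N - E + F = 2 counts the outer region as a face,
    so the enclosed regions number F - 1 = E - N + 1.
    """
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 1


# Hypothetical graph for: if cond: a() else: b()
IF_ELSE = [
    ("entry", "cond"),
    ("cond", "then"),
    ("cond", "else"),
    ("then", "exit"),
    ("else", "exit"),
]

# Hypothetical graph for: while test: body()
WHILE_LOOP = [
    ("entry", "test"),
    ("test", "body"),
    ("body", "test"),
    ("test", "exit"),
]

# Straight-line code: no branching at all.
STRAIGHT = [("entry", "s1"), ("s1", "exit")]

for name, graph in [("if/else", IF_ELSE), ("while", WHILE_LOOP), ("straight", STRAIGHT)]:
    print(name, cyclomatic_complexity(graph), enclosed_regions(graph) + 1)
```

For each of these graphs the two numbers printed on a line agree, which is the "regions plus 1" shortcut in action.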
The CCN is used for branch coverage, or for path coverage in the sense of linearly independent paths. The CCN equals the number of linearly independent paths theoretically possible through a single-threaded application (that may include branches like "if x < 2 and x > 5 then", but that should be caught by a good compiler as unreachable code). You need at least that number of test cases to cover each of those paths (you may need more, since some test cases might repeat paths covered by previous ones, but not fewer, assuming each case covers a single path). If you cannot cover a path with any possible test case, you have found unreachable code (although you'll need to actually prove to yourself why it is unreachable; probably some nested x < 2 and x > 5 is lurking somewhere).
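As a small illustration of the test-case count, here is a sketch with a made-up function, classify, that has two decisions and therefore a CCN of 3, plus one test input per path. The helper find_witness is also hypothetical: it simply searches a range of candidate inputs for one that reaches a guarded branch. Finding nothing is only a hint that the branch is unreachable, not a proof.

```python
def classify(x):
    """Two decisions (if, elif) + 1 gives a CCN of 3: one path per branch."""
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    else:
        return "positive"


# One test case per path; fewer than three cannot cover all of them.
TEST_CASES = {-5: "negative", 0: "zero", 7: "positive"}


def run_cases():
    """Run classify on every test input and collect the results."""
    return {x: classify(x) for x in TEST_CASES}


def unreachable_guard(x):
    """The contradictory condition from the text: never true for any number."""
    return x < 2 and x > 5


def find_witness(predicate, candidates):
    """Return the first candidate satisfying predicate, or None if none does."""
    return next((x for x in candidates if predicate(x)), None)


print(run_cases() == TEST_CASES)
print(find_witness(unreachable_guard, range(-1000, 1000)))
```

The second print shows None because no integer in the sampled range satisfies the guard; the actual argument for unreachability is that x cannot be both below 2 and above 5.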
As to regular expressions: of course they affect it, like any other piece of code. However, the CCN of the regex construct is probably too high to cover in a single unit test, and you can assume that the regex engine has been tested, and ignore the expressions' branching potential for your testing needs (unless you're testing your regex engine, of course).
Best Answer
Cyclomatic complexity is not a measure of lines of code, but of the number of linearly independent paths through a module. Your cyclomatic complexity of 17,754 means that your application has 17,754 linearly independent paths through it. This has a few implications, typically in terms of how difficult it is to understand and test your application. For example, the cyclomatic complexity is an upper bound on the number of test cases needed to achieve 100% branch coverage, assuming well-written tests.
A good starting point might be the Wikipedia article on cyclomatic complexity. It has a couple of snippets of pseudocode and some graphs that show what cyclomatic complexity is all about. If you want to know more, you could also read McCabe's paper where he defined cyclomatic complexity.
Not at all. An application with few lines of code and a high number of conditionals nested within loops could have an extremely high cyclomatic complexity. On the other hand, an application with few conditions might have a low cyclomatic complexity. That's oversimplifying it a bit, but I think it gets the idea across.
Without knowing more about what your application does, it might be normal to have a higher cyclomatic complexity. I would suggest measuring cyclomatic complexity on a class or method level, however, instead of just an application level. This is a little more manageable, conceptually, I think: it's easier to visualize or conceptualize the paths through a method than paths through a large application.
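To show what a per-method measurement might look like, here is a rough sketch in Python using the standard ast module. It approximates the complexity as "decision points + 1", and the set of node types counted as decisions is my own assumption; real tools differ in what they count. The two sample functions are invented to contrast a short, branch-heavy method with a longer one that has no branches.

```python
import ast

# Node types counted as one decision each (an assumption of this sketch).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def decision_count(node):
    """Count decision points in an AST subtree."""
    count = 0
    for child in ast.walk(node):
        if isinstance(child, DECISION_NODES):
            count += 1
        elif isinstance(child, ast.BoolOp):
            # "a and b and c" adds two short-circuit branches.
            count += len(child.values) - 1
    return count


def complexity_by_function(source):
    """Map each function name to decision points + 1.

    Nested functions are counted in their parent as well; fine for a sketch.
    """
    tree = ast.parse(source)
    return {
        node.name: decision_count(node) + 1
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


SAMPLE = '''
def short_but_branchy(rows):
    for row in rows:
        for cell in row:
            if cell < 0 or cell > 9:
                return False
    return True

def long_but_flat(order):
    subtotal = order["price"] * order["quantity"]
    discount = subtotal * order["discount_rate"]
    taxable = subtotal - discount
    tax = taxable * order["tax_rate"]
    shipping = order["shipping"]
    total = taxable + tax + shipping
    return round(total, 2)
'''

report = complexity_by_function(SAMPLE)
for name, score in report.items():
    print(name, score)
```

Under these counting rules the five-line body scores 5 and the longer function scores 1, which matches the point that line count and cyclomatic complexity are different things.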