
Weblogs Forum
Pardon My French, But This Code Is C.R.A.P. (2)

36 replies on 3 pages. Most recent reply: Sep 4, 2007 7:55 AM by Alberto Savoia

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Pardon My French, But This Code Is C.R.A.P. (2) Posted: Jul 19, 2007 3:02 PM
> To me, crap code is code with a high degree of change
> sensitivity, or chaos. It means I have a hard time
> determining just what might be affected by changing a
> method or class. Things like global variables, thread
> locals, etc. increase the chaos; number of methods, number
> of classes, class dependency cycles, replication of
> constants, etc. all contribute to this.

I would point to dependencies as the most important factor. That is, the more things that depend on something, the higher the risk you will break one of those things.

If a method or type is never used, there's no risk in changing it.

Jay Sachs

Posts: 30
Nickname: jaysachs
Registered: Jul, 2005

Re: Pardon My French, But This Code Is C.R.A.P. (2) Posted: Jul 19, 2007 3:23 PM
> I would point to dependencies as the most important
> factor. That is, the more things that depend on
> something, the higher the risk you will break one of those
> things.

I'd agree. In large projects, I tend to restrict, if not outright prohibit, dependency cycles among packages. Small class dependency cycles are sometimes the right thing to do, but still require auditing.

Worse than complex dependencies derivable from the source are dependencies that are not statically determinable. Consider HttpServletRequest and its attribute map (effectively a Map<String,Object>) used for communication. It's as if J2EE standard webapps are burdened with a "chaos bundle" from the outset.
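
To make this concrete, here's a minimal sketch of that kind of hidden coupling (the class names and the "cart" key are made up for illustration; this assumes the standard javax.servlet API):

    import java.util.Arrays;
    import java.util.List;
    import javax.servlet.http.HttpServletRequest;

    // One component stuffs an attribute into the request...
    class CartFilter {
        void populate(HttpServletRequest req) {
            req.setAttribute("cart", Arrays.asList("item-1", "item-2"));
        }
    }

    // ...and another fishes it back out. The only "contract" is the string
    // key plus an unchecked cast; neither the compiler nor a static analyzer
    // can see this dependency, so renaming the key or changing the value's
    // type breaks the consumer only at runtime.
    class CartRenderer {
        @SuppressWarnings("unchecked")
        void render(HttpServletRequest req) {
            List<String> cart = (List<String>) req.getAttribute("cart");
            System.out.println("Cart has " + cart.size() + " items");
        }
    }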

Alberto Savoia

Posts: 95
Nickname: agitator
Registered: Aug, 2004

Re: Pardon My French, But This Code Is C.R.A.P. (2) Posted: Jul 19, 2007 3:37 PM
> > The C.R.A.P. index has one purpose, and one purpose only:
> > to help predict the risk, effort, and pain...
>
> Er, that would be three things. In particular measuring
> risk of change and effort of change seem very different.
> I'm not sure what is meant by pain here.

Vincent, I am sure some people think you are a bit too pedantic :-), but I actually appreciate the fact that you are bringing up some points that require clarification. My sincere thanks. No smiley here - I mean that.

In my experience as a developer and manager I have regularly observed a strong correlation between risk, effort, and pain when maintaining other people's code. Usually, when working with other people's code is particularly painful, it is also (or at least feels like) more of an effort, and it feels much riskier as well. But, as I am sure you'd point out, correlation does not imply causality.

The C.R.A.P. index is aimed at Change Risk. The fact that more risk often means more effort, and more effort often means more pain (i.e. longer hours, stop-and-go release cycles, etc.) is secondary.

> > In that respect, our intuition, experience, and research
> > indicates that complexity makes code maintenance harder
> > and automated tests make maintenance safer.
>
> Again, two things appear to be being conflated.
> Complexity and safety.

Well, conflating complexity and safety is probably wrong, but I was associating, not conflating, complexity and risk. Again, I believe (based on personal experience, intuition, and other people's research) that code complexity makes code changes riskier (i.e. less safe).

> I can imagine situations where
> making code safer might require added complexity.

We are not talking about making the code safer, we are talking about making changes to the code safer.

> > If you don't think that this is the case, please feel free
> > to give us suggestions and recommendations for improving
> > the metric.
>
> I think the only way to improve the metrics is to measure
> them for real. In the formula you gave, you implied that
> "crapness" was proportional to the square of the
> complexity and the cube of the proportion of the code not
> covered by automated tests. Do these suspiciously precise
> ratios have any empirical basis or are they just plucked
> out of the air?

I am sure you have heard of the method of successive approximations for solving equations. This approach requires an initial educated guess and then, through experience, experiments, etc., the tentative solution is adjusted over time and, hopefully, it moves closer to the real solution. Basic trial and error.

That's what we are trying to do. This is why we presented this as version 0.1, why we are soliciting input, and why we have started running in-house experiments on our own code and on open-source code to see how we need to adjust the various parameters to better match reality. I don't think I could have been more explicit about this.

If I had come out and said: "Listen all, we have found the perfect and ultimate metric to predict code change risk and here it is!" then you'd have a point, but I presented this initial version of the C.R.A.P. index with plenty of humility and in the knowledge that it's a first effort and that we have plenty of work left to do.

> I applaud your motivation for seeking to measure code
> quality, however, I do think you need to be a lot more
> precise in defining what it is that you are seeking to
> measure and, when you present mathematical formulae, you
> must be able to justify the terms in some way.

Again: we are not seeking to measure code QUALITY. We are in the midst of developing and testing a metric to help predict the risk associated with code change. It has nothing to do with code quality.

As far as justifying the terms goes ... considering that this is a blog and not a scientific paper, and that we have been clear that this is not a final version of the metric but a first pass, I believe that the level of justification is more than adequate.

But to be clear: we did not pull the numbers out of thin air. We studied previous research, plugged in various parameters, took code samples, and ran the metric on them to see if the numbers we came up with made sense and were in the ballpark.

What we presented as version 0.1 is actually the latest of many different formulae we have tried, and the result of several months of in-house research and experimentation at Agitar Labs, not to mention several decades of research experience in academia and industry from myself, my co-founder, and our research staff.

We believe it's important to make progress in the area of risks associated with code changes. Progress requires experience and experiments. Experiments require a starting point, an initial conjecture. That's what C.R.A.P. 0.1 is. Please treat it accordingly.

Alberto

Alberto Savoia

Posts: 95
Nickname: agitator
Registered: Aug, 2004

Re: Pardon My French, But This Code Is C.R.A.P. (2) Posted: Jul 19, 2007 3:52 PM
> The cyclomatic complexity of a particular method is one
> small measure here. However, the crapfulness of a codebase
> is far more than the sum of the individual crapfulnesses
> of the various methods, or classes.

Agree 100%, but we had to start somewhere :-). Since it's a good idea to try to walk before running, we decided to tackle method C.R.A.P. before Class C.R.A.P. and System C.R.A.P.

> I really think to be
> an accurate measure of the change risk, you're going to
> need to find more holistic metrics.

Stay tuned. And feel free to give input as to what metrics should be factored in. We plan to open-source the basic code for collecting a bunch of Java code metrics so you can plug-and-play your own formulae and do your own experiments.

Alberto

Derek Parnell

Posts: 22
Nickname: derekp
Registered: Feb, 2005

Re: Pardon My French, But This Code Is C.R.A.P. (2) Posted: Jul 19, 2007 5:07 PM
> Hi Derek,
>
> > (1) Does the test coverage itself add to the complexity
> > measurement? That is to say, is it true that the more unit
> > tests embedded in the source then the more complex the
> > code is.
>
> Although it's true that tests have to be maintained and
> kept up to date, they don't add risk, but reduce it. So
> the test code is not included in the C.R.A.P.
> calculation.

Ok, I understand your point of view but I'm not convinced yet.

Although test cases are written in order to reduce risk, their mere existence does not automatically do that. It depends on their coverage, quality and reporting process.

Tests that don't exercise all the code and don't exercise all the requirements actually increase maintenance risk if maintainers are not aware of the existing test cases' deficiencies. And as you mentioned in another reply, a test case that doesn't tell you it failed is not much use at all.

In short, test code is code. Code needs to be maintained to meet quality requirements and functional requirements. Thus we need to reduce C.R.A.P. for test code too.

> I am not sure what you mean by "embedded in the source".
> I am not a big fan of putting the tests for a module
> (e.g. a class in Java) in the same file, or the same
> source directory. Typically, if I have a class Foo.java,
> I have a JUnit test class called FooTest.java in a
> testsrc directory. This allows us to differentiate
> between code-under-test and tests and exclude the latter
> from the C.R.A.P. calculation.

I am not (and never will be) a Java coder. But that aside, I can sympathize with the argument that test code is better written by persons other than the module coder. However, some languages encourage embedding test cases within the class, function, or module, and C.R.A.P. has to be able, eventually, to deal with that scenario too.

> > (2) Should 'volume' of code be incorporated in the
> > complexity measurement? The Cyclomatic complexity measure
> > means that if a block of code has no branches/loops then
> > its measure is 1, regardless of how many actual
> > statements that block contains. But for a C.R.A.P. index
> > that seeks to measure the cost of maintenance, the more
> > code one has, the greater the chance of screwing up some
> > of it during maintenance.
>
> We considered including volume of code in addition to
> complexity. We also considered including # of parameters,
> # of fields, etc.
>
> There are many things we could include, but we decided to
> start small and add new components based on input (such as
> yours - thank you) and experience with the metric, rather
> than start with lots of components off the bat.

Fair enough. I like your approach of developing the index through refinement. I'm sure this will work, and I look forward to applying it to my organisation's code, not to mention my own crap ;-)

Morgan Conrad

Posts: 307
Nickname: miata71
Registered: Mar, 2006

Re: Pardon My French, But This Code Is C.R.A.P. (2) Posted: Jul 19, 2007 6:16 PM
One can achieve 100% coverage without actually testing anything. As an extreme example, one could imagine (in Java) using reflection to create a class, automatically call every method, and never put in an assert. Not sure how to deal with this...
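
For instance, a sketch of such an assert-free "test" (purely illustrative; the class and method names are made up):

    import java.lang.reflect.Method;

    // Drives up coverage while verifying nothing: instantiate the class,
    // invoke every public no-arg method, ignore the results, assert nothing.
    public class AssertFreeCoverage {
        public static void touchEverything(Class<?> clazz) throws Exception {
            Object target = clazz.getDeclaredConstructor().newInstance();
            for (Method m : clazz.getMethods()) {
                if (m.getParameterTypes().length == 0
                        && m.getDeclaringClass() != Object.class) {
                    m.invoke(target); // result ignored, nothing checked
                }
            }
        }
    }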

50% coverage with good tests that actually test the critical code in all its options may be better than 100% coverage that doesn't really test the critical code hard enough yet covers all the trivial setFoo() methods.

So, I'd weigh the coverage of a method by its complexity. If code is very complex, count the coverage heavily, and factor in repeats through all the NxN paths (I forget the term, you guys probably know what it is).

If a setter method simply checks for non-null, sets a value, and fires an event, coverage shouldn't matter. No complexity means don't count the coverage much.
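
As a rough sketch of that idea (the weighting scheme here is just one possibility, not anything from the C.R.A.P. work):

    // Complexity-weighted coverage: each method's coverage counts in
    // proportion to its cyclomatic complexity, so an untested complex
    // method hurts far more than an untested trivial setter.
    public class WeightedCoverage {
        /** cov[i] is method i's coverage in percent; comp[i] its complexity. */
        public static double weighted(double[] cov, int[] comp) {
            double covered = 0, total = 0;
            for (int i = 0; i < cov.length; i++) {
                covered += cov[i] * comp[i];
                total += 100.0 * comp[i];
            }
            return total == 0 ? 100.0 : 100.0 * covered / total;
        }
    }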

Vincent O'Sullivan

Posts: 724
Nickname: vincent
Registered: Nov, 2002

Re: Pardon My French, But This Code Is C.R.A.P. (2) Posted: Jul 20, 2007 12:12 AM
> "Cyclomatic complexity is a well-known, and widely used
> metric and it’s calculated as one plus the number of
> unique decisions in the method. For code coverage we uses
> basis path coverage."

Indeed, but what information is provided by multiplying the square of said complexity by the cube of the proportion of untested code? Why the square and the cube? Why not the other way around, or some other powers? Why not include the square root of the number of global variables in the method, or the log of the number of parameters?

My point is that the formula is completely arbitrary and therefore does not relate to any genuine code property.

Derek Parnell

Posts: 22
Nickname: derekp
Registered: Feb, 2005

Re: Pardon My French, But This Code Is C.R.A.P. (2) Posted: Jul 20, 2007 2:07 AM
> My point is that the formula is completely arbitrary and
> therefore does not relate to any genuine code property.

Although the formula, as presented so far, appears to be arbitrary, its validity can be found by both practice and theory. In other words, let's give it a go with real-world examples and let those who can delve into the mathematical reasoning behind the formula.

To nearly everyone in the world, E=mc^2 also appears arbitrary. I mean, you multiply mass by the speed of light squared and you get energy? What's with that? :-)

The purpose of this blog is to refine the formula to match real-world experience. Have you thought about what could be a refinement?

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Pardon My French, But This Code Is C.R.A.P. (2) Posted: Jul 20, 2007 6:14 AM
Actually, I have to question the whole premise. Setting aside the fact that I think some people put way too much value in unit tests, how do unit tests protect you if you are changing code and not just refactoring?

That is, if I have unit tests for a method, and I change the behavior of that method, at least some of those tests will no longer be valid. How does the existence of unit tests prior to a change significantly lower the risk of that change?

Again, I would have to say that in real development, the risk comes from dependencies on what you are changing, not the complexity of a unit or how well you tested that unit in the past.

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Pardon My French, But This Code Is C.R.A.P. (2) Posted: Jul 20, 2007 6:16 AM
> Although the formula, as presented so far, appears to be
> arbitrary, its validity can be found by both practice and
> theory. In other words, let's give it a go with real-world
> examples and let those who can delve into the
> mathematical reasoning behind the formula.

Not to speak for Vincent, but perhaps he is asking how these numbers are derived from theory or from practice. If they were derived from practice, I would love to see the data.

Jay Sachs

Posts: 30
Nickname: jaysachs
Registered: Jul, 2005

Re: Pardon My French, But This Code Is C.R.A.P. (2) Posted: Jul 20, 2007 8:17 AM
> Again, I would have to say that in real development,
> the risk comes from dependencies on what you are changing,
> not the complexity of a unit or how well you tested that
> unit in the past.

Perhaps then it's better to think of unit test coverage on client code, rather than coverage on the unit itself, as an offset to the risk in changing a unit.

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Pardon My French, But This Code Is C.R.A.P. (2) Posted: Jul 20, 2007 8:53 AM
> > Again, I would have to say that in real development,
> > the risk comes from dependencies on what you are changing,
> > not the complexity of a unit or how well you tested that
> > unit in the past.
>
> Perhaps then it's better to think of unit test coverage on
> client code, rather than coverage on the unit itself, as
> an offset to the risk in changing a unit.

Depending on the scope of the change, you can still have this issue. That is, if the point of the change is to modify the behavior of the application, tests that verify the behavior to be changed are not relevant, but those that verify things that should not change are. I agree that this is a better way to think of it. I guess it's not clear to me what the code coverage figure in the formula refers to. Is it code coverage across the entire application, or just the unit to be changed?

It seems to me that the unit tests we care about are those that test the code that depends on what is being changed, where "what is being changed" includes anything whose behavior changes as a result.

Alberto Savoia

Posts: 95
Nickname: agitator
Registered: Aug, 2004

Re: Pardon My French, But This Code Is C.R.A.P. (2) Posted: Jul 20, 2007 2:07 PM
> My point is that the formula is completely arbitrary and
> therefore does not relate to any genuine code property.

It is not COMPLETELY arbitrary.

Let me try to share with you a high-level (and highly summarized) view of how we arrived at the current formula.

I already explained why we selected the two initial components (cyclomatic complexity (CyComp) and automated test coverage (CodeCov)) as a STARTING POINT.

We know, from other researchers (also backed by our intuition and experience) that there is a correlation between code complexity and change risk (i.e. changing the behavior of a method with CyComp of 30 is *usually* riskier than doing the same in a method with a CyComp of 10).

This means that, as complexity goes up, the change risk also goes up (again, *GENERALLY SPEAKING* - there are no claims of absolute truth applicable to every single piece of code in existence).

So, as a first pass, for a method m:

CRAP(m) = f(CyComp(m)) where f(x) increases as x increases.

We also believe (through research, experience, and intuition) that automated tests *GENERALLY SPEAKING* reduce change risk. And code coverage (CodeCov), for all its problems, holes, and weaknesses, shows some positive correlation with the ability to detect some categories of code changes, some of which could be labeled as regressions.

So, as a second pass we have:

CRAP(m) = f’(CyComp(m), CodeCov(m)), such that f’(x,y) increases as x (complexity) increases and decreases as y (code coverage) increases.

Next we need to think about how much complexity should contribute to risk. Are the chances of introducing an error in a method of complexity, say, 20 only twice as high as those of introducing an error in a method of complexity 10? Our research, intuition, and experience say no; GENERALLY SPEAKING, risk increases more than linearly with complexity. So we decided to use CyComp(m)^2. Why 2? Why not 1.4, 1.68, or 42 for that matter? Because our experience, intuition, and experiments (looking at a bunch of actual methods along with their complexity scores) all indicate that an exponent of 1 does not give enough weight to complexity and a 3 gives it too much weight. We plan to adjust that exponent based on experience, experiments, and feedback, but we decided to use it as a STARTING POINT. WE NEED A STARTING POINT for doing further experiments. (Sorry for all the "shouting", but quite a few people keep missing the point that THIS IS A FIRST PASS. WE PLAN TO REFINE THE METRIC THROUGH ITERATIONS, FEEDBACK, EXPERIMENTS, ETC.)
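
To put numbers on it: with the exponent of 2, going from a complexity of 10 to a complexity of 20 takes the complexity term from 100 to 400, so doubling the complexity quadruples its contribution to the risk rather than merely doubling it.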

So now we have:

CRAP(m) = CyComp(m)^2 * f''(CodeCov(m))

We followed similar reasoning and conducted analogous experiments for the coverage component. Research, experience, and intuition indicate that, GENERALLY SPEAKING, test coverage provides diminishing returns. Going from no tests to one test is a big jump in confidence (especially since most tests cover the so-called "happy" or most used paths), from 1 to 2 tests is a smaller jump, and so on. We looked at various studies, graphs, tables, etc. involving coverage, added our own experience, and decided that the following came closest to our mental model and expectations:

(1 - CodeCov/100)^3

We KNOW that the 3 is not “right”, but it’s a good enough STARTING point for the next set of approximations.
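
To see the diminishing returns in numbers: this factor is 1.0 at 0% coverage, 0.125 at 50%, 0.008 at 80%, and 0 at 100%, so the first tests buy much bigger risk reductions than the last ones.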

So now we have:

CRAP(m) = CyComp(m)^2 * (1 - CodeCov/100)^3

For the next step, we considered corner cases. If we have 100% coverage, should the risk go to zero? We believe not, since we know (again from research, experience, and intuition) that 100% coverage cannot guarantee that all regressions will be caught. So we added a risk component that would not be obliterated by 100% coverage:

CRAP(m) = CyComp(m)^2 * (1 - CodeCov/100)^3 + CyComp(m)
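
In code, version 0.1 boils down to a one-liner. A minimal Java sketch (the class and method names are mine, purely illustrative):

    public class CrapIndex {
        // C.R.A.P. 0.1: cyComp is the method's cyclomatic complexity,
        // codeCov its basis path coverage as a percentage (0-100).
        public static double crap(int cyComp, double codeCov) {
            double uncovered = 1.0 - codeCov / 100.0;
            return cyComp * cyComp * Math.pow(uncovered, 3) + cyComp;
        }
    }

For example, a method with a CyComp of 10 scores 110 with no coverage and 10 with full coverage.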

Another way to look at our job going forward is to consider a generalized version of the formula. Say:

CRAP(m) = C1 * CyComp(m)^C2 * (C3 - CodeCov/100)^C4 + C5*CyComp(m) + C6

where we need to find the values of C1, C2, ..., C6 that provide the strongest correlation and the best predictive power.
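
Or, as a sketch in code (the constants shown simply reproduce version 0.1, i.e. C1 = C3 = C5 = 1, C2 = 2, C4 = 3, C6 = 0; they are the knobs to re-fit against real-world data):

    public class GeneralizedCrap {
        static final double C1 = 1, C2 = 2, C3 = 1, C4 = 3, C5 = 1, C6 = 0;

        public static double crap(double cyComp, double codeCov) {
            return C1 * Math.pow(cyComp, C2)
                    * Math.pow(C3 - codeCov / 100.0, C4)
                    + C5 * cyComp + C6;
        }
    }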

By now you should start to get the idea. If you still choose to call this process "COMPLETELY arbitrary" instead of recognizing it as the first step in an iterative approach - a time-honored technique for closing in on a solution and for improving existing formulae - I guess we'll never see eye to eye.


Alberto

Alotof Software

Posts: 1
Nickname: alotofsw
Registered: Jul, 2007

Re: Pardon My French, Please Speak English Posted: Jul 21, 2007 7:14 AM
I think the main issue is the need to talk to your developers, not to invent automated systems to do that for ya. For example, I couldn't agree more with Messinger's remarks on the topic (http://lgorithms.blogspot.com), but others (e.g. Weiner) have alluded to that too.

-Joel

Alberto Savoia

Posts: 95
Nickname: agitator
Registered: Aug, 2004

Re: Pardon My French, Please Speak English Posted: Jul 21, 2007 7:44 AM
> I think the main issue is the need to talk to your
> developers, not to invent automated systems to do that for
> ya.

Hi Joel,

Of course you want to talk to your developers. The two things are not mutually exclusive; in fact, some metrics can be very helpful in those talks.

In my experience, the combination of human intelligence and intuition, coupled with what you call "automated systems" and objective measures, gives you the best results.

When you go to a doctor for a check-up, you talk to the doctor, and s/he talks to you, taps your chest, feels your glands, etc. Lots of manual, touchy-feely things. But I also expect the doctor to use some "tools" and "automated systems": run blood tests, take a CAT scan, etc., and use the results of those tests (e.g. cholesterol numbers, images) to get a more complete picture of my health and make recommendations.

> not to invent automated systems to do that for
> ya.

I have never implied that this or any other metric is a replacement for communicating or thinking. I actually went out of my way to say the opposite:

"We believe that software metrics, in general, are just tools. No single metric can tell the whole story; it’s just one more data point. Metrics are meant to be used by developers, not the other way around – the metric should work for you, you should not have to work for the metric. Metrics should never be an end unto themselves. Metrics are meant to help you think, not to do the thinking for you."
