At the end of the evening, I went to dinner with some of the other speakers and had an opportunity to meet and spend some quality time talking about CRAP and other software metrics with Joel and Eric.
Joel is not a big fan of software metrics in general. He is concerned that developers might end up writing code and allocating their time to satisfy a specific metric rather than writing the best possible code and allocating the time based on more important criteria. He recounted a couple of stories about horrific metrics misuse that he witnessed first-hand and was concerned that - in the wrong hands - the CRAP metric could be used in, say, performance reviews: Your code is too crappy. You're fired!
I understand that there is potential, as well as some evidence, for software metrics misuse; but I don't think that's sufficient reason for avoiding metrics altogether. My reply to Joel was that if an organization/manager is so lazy and stupid as to rely exclusively on any given code metric in evaluating programmers, then those programmers are probably better off being fired from that organization anyway. Better yet, the programmers would have great evidence to have the moronic manager fired.
While I understand that any tool, technology, or information can be abused by evil people and misused by stupid ones, I don't think we should use "How could this be abused or misused?" as the primary criterion - at least not without first balancing it against the potential benefits.
Toward the end of the conversation, Eric Sink observed that the argument was starting to sound a lot like the perennial "Guns don't kill people, people kill people" discussion. Great observation Eric.
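For readers who haven't run into it, the CRAP score combines a method's cyclomatic complexity with its automated test coverage: complex, untested code scores high. A minimal sketch in Python (the function name and the sample numbers are mine, not from any particular tool):

```python
def crap_score(complexity: int, coverage_pct: float) -> float:
    """CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)

    complexity:   cyclomatic complexity of the method
    coverage_pct: percentage of the method exercised by automated tests
    """
    uncovered = 1.0 - coverage_pct / 100.0
    return complexity ** 2 * uncovered ** 3 + complexity

# A trivial, fully tested method scores low...
print(crap_score(1, 100))   # 1.0
# ...while a complex, completely untested one scores high.
print(crap_score(25, 0))    # 650.0
```

Note how coverage only dampens the complexity term: a simple method stays cheap even untested, while a complex one can only escape a high score by being well covered.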
What do you all think about software metrics?
Aren't you a bit surprised that, despite the fact that software runs the world and that we spend hundreds of billions a year writing and maintaining software, there isn't a single industry-wide metric that's being used with any consistency? Why is that?
Are all software metrics inherently evil and useless?
Is programming so much more art than science or engineering that it's pointless to try to quantify or evaluate code using objective criteria?
What do you think? Do you have any software metric horror/success stories to share?
The thing to remember about metrics is that they are about measuring something, and as Deming himself noted, the most important numbers are those you can't measure.
Yes, software is used pervasively. That's a GOOD reason for not having common metrics. Even the same metric, when the measurement can be supplied, doesn't mean the same thing in different contexts. We measure speed for both cars and airplanes. But we don't expect the same speed out of a Mercedes as a 767 - or a helicopter. And airplanes have metrics that don't apply to cars, like rate-of-climb, service ceiling; likewise cars are measured for stopping distance, which differs in meaning from a landing requirement.
And it's obvious that "fast" has an almost qualitatively different meaning when applied to alligators . . . :-)
Demoing Clover around the world, people often ask me "what do you think is a good coverage %?"
I think the answer applies to most software metrics in most situations: the number itself doesn't matter as much as the trend. I usually say something along the lines of "I can't tell you how well you should test your software without knowing a whole bunch about your project, but it is a bloody good idea to at least ensure you don't make it worse," depending on how much beer I've drunk and how long I've been booth monkeying for ...
Anyway, two points:
a) metrics are useful, but trying to compare to some "industry wide" measure is probably not very useful and could certainly be dangerous.
b) metrics over time tell you a lot more about your team than at any moment in time.
 unless you are a consultant selling snake oil, in that case metrics are great ;)
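The "at least don't make it worse" advice above can be enforced mechanically with a coverage ratchet: compare each build's figure against the previous one and flag a regression. A minimal sketch (the function name, sample numbers, and tolerance knob are invented for illustration):

```python
def coverage_not_worse(history, tolerance=0.0):
    """Return True if the latest coverage figure has not regressed.

    history:   coverage percentages, oldest build first
    tolerance: percentage points of drop to forgive (noise allowance)
    """
    if len(history) < 2:
        return True  # nothing to compare against yet
    return history[-1] >= history[-2] - tolerance

builds = [61.2, 63.0, 64.1, 58.7]
print(coverage_not_worse(builds))      # False: the last build regressed
print(coverage_not_worse(builds[:3]))  # True: the trend was upward
```

Wired into a build server, a check like this captures the point of the comment: the team never has to agree on what a "good" absolute number is, only that the line shouldn't go down.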
I do not think software metrics kill projects. Metrics are more about having confidence in the code someone has developed.
You can start writing a function, but what is the use if it serves only as a prototype? To move an application into production, we need code quality checks and we need to follow the metrics.
On the same note, I disagree with having standardized metrics that apply to all kinds of applications. Metrics should be based on each organization's critical areas and should be identified for each kind of application.
> Aren't you a bit surprised that, despite the fact that software runs the world and that we spend hundreds of billions a year writing and maintaining software, there isn't a single industry-wide metric that's being used with any consistency?

Aren't you a bit surprised that, despite the fact that software runs the world and that we spend hundreds of billions a year writing and maintaining software, there isn't a single *programming language* that's being used with any consistency?
In the USA, the most precisely measured people are the most handsomely rewarded. Sports & other entertainment folks.
Technical folks are "above" that sort of thing, leaving executives to measure our performance by: high salaries, bug counts, missed dates, incomplete features... Leaving execs to realize: Hey! I can buy the last 4 for 1/16 the price overseas!
There isn't a metric that will satisfy everyone, but there *has* to be some way for your group/dept to measure itself. Otherwise, how do you know you're doing things better?
It is in our interest to develop meaningful metrics.
> Joel is not a big fan of software metrics in general. He is concerned that developers might end up writing code and allocating their time to satisfy a specific metric rather than writing the best possible code and allocating the time based on more important criteria. He recounted a couple of stories about horrific metrics misuse that he witnessed first-hand and was concerned that - in the wrong hands - the CRAP metric could be used in, say, performance reviews: Your code is too crappy. You're fired!
It's funny that Joel reacts to the CRAP metric in that way. When I read about the new feature in FogBugz - Evidence-Based Scheduling - one of my reactions was: What if management decides to use the computed velocity in performance reviews?
David, I agree with your conclusion but not totally with how you arrived at it.
> In the USA, the most precisely measured people are the most handsomely rewarded. Sports & other entertainment folks.
Assembly line workers are precisely measured, and the disparity between their pay and a top athlete's pay is large. In the film sub-industry, your ability to get callbacks is the single most important metric in setting your asking price. If 2 or 3 producers want Claire Danes for their films simultaneously, she's in a position to ask for a higher percentage of gross revenues. Our industry flooded the market with labor in the latter '90s, effectively stagnating pay levels. Furthermore, executives started questioning the worth of our product to their organizations post-2K. This drove the search for cheaper labor.
> Technical folks are "above" that sort of thing, leaving executives to measure our performance by: high salaries, bug counts, missed dates, incomplete features... Leaving execs to realize: Hey! I can buy the last 4 for 1/16 the price overseas!
Professional baseball has a large contingent of South American and Asian players. Yet pay is still considerably higher than in our profession. It comes back to the value executives assign to our products in their organizations.
> There isn't a metric that will satisfy everyone, but there *has* to be some way for your group/dept to measure itself. Otherwise, how do you know you're doing things better?

> It is in our interest to develop meaningful metrics.
Completely agreed here. To realize a greater pay scale, we need metrics for how our products contribute to the organization. For example, if a software system you and I develop allows Sales to go from lead to sale quicker and cheaper, we should share in the revenue. Find a way of measuring it and you are on the path to greater pay.
Metrics that point out use of bad coding practices could be a good thing. But it wouldn't measure "code quality" -- as that's largely based on the quality of the design.
I don't think we'll ever have an objective, quantitative measure for design quality, any more than we can objectively measure the quality of a movie or novel. At best, a metric might be able to compare the design quality of three different same-technology implementations of the same functionality, but who would bother implementing the same functionality three times in the same technology?
If you try to compare the ugliness of applications with different functionality, how could a program separate the simplicity of the design from the simplicity of the requirements?
> That's a GOOD reason for not having common metrics. Even the same metric, when the measurement can be supplied, doesn't mean the same thing in different contexts. We measure speed for both cars and airplanes. But we don't expect the same speed out of a Mercedes as a 767 - or a helicopter. And airplanes have metrics that don't apply to cars, like rate-of-climb, service ceiling; likewise cars are measured for stopping distance, which differs in meaning from a landing requirement.
I agree that context is important and that it may make no sense to compare the speed of cars to that of airplanes (or alligators). But top (or cruising) speed is used to compare airplanes in the same category (e.g. commercial airliners, jet fighters) against each other. I also don't expect to use MPG to compare a Ferrari to a Prius, but I do expect to use MPG to compare a Prius to other fuel-efficient cars.
I am not arguing that all software should be held to the same standard wrt any given metric. For example, I expect medical applications to be more thoroughly tested than, say, a video game. But I am surprised that we don't have any metric that's used with any consistency to compare applications in the same category.
In my opinion, the biggest mistake one can make with metrics is to try to figure out what a single measurement result means in some universal "how good is my code" context. Measuring for measurement's sake is (almost) insane.
In order to really benefit from metrics, they should be used as part of a wider QA plan with defined goals. Measure repeatedly and compare the results - not with some external magic values, but with each other - in order to see the direction the product/process is heading. Metrics don't tell you so much about whether things are good or bad, but more about whether something is better or worse.
It is not the easiest of tasks to set up a proper measurement process. But in my opinion, if you don't do it the hard way, it is better not to do it at all.
Some metrics are temporal. If your team is handed a code base that contains 20% duplicated code, you may wish to measure a reduction in duplicated code until it's irrelevant.
Same for the # of unit tests, class count, LOC, whatever...
If a metric remains useful over a long period of time, so be it. If not, just as well.
I would think that if a group had a history of unit tests, automated source analyzers, low bug counts, and success at meeting deadlines, AND worked with execs on this, it would be easier to challenge outsourcing. Yes, you're paid big bucks, but does the competition have data that stacks up well with ours? Now the execs have to add in the cost of not having that history and those work processes, or of having to re-create them themselves... This quickly becomes a high-risk proposition. I think most execs are looking for stability in their organizations, not wildcards.
The lack of metrics and the ability to equate them to our business value is our own fault.
There probably is truth in metrics, and metrics should be a powerful tool, but in reality many factors influence the metrics we gather. When we want to evaluate a process using statistics, we hope that measurable events are independent. There has already been some discussion here of the effects of design on code quality. Many other factors could be listed: maintenance vs. green field, embedded vs. classic application vs. web-based application, language used, schedule constraints, etc. If your shop supports one project or one type of project, you should get more benefit from metrics.

I think Joel's perspective is evident. He owns and manages a small shop, and he can directly influence quality by hiring the people he wants and directing many key factors that affect quality. And nothing still beats the classic "management by walking around". On the other end of the spectrum is management by metrics: metrics from several varying projects are rolled up together and fed upstairs to managers who make key decisions. This is obviously ridiculous.