> Cem,
>
> I believe that software metrics can be, and often are, abused,
> misused, overused, confused, etc. And that A LOT more
> work, experiments, and research is needed to improve the
> current state of affairs. Not to mention educating people
> in the proper ways to use (or not use) metrics.
>
> But if I interpret your post (and position) correctly, the
> only conclusion I can draw is that - as of today - we
> should not be using ANY metric. Zero, nada, nyet. Is that
> the case? Should we burn any and all metrics tools?
> Remove all code coverage from our IDEs?
Sure, I want to achieve 100% statement/branch coverage of code that I write. Before I had tools, I had to use a yellow highlighter on a code listing (really, we did this at Telenova, a phone company where I programmed for 5 years in the 1980s). It was valuable, but it was very tedious. Having simple coverage monitors in Eclipse makes my life easier.
But I think this is the beginning of testing, not the end of it. And in finite time, I am perfectly willing to trade off some other type of testing against this. I see this as a tool for me as a programmer, not as a management tool. As soon as it turns into a management tool, I want to burn it because of the side effects.
If I were assessing someone's code (as their manager), there are several questions that I'd want to balance, such as:
- does it work?
- what evidence has this person collected to suggest that it works, or works well enough?
- is it a straightforward implementation?
- can I understand the code?
- is it usable (either at the UI level or in its interface to the relevant other parts of the application)?
- how much did this cost and why?
- did s/he consider implementation cost and quality explicitly in the design and implementation? What evidence?
Code coverage (take your pick of metrics) is a tiny part of this picture. The more I focus on it, the more tightly I am hugging one tree in a big forest.
One approach is to combine several simplistic metrics (hug a few trees), but just as coverage is a terribly weak indicator of how well the code has been tested, many of these other metrics are weak indicators of whatever they are supposed to measure. Combining them gives an appearance of much greater strength (dashboards or balanced scorecards are very impressive) but they still provide very little information against the questions I'm asking.
The questions are much more critical than the metrics.
It is common to teach an approach to measurement called Goal / Question / Metric. You define a measurement goal ("I want to understand the productivity of my staff in order to manage my project's costs and schedule better") and then a few questions ("What is the productivity of my staff?" "How would a change in productivity impact my costs?") and then a metric or two per question.
One of the exercises we do in my metrics class is to pick a question and take it seriously. Suppose we really wanted an answer to the question. What kinds of information would we have to collect to get that answer? We often come up with lists of several dozen candidates, some of which are easy to translate to numbers and others that need a more qualitative assessment. It is so very tempting to pick one or two easy ones to calculate, declare that these are a sufficient sample of the space of relevant metrics, and then manage on these. And that temptation is so very dangerous in terms of the side effects.
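The Goal/Question/Metric exercise above can be sketched as a simple nested structure. The goal, questions, and candidate metrics below are illustrative stand-ins (a real exercise would surface dozens of candidates per question, many of them qualitative), not a prescribed set:

```python
# Illustrative Goal/Question/Metric (GQM) breakdown as a nested structure.
# All goal, question, and metric strings here are hypothetical examples.
gqm = {
    "goal": "Understand staff productivity to better manage cost and schedule",
    "questions": [
        {
            "question": "What is the productivity of my staff?",
            "candidate_metrics": [
                "features completed per iteration",   # easy to count; weak proxy
                "rework hours per feature",           # harder to collect
                "peer assessment of code clarity",    # qualitative, not a number
            ],
        },
        {
            "question": "How would a change in productivity impact my costs?",
            "candidate_metrics": [
                "cost per delivered feature",
                "schedule variance per milestone",
            ],
        },
    ],
}

# The temptation described in the text: keep only the easiest-to-compute
# metric per question and discard the rest of the candidate list.
easiest_only = [q["candidate_metrics"][0] for q in gqm["questions"]]
print(easiest_only)
```

The point of writing it out this way is that the pruning step is visible: `easiest_only` keeps two numbers out of a much larger candidate space, which is exactly the move that invites the side effects discussed above.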
> Are you suggesting a full moratorium on all metrics until
> we have invested a few years in "longer term research that
> might provide much more useful answers in the future"?
I am not suggesting a moratorium on management. I am suggesting a moratorium on the hype. I am suggesting a huge increase in the humility index associated with the statistics we collect from our development and testing efforts. I am suggesting a fundamental refocusing on the questions we are trying to answer rather than the statistics we can easily compute that maybe answer maybe some of the questions maybe to some unknown degree with some unconsidered risk of side effects. I am suggesting that we take the risks of side effects more seriously and consider them more explicitly and manage them more thoughtfully. And I am saying that we demand research that is much more focused on the construct and predictive validity of proposed metrics, with stronger empirical evidence--this is hard, but it is hard in every field.
> P.S. Cem, while we might hold different opinions on how to
> improve/fix the state of software metrics, I believe we
> share several common goals. I have enormous respect for
> you, your work, and your passion for software quality and
> testing (which we share). Not to mention the fact that I
> really like you on a personal level :-). I hope that this
> post is interpreted in the spirit in which it was written
> (i.e. a true desire to confirm my understanding of your
> position, not poke fun at it) and that we can continue
> this discussion in a constructive way that will help us
> (and the readers) gain a better understanding of different
> positions.
Alberto, I wrote my last note (the one on your blog post that follows up to this post), speaking to you by name, because I respect you enough and like you enough to say that I'm disappointed. I've spent a lot of writing hours on this thread--usually I skip blog posts on what I think of as overly simplistic approaches to software measurement, but I put a lot of time into this one because it is your thread. That makes it worth my attention.