Weblogs Forum - Software Metrics Don't Kill Projects, Moronic Managers Kill Projects

50 replies on 4 pages. Most recent reply: May 16, 2008 1:38 PM by Thomas Cagley

Robert Evans

Posts: 11
Nickname: bobevans
Registered: Jun, 2003

Re: Software Metrics Don't Kill Projects, Moronic Managers Kill Projects

Posted: Dec 19, 2007 1:44 AM

This was supposed to be a reply to Cem's post. D'oh, no threading.

Alberto Savoia

Posts: 95
Nickname: agitator
Registered: Aug, 2004

Re: Software Metrics Don't Kill Projects, Moronic Managers Kill Projects

Posted: Dec 19, 2007 11:48 AM

> I think the biggest problem with software metrics is that
> we don't have any.
...
>
> It takes years of work to develop valid measurement
> systems. We are impatient. In our impatience, we too often
> fund people (some of them charlatans) who push unvalidated
> tools instead of investing in longer term research that
> might provide much more useful answers in the future.

Cem,

I believe that software metrics can be, and often are, abused, misused, overused, confused, etc. And that A LOT more work, experiments, and research is needed to improve the current state of affairs. Not to mention educating people in the proper ways to use (or not use) metrics.

But if I interpret your post (and position) correctly, the only conclusion I can draw is that - as of today - we should not be using ANY metric. Zero, nada, nyet. Is that the case? Should we burn any and all metrics tools? Remove all code coverage from our IDEs?

Are you suggesting a full moratorium on all metrics until we have invested a few years in "longer term research that might provide much more useful answers in the future."?

And, given that the overwhelming majority of metrics in existence today (all of which are invalid in your opinion) already come from researchers and academics, how can we make sure that THIS TIME we fund the right researchers and academics?

When you look at the body of work in software metrics, you see a bunch of charlatans and incompetent theorists, and consultants who are trying to swindle a bunch of clueless managers who, in turn, are going to abuse their poor programmers with those metrics. I see it a little differently...

There may be some bad apples (as in any field). But for the most part, I see a bunch of people, many of them very smart, who are motivated by a deep desire to understand and improve the way we design, write and test software. This is a very difficult task, made considerably more arduous by the constantly changing environment (i.e. every few years there are new programming models, languages, styles, etc.). Most of these people are smart enough to realize, and make it clear, that the metrics they are proposing and experimenting with are nowhere near perfect and that no single metric (or even a set of metrics) can tell the whole story. But that does not stop them from experimenting with those metrics and using them to learn more about them. Frankly, how else are you going to learn more and improve something if you don't experiment with it?

>When we put defective tools in the hands of executives
>and managers, it's like putting a loaded gun in the
>hands of a three-year old and later saying, "guns
>don't kill people, people kill people." By all means,
>blame the victim.

I find this attitude toward executives and managers surprisingly insulting, patronizing, and a gross overgeneralization. There are, for certain, some Dilbertesque managers and executives who will misunderstand and misuse metrics (that's the group that my YouTube video on "Metrics-Based Software Management" pokes fun at). But, based on my experience, most of them have enough sense to see metrics for what they are: a tool that, properly used, will give them and their team some valuable (if not complete, perfect, or infallible) insight.

I have a hard time believing that you would hold such extreme positions; but I re-read your post several times and the only conclusion I can draw is that in your view:

i) As of today, there are ZERO metrics that meet your standard/definition for construct validity.

ii) Putting invalid metrics in the hands of managers and executives is like putting guns in the hands of three-year olds (who will then aim them at innocent developers).

iii) Therefore, we should not use ANY software metrics AT ALL until a group of enlightened researchers (which will probably exclude all the charlatans and incompetent nincompoops responsible for the current crop of metrics) has had sufficient time to perform experiments in a protected environment and might come up with some metrics safe for general use sometime in the future.

Is that right?

Alberto

P.S. Cem, while we might hold different opinions on how to improve/fix the state of software metrics, I believe we share several common goals. I have enormous respect for you, your work, and your passion for software quality and testing (which we share). Not to mention the fact that I really like you on a personal level :-). I hope that this post is interpreted in the spirit in which it was written (i.e. a true desire to confirm my understanding of your position, not poke fun at it) and that we can continue this discussion in a constructive way that will help us (and the readers) gain a better understanding of different positions.

Cem Kaner

Posts: 4
Nickname: cemkaner
Registered: Nov, 2007

Re: Software Metrics Don't Kill Projects, Moronic Managers Kill Projects

Posted: Dec 19, 2007 1:30 PM

> As to the ACM search, I am curious how the synonym
> searches for construct validity worked out. It could very
> well be that people are describing the same concept with
> different terms; it happens all the time. How did the
> other 48,000 papers check out? I am certain you did good
> research, and that you just abbreviated this description
> to make your point. It would be interesting to hear more
> about how you measured the presence or absence of
> 'construct validity' in the actual approaches taken in all
> these papers.

I searched in a pretty wide variety of ways over several years because I couldn't believe that the concept was so weakly addressed. It doesn't matter what those strategies were because you can always argue that they are insufficient to prove the negative (some other search for some other synonym that I haven't tried could always yield undiscovered gold...) The reason that I report numbers against "construct validity" is that this phrase is widely used across several disciplines. The lack of reference to it is an indicator, in itself, of the disconnect between software engineering measurement researchers and the broader measurement theory community.

The primary way that I have seen construct validity addressed in texts on software measurement (I have taught from several and reviewed several more--perhaps all of the books marketed as suitable as metrics course texts) is indirectly, through the representational theory of measurement. If a metric satisfies all of the requirements of the representational theory (and I haven't seen a serious claim that any of them do, just several critiques of metrics that don't), then it will almost certainly have construct validity. However, the head-on confrontation with the question, "what is the underlying attribute we are trying to measure and how does this relate to that?", is almost always buried. I have been repeatedly disappointed by the brevity and shallowness of this discussion in books on software-related metrics that I have taught from or considered teaching from.

Apart from my own searches, I have also challenged practitioner and academic colleagues to help me find better references. Some of my colleagues have worked pretty hard on this (some of them also teach metrics courses). So far, all we've found are representational theory discussions.

Maybe you've found better somewhere. If so, maybe you could share those references with us.

Bob Austin writes in his book about his interviews with some famous software metrics advocates and how disappointed he was with their naivete vis-a-vis measurement theory and measurement risk.

>
> Another interesting thing you said: on coverage. You ask
> what it means. If you wanted a clearer answer of what it
> measures, I recommend an interesting survey paper by Hong
> Zhu, Software Test Adequacy Criteria (it's in the ACM dl)
> that examined most of the, up-to-that-point work on
> testing adequacy criteria. It seems quite appropriate
> given that coverage is one adequacy criteria that could be
> measured. There are many criteria like def-use paths,
> state coverage, and so on and on and on.

Yes, yes, I've read a lot of that stuff.

Let me define coverage in a simple way. Consider some testable characteristic of a program, and count the number of tests you could run against the program, with respect to that characteristic. Now count how many you have run. That percentage is your coverage measure. You want to count def-use pairs? Go ahead. Statements? Branches? Subpaths of lengths N? If you can count it, you can report coverage against it.

Understand that coverage is not only countable against internal, structural criteria. When I was development manager for a desktop publishing program, our most important coverage measure was percentage of printers tested, from a target pool of mass-market printers. At that time, we had a lot of custom code tied to different printers. Each new working printer reflected a block of capability finally working. It also reflected a barrier to product release being removed, because we weren't going to ship until we worked with our selected test pool. For us, at that time, on that project, knowing that we were at 50% printer coverage was both a meaningful piece of data and a useful focuser of work.
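
A minimal sketch of the definition above, assuming coverage is simply "items exercised divided by items in the chosen pool". The function name and the printer model names are hypothetical, made up for illustration only.

```python
def coverage(tested, target_pool):
    """Fraction of the target pool that has been exercised at least once."""
    pool = set(target_pool)
    if not pool:
        raise ValueError("target pool is empty; coverage is undefined")
    # Items tested outside the pool do not count toward coverage.
    return len(pool & set(tested)) / len(pool)


# Hypothetical pool of mass-market printers (invented names).
printer_pool = ["LaserA", "LaserB", "InkjetC", "DotMatrixD"]
printers_tested = ["LaserA", "InkjetC"]

printer_coverage = coverage(printers_tested, printer_pool)
print(f"printer coverage: {printer_coverage:.0%}")
```

The same function works for statements, branches, or def-use pairs: only the pool changes, which is why the number means nothing until the pool is named.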

We can measure coverage against assertions of the specification, coverage of individual input/output variables (count the number of variables and for each one, test minima, maxima, out-of-bounds, and special cases), combinatorial coverage (all-pairs, all-triples, all-quadruples, whatever your coverage criterion is for deciding which variables to test to what degree of interaction with other variables).
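
To make two of these criteria concrete, here is a rough sketch: boundary values for one integer variable, and all-pairs coverage of a small test set. The variable names and domains are invented for illustration, and a real all-pairs tool would also generate the tests rather than only score them.

```python
from itertools import combinations, product


def boundary_values(lo, hi):
    """Just-below, minimum, maximum, and just-above values for an integer range."""
    return [lo - 1, lo, hi, hi + 1]


def all_pairs(domains):
    """Every value pair (across two different variables) that all-pairs testing must hit."""
    required = set()
    for (name_a, vals_a), (name_b, vals_b) in combinations(sorted(domains.items()), 2):
        for a, b in product(vals_a, vals_b):
            required.add(((name_a, a), (name_b, b)))
    return required


def pairs_hit(tests):
    """Value pairs actually exercised by a list of test dicts."""
    hit = set()
    for test in tests:
        hit.update(combinations(sorted(test.items()), 2))
    return hit


# Hypothetical input variables and their values.
domains = {"font": ["serif", "sans"], "paper": ["A4", "letter"], "duplex": [True, False]}
tests = [
    {"font": "serif", "paper": "A4", "duplex": True},
    {"font": "sans", "paper": "letter", "duplex": False},
]

required = all_pairs(domains)
pair_coverage = len(pairs_hit(tests) & required) / len(required)
print(boundary_values(1, 100))
print(f"all-pairs coverage: {pair_coverage:.0%}")
```

Two tests touch every value of every variable at least once, yet they reach only half of the twelve required pairs. Which number gets reported as "coverage" depends entirely on the criterion chosen.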

At a meeting of the Software Test Managers Roundtable, we identified hundreds of potential coverage measures. I listed 101 (just to provide a long sample of the huge space possible) coverage measures in my paper Software Negligence & Testing Coverage, http://www.kaner.com/pdfs/negligence_and_testing_coverage.pdf

So, yes, there is a lot of ambiguity about what "coverage" means.

It is seductive to identify specific attributes as THE attributes of interest, but if you focus your testing on attribute X, you will tend to find certain types of errors and miss other types of errors. Complete coverage against X is not complete coverage. It is just complete coverage against X.

For example, suppose we achieve 100% statement coverage. That means we executed each statement once. In an interpreted language, this is useful because syntax errors are detected in real time (at execution time) and not during compilation. So 100% statement coverage assures that there are no syntax errors (unnecessary assurance in a compiled language, because the compiler does it already). However, it offers no assurance that the program will process special cases correctly, that it will even detect critical special cases (if there are no statements to cover divide-by-zero, you can test every statement and never learn that the program will crash when certain variables take a zero value.) You never learn that the program has no protection against buffer overflows, that it is subject to serious race conditions, that it crashes if connected to an unexpected output device, that it has memory leaks, that it corrupts its stack, that it adds input variables together in ways that don't guard against overflow, and on and on and on.
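
A small illustration of the divide-by-zero point, using a hypothetical function: a single test executes every statement, and the crash on a zero value is still there.

```python
def average_page_cost(total_cost, pages):
    # Both statements run in the single test below,
    # yet nothing guards against pages == 0.
    cost = total_cost / pages
    return round(cost, 2)


# One test executes every statement: 100% statement coverage.
assert average_page_cost(10.0, 4) == 2.5

# The special case that no statement handles still crashes.
try:
    average_page_cost(10.0, 0)
    crashed = False
except ZeroDivisionError:
    crashed = True

print("crashes on zero pages:", crashed)
```

A statement-coverage report cannot flag this, because the missing guard is code that was never written, and there is nothing there to cover.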

When you focus programmers / testers on a specific coverage measurement, they optimize their testing for that. As a result, they achieve high coverage on their number but low coverage against the other attributes. Brian Marick has written and talked plenty about the ways in which he saw coverage-focused testing cause organizations to achieve better metrics and worse testing. This is the kind of side effect Bob Austin wrote about, and the kind that almost none of the metrics papers in the ACM/IEEE journals even mention the possibility of.

People often write about their favorite coverage metric as "coverage" rather than "coverage against attribute X" -- but if by "coverage", we want to mean how much of the testing that we could have done that we actually did, then we face the problem that the number of tests for any nontrivial program is essentially infinite, even if you include only distinct tests (two tests are distinct if the program could pass one but fail the other). If we measure coverage against the pool of possible tests rather than against attribute X, our coverage is vanishingly small (any finite number divided by infinity is zero).

>
> It seems that the whole point with metrics is to put them
> into context, understand the narrow story they tell about
> the system being measured and then make intelligent
> decisions. To throw complexity or coverage out completely
> seems to insist that since we have no perfect answers we
> should give up and go home.
>
> Your statement about complexity was a further curiosity. I
> think you made a slight equivocation. When someone tells
> you about the complexity as measured by the decision
> points, I hope it is understood by both of you that you
> are using jargon. "Complexity" in this instance only
> references McCabe's work. And hopefully, you both realize
> that within that context it is a measure (or a metric) for
> an aspect of the system that seems to be somewhat
> correlated with defect density (check McCabe's 96 NIST
> report where he points to a couple of projects that saw a
> correlation.) Based on that context, a complexity score is
> possibly a useful thing to know and to use for improving
> the software.

McCabe's metric essentially counts the number of branches in a method. Big deal.
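
To show how little is involved, here is a rough approximation of that count for Python source, using the standard `ast` module. It is a simplification (decision points plus one, ignoring constructs such as comprehension filters and `match`), not McCabe's tooling or any particular product's implementation.

```python
import ast

# Node types treated as decision points in this simplified count.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source):
    """Approximate McCabe number for a snippet: decision points + 1."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds one extra path per additional operand.
            decisions += len(node.values) - 1
    return decisions + 1


SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

sample_score = cyclomatic_complexity(SAMPLE)
print(sample_score)
```

The counter never looks at names, comments, or what the code is for, so renaming every variable to a single letter would leave the score unchanged.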

Structural complexity metrics, which are often marketed as "cognitive complexity" metrics, completely ignore the semantics of the code. Semantic complexity is harder to count, so we ignore it.

Yes, structural complexity is one component of the maintainability problem. But so is comprehensibility of variable names, adequacy and appropriateness of comments, coherence of the focus of the method, and the underlying difficulty of the aspect of the world that is being modeled in this piece of code.

Defining a metric focuses us toward optimizing those aspects of our work that are being measured. And taking work / focus away from those aspects that are not being measured. Choosing to use a structural "complexity" metric is a choice about what kinds of things actually make code hard to read, hard to get right, hard to fix, and hard to document.

I've seen some of the correlational studies on structural metrics. Take some really awful code and some really simple code. Those are your anchors. The simple, reliable code has good structural statistics, the awful code is terrible by any measure, and the correlation will show up as positive because of the end points even if the intermediate values are almost random.
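
The anchor effect is easy to reproduce with made-up numbers. In the sketch below (all data invented for illustration), the four intermediate modules have exactly zero correlation between a structural score and a defect count; adding one very simple and one very awful module pushes the Pearson coefficient above 0.9.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical (structural score, defect count) pairs for mid-range modules.
middle_scores = [4, 4, 6, 6]
middle_defects = [4, 6, 4, 6]

# Two anchors: trivially simple code and truly awful code.
all_scores = [0] + middle_scores + [10]
all_defects = [0] + middle_defects + [10]

r_middle = pearson(middle_scores, middle_defects)
r_all = pearson(all_scores, all_defects)
print(f"middle only: r = {r_middle:.2f}")
print(f"with anchors: r = {r_all:.2f}")
```

A strong overall correlation here says nothing about whether the metric can rank the ordinary modules, which is where most real decisions are made.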

If you want to figure out what aspects of programs create complexity, one of the obvious ways is to put code in front of people and assess their reactions. How complex do they think it is? (People can report their level of subjective complexity. Their reports are not perfect, and there are significant practice effects before irrelevant biasing variables get weeded out, but we ask questions like this all the time in psychophysical research and get useful data that drives advances in stereo systems, perfumes, artificial tastes in foods, lighting systems, alarms, etc.) You can also measure how long it takes them to read the code, where duration is measured as the time until they say that they feel like they understand the code. Or you can suggest a specific code change and see how long it takes them to successfully change the code in that way. We have plenty of simple dependent variables that can be used in a laboratory setting. The research program would crank through different attributes of software, comparing the impacts on the dependent variables. This is the kind of work that can keep a labful of grad students busy for a decade. I'd be surprised if it wasn't fundable (NSF grants). I've been astonished that it hasn't been done, it's so obvious. (Yes, I know, I could do it. But I have too many projects already and not enough time to do them.)

>
> Later you say, "When we try to manage anything on the
> basis of measurements that have not been carefully
> validated, we are likely to create side effects of
> measurement ...
> There is a lot of propaganda about measurement, starting
> with the fairy tale that "you can't manage what you don't
> measure." (Of course we can. We do it all the time.)"
>
> So, this seems to contradict itself. If I understood the
> aphorism about managing and measuring, admittedly I
> haven't heard Tom DeMarco say it personally, what I took
> it to mean is that there is an implied "good" after the
> word 'managing'. That is, he was saying, we cannot do a
> good job managing without measuring.

Are you aware that Tom has repeatedly, publicly retracted this comment?

>
> As to your summary point, I think we agree. It takes a lot
> of thinking to do metrics right. Most people get them
> wrong. We should spend tons of money on research that
> validates metrics. (I am willing to co-write a grant to
> study crap4j if anyone is game?)
>
> What I disagree with is a perception that metrics are not
> useful, that we are managing just fine without them, and
> that because some people misuse them (over and over again
> no less) that nobody should use them without exorbitant
> expenditures of time and money. It sounds a lot like
> trying to ignore the problem.

I spent a lot of years developing software and consulting to development companies before coming back to universities. Almost no one had metrics programs. Capers Jones claimed that 95% of the software companies he'd studied didn't have metrics programs. I hear time and again that this is because these companies lack the discipline or the smarts. What I heard time and again from my clients was that they abandoned the metrics programs because those programs did more harm than good. It is not that they are ignoring the problem or that they think there is no problem. It is that they have no better alternative to a multidimensional, qualitative assessment, even though that is unreliable, difficult, and inconsistent.

You can cure a head cold by shooting yourself in the head. Some people would prefer to keep the cold.
>
> We must keep trying to improve our measures by studying
> them, by validating them, and by improving them based on
> that study.

Remarkably little serious research is done on the quality of these measures.

> And without a doubt, it requires a coherent
> approach, and a clear understanding of what is being
> measured -- whether we call it construct validity or
> something else.
>
> Are we actually in violent agreement?

One of the not-so-amusing cartoons/bumper-stickers/etc. that I see posted on cubicle walls at troubled companies states, "Beatings will continue until morale improves." OK, obviously, morale is a problem and something needs to be done. But beatings are not the solution. In a dark period of the history of psychology, we got so enamored with high tech that we used the high-tech equivalent of beatings (electroshock therapy) to treat depression. It didn't work, but it was such a cool use of technology that we applied this torture to remarkably many people for a remarkably long time.

We have a serious measurement problem in our field. There are all sorts of things we would like to understand and control better. But we don't have the tools and I see dismayingly little effort to create well-validated tools. We have a lot of experience with companies abandoning their metrics programs because the low-quality tools being pushed today have been counterproductive.

We are not in violent agreement.

I see statistics like crap4j as more crappy ways to treat your head cold with a shotgun and I tell people not to rely on them. Instead, I try to help people think through the details of what they are trying to measure (the attributes), why those are critical for them, and how to use a series of converging, often qualitative, measurements to try to get at them. It's not satisfactory, but it's the best that I know.

-- cem kaner

Cem Kaner

Posts: 4
Nickname: cemkaner
Registered: Nov, 2007

Re: Software Metrics Don't Kill Projects, Moronic Managers Kill Projects

Posted: Dec 19, 2007 2:17 PM

Alberto Savoia wrote:

> Cem,
>
> I believe that software metrics can be, and often are, abused,
> misused, overused, confused, etc. And that A LOT more
> work, experiments, and research is needed to improve the
> current state of affairs. Not to mention educating people
> in the proper ways to use (or not use) metrics.
>
> But if I interpret your post (and position) correctly, the
> only conclusion I can draw is that - as of today - we
> should not be using ANY metric. Zero, nada, nyet. Is that
> the case? Should we burn any and all metrics tools?
> Remove all code coverage from our IDEs?

Sure, I want to achieve 100% statement/branch coverage of code that I write. Before I had tools, I had to use a yellow highlighter on a code listing (really, we did this at Telenova, a phone company I programmed at for 5 years in the 1980's). It was valuable, but it was very tedious. Having simple coverage monitors in Eclipse makes my life easier.

But I think this is the beginning of testing, not the end of it. And in finite time, I am perfectly willing to trade off some other type of testing against this. I see this as a tool for me as a programmer, not as a management tool. As soon as it turns into a management tool, I want to burn it because of the side effects.

If I were assessing someone's code (as their manager), there are several questions that I'd want to balance, such as:

- does it work?
- what evidence has this person collected to suggest that it works, or works well enough?
- is it a straightforward implementation?
- can I understand the code?
- is it usable (either at the UI level or in its interface to the relevant other parts of the application)?
- how much did this cost and why?
- did s/he consider implementation cost and quality explicitly in the design and implementation? What evidence?

Code coverage (take your pick of metrics) is a tiny part of this picture. The more I focus on it, the more tightly I am hugging one tree in a big forest.

One approach is to combine several simplistic metrics (hug a few trees), but just as coverage is a terribly weak indicator of how well the code has been tested, many of these other metrics are weak indicators of whatever they are supposed to measure. Combining them gives an appearance of much greater strength (dashboards or balanced scorecards are very impressive) but they still provide very little information against the questions I'm asking.

The questions are much more critical than the metrics.

It is common to teach an approach to measurement called Goal / Question / Metric. You define a measurement goal ("I want to understand the productivity of my staff in order to manage my project's costs and schedule better") and then a few questions ("What is the productivity of my staff?" "How would a change in productivity impact my costs?") and then a metric or two per question.
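
For readers who have not seen Goal / Question / Metric written down, it is essentially a small tree. The sketch below is one hypothetical way to represent it; the class names, the candidate metrics, and the `thinly_covered` helper are all invented for illustration and are not part of any GQM standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Question:
    text: str
    candidate_metrics: List[str] = field(default_factory=list)


@dataclass
class Goal:
    statement: str
    questions: List[Question] = field(default_factory=list)


def thinly_covered(goal, minimum):
    """Questions backed by fewer than `minimum` candidate measurements."""
    return [q.text for q in goal.questions if len(q.candidate_metrics) < minimum]


goal = Goal(
    statement="Understand staff productivity to manage cost and schedule better",
    questions=[
        Question("What is the productivity of my staff?",
                 ["features delivered per iteration", "rework hours per feature"]),
        Question("How would a change in productivity impact my costs?",
                 ["cost per delivered feature"]),
    ],
)

flagged = thinly_covered(goal, minimum=3)
for text in flagged:
    print("needs more candidates:", text)
```

Both questions are flagged here, because one or two easily computed numbers per question is all the tree contains.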

One of the exercises we do in my metrics class is to pick a question and take it seriously. Suppose we really wanted an answer to the question. What kinds of information would we have to collect to get that answer? We often come up with lists of several dozen candidates, some of which are easy to translate to numbers and others that need a more qualitative assessment. It is so very tempting to pick one or two easy ones to calculate, declare that these are a sufficient sample of the space of relevant metrics, and then manage on these. And that temptation is so very dangerous in terms of the side effects.

>
> Are you suggesting a full moratorium on all metrics until
> we have invested a few years in "longer term research that
> might provide much more useful answers in the future."?

I am not suggesting a moratorium on management. I am suggesting a moratorium on the hype. I am suggesting a huge increase in the humility index associated with the statistics we collect from our development and testing efforts. I am suggesting a fundamental refocusing on the questions we are trying to answer rather than the statistics we can easily compute that maybe answer maybe some of the questions maybe to some unknown degree with some unconsidered risk of side effects. I am suggesting that we take the risks of side effects more seriously and consider them more explicitly and manage them more thoughtfully. And I am saying that we demand research that is much more focused on the construct and predictive validity of proposed metrics, with stronger empirical evidence--this is hard, but it is hard in every field.

> P.S. Cem, while we might hold different opinions on how to
> improve/fix the state of software metrics, I believe we
> share several common goals. I have enormous respect for
> you, your work, and your passion for software quality and
> testing (which we share.) Not to mention the fact that I
> really like you on a personal level :-). I hope that this
> post is interpreted in the spirit in which it was written
> (i.e. a true desire to confirm my understanding of your
> position, not poke fun at it) and that we can continue
> this discussion in a constructive way that will help us
> (and the readers) gain a better understanding of different
> positions.

Alberto, I wrote my last note (the one on your blog post that follows up to this post), speaking to you by name, because I respect you enough and like you enough to say that I'm disappointed. I've spent a lot of writing hours on this thread--usually I skip blog posts on what I think of as overly simplistic approaches to software measurement, but I put a lot of time into this one because it is your thread. That makes it worth my attention.

-- cem

Alberto Savoia

Posts: 95
Nickname: agitator
Registered: Aug, 2004

Re: Software Metrics Don't Kill Projects, Moronic Managers Kill Projects

Posted: Dec 19, 2007 3:56 PM

Cem Kaner wrote:
-----------------------------------------------------------
I am suggesting a huge increase in the humility index associated with the statistics we collect from our development and testing efforts. I am suggesting a fundamental refocusing on the questions we are trying to answer rather than the statistics we can easily compute that maybe answer maybe some of the questions maybe to some unknown degree with some unconsidered risk of side effects. I am suggesting that we take the risks of side effects more seriously and consider them more explicitly and manage them more thoughtfully. And I am saying that we demand research that is much more focused on the construct and predictive validity of proposed metrics, with stronger empirical evidence--this is hard, but it is hard in every field.
------------------------------------------------------------

Cem,

I agree with everything you say in the above paragraph. Believe it or not, the goals of C.R.A.P. when we started were very similar to the ones you state. Especially the humility part (although, given my personality, that translates into "let's not take ourselves too seriously"), focusing on specific attributes, collecting data, doing more research, keeping the metric and the thinking behind it open so people can do their own experiments, etc.

Here's some unedited text from one of the earliest C.R.A.P. posts in July of this year:

-----------------------

Below is some of our thinking behind the C.R.A.P. index:

[] We believe that software metrics, in general, are just tools. No single metric can tell the whole story; it’s just one more data point. Metrics are meant to be used by developers, not the other way around – the metric should work for you, you should not have to work for the metric. Metrics should never be an end unto themselves. Metrics are meant to help you think, not to do the thinking for you.

[] We believe that, in order to be useful and become widely adopted, a software metric should be easy to understand, easy to use, and – most importantly – easy to act upon. You should not have to acquire a bunch of additional knowledge in order to use a new metric. If a metric tells you that your inter-class coupling and coherence score (I am making this up) is 3.7, would you know if that’s good or bad? Would you know what you need to do to improve it? Are you even in a position to make the kind of deep and pervasive architectural changes that might be required to improve this number?

[] We believe that the formula for the metric, along with various implementations of the software to calculate the metric should be open-source. We will get things started by hosting a Java implementation of the C.R.A.P. metric (called crap4j) on SourceForge.

[] The way we design, develop, and deploy software changes all the time. We believe that with software metrics, as with software itself, you should plan for, and expect, changes and additions as you gain experience with them. Therefore the C.R.A.P. index will evolve and, hopefully, improve over time. In that spirit, what we present today is version 0.1 and we solicit your input and suggestions for the next version.

[] We believe that a good metric should have a clear and very specific purpose. It should be optimized for that purpose, and it should be used only for that purpose. The more general and generic a metric is, the weaker it is. The C.R.A.P. index focuses on the risk and effort associated with maintaining and changing an existing body of code by people other than the original developers. It should not be abused or misused as a proxy for code quality, evaluating programmers’ skills, or betting on a software company’s stock price.

[] Once the objective for the metric is established, the metric should be designed to measure the major factors that impact that objective and encourage actions that will move the code closer to the desired state with respect to that objective. In the case of C.R.A.P., the objective is to measure and help reduce the risks associated with code changes and software maintenance – especially when such work is to be performed by people other than the original developers. Based on our initial studies and research on metrics with similar aims (e.g., the Maintainability Index from CMU’s Software Engineering Institute) we decided that the formula for version 0.1 of the C.R.A.P. index should be based on method complexity and test coverage.

[] There are always corner cases, special situations, etc., and any metric might misfire on occasion. For example, C.R.A.P. takes into account complexity because there is good research showing that, as complexity increases, the understandability and maintainability of a piece of code decreases and the risk of defects increases. This suggests that measuring code complexity at the method/function level and making an effort to minimize it (e.g. through refactoring) is a good thing. But, based on our experience, there are cases where a single method might be easier to understand, test, and maintain than a refactored version with two or three methods. That’s OK. We know that the way we measure and use complexity is not perfect. We have yet to find a software metric that’s right in all cases. Our goal is to have a metric that’s right in most cases.

...

Software metrics have always been a very touchy topic; they are perfect can-of-worms openers and an easy target. When we started this effort, we knew that we’d be in for a wild ride, a lot of criticism, and lots of conflicting opinions. But I am hopeful that – working together and with an open-source mindset – we can fine-tune the C.R.A.P. index and have a metric that will help reduce the amount of crappy code in the world.

OK. Time for some feedback – preferably of the constructive type so that C.R.A.P. 0.2 will be better than C.R.A.P. 0.1.

----------------

I'd like to think that the above thinking provides evidence on our part of humility, awareness of the many inadequacies of any metric, the potential for misuse, the need to focus on specific attributes (which, for CRAP, is maintainability by developers other than the original developers - not quality), the need to test predictive power, etc.

Cem Kaner wrote:

-----------------------------------------------------------
And I am saying that we demand research that is much more focused on the construct and predictive validity of proposed metrics, with stronger empirical evidence--this is hard, but it is hard in every field.
-----------------------------------------------------------

I want that too; but in order to test the construct validity and predictive value we need to have some metrics to test with, and some people willing to use them on their projects (real-world projects) and also willing to share the data as well as their opinion of the metric "readings". You say, "this is hard", and I could not agree more, but we gotta start somewhere. The latest version of crap4j offers an embryonic mechanism to encourage data sharing. It's very, VERY, primitive and limited at this time but you can get a flavor of it at:
http://crap4j.org/benchmark/stats/ and use your imagination for how it might be evolved and used.

We don't want to do this work alone. We are looking for other people to push back, propose and test completely different measures and formulae, etc. That's why all the code is open-source. Of course, it would be great to have a combination of industry and academic people working on "the next generation of metrics". Given how strongly and passionately you feel about the topic, is this something that you (or some of your students/colleagues) might be interested in?

Alberto

P.S.

Cem Kaner wrote:
------------------------------------------------------------
Alberto, I wrote my last note (the one on your blog post that follows up to this post), speaking to you by name, because I respect you enough and like you enough to say that I'm disappointed. I've spent a lot of writing hours on this thread--usually I skip blog posts on what I think of as overly simplistic approaches to software measurement, but I put a lot of time into this one because it is your thread. That makes it worth my attention.
------------------------------------------------------------

Hopefully, reading some of the material in this reply (as well as from previous posts) gives you a bit more context for my last two posts and a better perspective on what we are trying to accomplish.

I also spent a lot of writing hours on these replies for the same reasons you mention (including last night past 11PM - when I told my wife what I was doing she thought I was crazy :-)). I appreciate the respect, return it several fold, and - if at all possible from your end - I would love an opportunity to continue this discussion offline and see if we can find a way to work together, or at least with more awareness of each other, going forward.

Alberto

Thomas Cagley

Posts: 1
Nickname: tcagley
Registered: May, 2008

Re: Software Metrics Don't Kill Projects, Moronic Managers Kill Projects

Posted: May 16, 2008 1:38 PM

I think the sentiment is correct: good metrics do not kill projects. However, metrics that do not match corporate or organizational goals can damage, if not kill, a project (and an organization).

Tom Cagley
Software Process and Measurement Cast
www.spamcast.net

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use