Let's Reconsider That
Is it Academic?
by Michael Feathers
June 27, 2005

Summary
What is the role of academic research in software engineering? Should it happen at all?

I just arrived back from the XP2005 conference, which was held last week in Sheffield, UK. I was a little scared that it wouldn't be as vibrant as some of the other conferences I've been to recently, because it didn't seem to be publicized very much, but the content was very good. I'm glad that I attended some of the workshops. The panels were interesting, and then of course there were the hallway conversations. To me, hallway conversations are often the most educational part of a conference. So many people, so many ideas. It was fun.

When I was ready to leave, however, I started to think about the conference a bit more solemnly. XP2005 was the sixth of a series of conferences on 'Extreme Programming and Flexible Processes in Software Engineering.' The conferences started with XP2000 in Sardinia, organized by Dr. Michele Marchesi. Each year since then, it has been organized at a different location in Europe, well attended, and always interesting. However, of all the conferences I've ever visited, the XP200x conferences seem to be the most schizophrenic. They are usually hosted by people in academia, and they have a large number of refereed papers, but the papers just don't seem to impact attendees much. Or, at least, they aren't a very large part of the conversation that floats around the conference. The XP conferences aren't alone in this. There are a couple of other large conferences (OOPSLA is one) which attempt to colocate researchers and practitioners. Walk down any hallway at OOPSLA and you'll find people who are mulling over the intricacies of Java 5 annotations and others who are mulling over typed calculi for objects. Only a small subset of them mingle much, but it still feels like they have more in common than the researchers and practitioners at XP2005. Perhaps I shouldn't be surprised at this. XP is definitely a practitioner's discipline. I fall in that camp, and it is pretty easy for many practitioners to feel that, well, often academic research doesn't have much relevance for us out in the field. But we should ask ourselves whether that is really true, and if so, whether there is anything that can be done about it.

Let's look at an example. At the conference, there was a paper titled 'Testing with Guarantees and the Failure of Regression Testing in eXtreme Programming.' You'd think that a paper with that title would raise a few eyebrows at an XP conference, but the eyebrows I saw were notably relaxed. I can paraphrase the author's argument like this. There are refactorings which can increase the number of possible states a program can be in. If we write tests for some functionality and then use one of these refactorings, we could introduce an error that the tests wouldn't catch. For this reason, it doesn't make sense to say that the unit tests that we write when we use test-driven development are a sufficient regression suite for refactoring. The author of the paper uses a stack as an example. He gives an abstract specification of a stack in the form of a state chart, and describes the tests he would use to drive the creation of the stack using a list. The refactoring that he proposes is to change the stack so that it uses an array that grows in increments. He gives another state chart and shows that the new code can only be covered if he adds more tests (a couple more push calls, to be exact). Although it isn't mentioned in the paper, the refactoring is 'Substitute Algorithm' from Martin Fowler's Refactoring book, and it isn't clear to me that it is even a refactoring when used that way, because writing explicit code to grow an array doesn't seem to have any maintenance advantages over using a list. But that's a technicality. Perhaps another example would be better, and doubtless there is one.

My first reaction when I read the paper was: Okay, but I wouldn't do it that way. If I were adding code to grow a stack in increments, I'd definitely add tests to make sure that I was handling the transition from one array size to another. But then I thought, well, there is an important point here. In the XP community, we often do tell people that the tests that they've written so far protect them when refactoring. Anyone who doesn't wait for the qualification -- the statement that it isn't perfect -- might think that they can mechanically produce correct code without thinking about their test cases at all.
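To make the stack example concrete, here is a minimal Python sketch of the situation. It is not the paper's code or its test suite: the class names, the increment of 4, the deliberately broken `_grow`, and both test functions are assumptions invented for illustration. It shows tests written against a list-backed stack still passing after the switch to an incrementally grown array, even when the growth code is wrong, and a boundary test with a few more pushes catching the problem.

```python
class ListStack:
    """The original design: a stack backed directly by a list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


class ArrayStack:
    """After the change: fixed slots that grow in increments."""

    INCREMENT = 4  # hypothetical increment, chosen only for this sketch

    def __init__(self):
        self._slots = [None] * self.INCREMENT
        self._size = 0

    def push(self, item):
        if self._size == len(self._slots):
            self._grow()  # a new path that only runs at the capacity boundary
        self._slots[self._size] = item
        self._size += 1

    def _grow(self):
        self._slots.extend([None] * self.INCREMENT)

    def pop(self):
        if self._size == 0:
            raise IndexError("pop from empty stack")
        self._size -= 1
        item = self._slots[self._size]
        self._slots[self._size] = None
        return item

    def is_empty(self):
        return self._size == 0


class BrokenArrayStack(ArrayStack):
    """Same as ArrayStack, with a deliberate bug in the growth code."""

    def _grow(self):
        # Bug: allocates bigger storage but forgets to copy the old items.
        self._slots = [None] * (len(self._slots) + self.INCREMENT)


def original_tests(stack_class):
    """Tests that might have driven the list version: never more than two items."""
    s = stack_class()
    assert s.is_empty()
    s.push(1)
    s.push(2)
    assert s.pop() == 2
    assert s.pop() == 1
    assert s.is_empty()


def boundary_test(stack_class):
    """The extra test: push one item past the first increment, then unwind."""
    s = stack_class()
    count = ArrayStack.INCREMENT + 1
    for i in range(count):
        s.push(i)
    for i in reversed(range(count)):
        assert s.pop() == i
    assert s.is_empty()


def passes(test, stack_class):
    """Run a test function and report whether all of its assertions held."""
    try:
        test(stack_class)
        return True
    except AssertionError:
        return False
```

With these definitions, `passes(original_tests, BrokenArrayStack)` is true while `passes(boundary_test, BrokenArrayStack)` is false: the old tests never reach the capacity boundary, so they cannot say anything about the growth code either way.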

Good point, I said to myself. It wasn't the researcher's point, but it reminded me of a fact about testing and refactoring, and it will affect how I teach people in the future. But where does that leave us with regard to the paper? It definitely didn't prove the "failure of regression testing in eXtreme Programming" (although the title did attract my attention). What it did do was set up an artificial condition and show how an XP guideline, applied mechanically, fails under that specific condition.

I don't know what we can expect from software engineering research. It's admittedly hard and expensive to do experiments with real teams. And, frankly, many of the more pernicious problems in the field are social rather than technical. I think the thing that bothers me most about much current research is that there are much more obvious questions to ask. Researchers could ask: is there an appreciable bug reduction when using TDD (my experience makes it obvious to me that there is, but it would be nice to have studies to point to), and if there is, what can it be attributed to? My own pet theory is that the quality improvement doesn't come directly from coverage or the number of tests that we write as much as from the mode of thought that we have to get into to write them. Tests encourage more rigorous thought, and better thought in my opinion. It's thought that's grounded in immediate feedback. Am I right? I don't know, but I think it is an interesting question, and it leads to others. What is a "mode of thought"? How do we test such a thing? If we discovered that this mode of thought is the thing that leads to quality, would we do anything differently?

A few months ago, I was talking to Steve Freeman, and he mentioned that maybe the programming community should be reaching for some different academic connections. Maybe the research connections we need should come from social sciences, operations research, and psychology? It makes sense to me. I think there is still a role for research in software engineering, but the questions we need answers to now are a bit different, and the computer science toolkit only extends so far.

About the Blogger

Michael has been active in the XP community for the past five years, balancing his time between working with, training, and coaching various teams around the world. Prior to joining Object Mentor, Michael designed a proprietary programming language and wrote a compiler for it. He also designed a large multi-platform class library and a framework for instrumentation control. When he isn't engaged with a team, he spends most of his time investigating ways of altering design over time in codebases.

