What is the role of academic research in software engineering? Should it happen at all?
I just arrived back from the XP2005 conference, which was held last week in Sheffield, UK. I was a little scared that it wouldn't be as vibrant as some of the other conferences I've been to recently, because it didn't seem to be publicized very much. However, the content was very good, and I'm glad that I attended some of the workshops. The panels were interesting, and then of course there were the hallway conversations. To me, hallway conversations are often the most educational part of a conference. So many people, so many ideas. It was fun.
When I was ready to leave, however, I started to think about the conference a bit more solemnly. XP2005 was the sixth in a series of conferences on 'Extreme Programming and Flexible Processes in Software Engineering.' The conferences started with XP2000 in Sardinia, organized by Dr. Michele Marchesi. Each year since then, one has been organized at a different location in Europe, well attended, and always interesting. However, of all the conferences I've ever visited, the XP200x conferences seem to be the most schizophrenic. They are usually hosted by people in academia, and they have a large number of refereed papers, but the papers just don't seem to impact attendees much. Or, at least, they aren't a very large part of the conversation that floats around the conference. The XP conferences aren't alone in this. There are a couple of other large conferences (OOPSLA is one) which attempt to colocate researchers and practitioners. Walk down any hallway at OOPSLA and you'll find people who are mulling over the intricacies of Java 5 annotations and others who are mulling over typed calculi for objects. Only a small subset of them mingle much, but it still feels like they have more in common than the researchers and practitioners at XP2005. Perhaps I shouldn't be surprised at this. XP is definitely a practitioner's discipline. I fall into that camp, and it is pretty easy for many practitioners to feel that, well, academic research often doesn't have much relevance for us out in the field. But we should really ask ourselves whether that is true, and if so, whether there is anything that can be done about it.
Let's look at an example. At the conference, there was a paper titled Testing with Guarantees and the Failure of Regression Testing in eXtreme Programming. You'd think that a paper with that title would raise a few eyebrows at an XP conference, but the eyebrows I saw were notably relaxed. I can paraphrase the author's argument like this: there are refactorings which can increase the number of possible states a program can be in. If we write tests for some functionality and then use one of these refactorings, we could introduce an error that the tests wouldn't catch. For this reason, it doesn't make sense to say that the unit tests we write when we use test-driven development are a sufficient regression suite for refactoring. The author of the paper uses a stack as an example. He gives an abstract specification of a stack in the form of a state chart, and describes the tests he would use to drive the creation of the stack using a list. The refactoring that he proposes is to change the stack so that it uses an array that grows in increments. He gives another state chart and shows that the new code can only be covered if he adds more tests (a couple more push calls, to be exact). Although it isn't mentioned in the paper, the refactoring is 'Substitute Algorithm' from Martin Fowler's Refactoring book, and it isn't clear to me that it is even a refactoring when used that way, because writing explicit code to grow an array doesn't seem to have any maintenance advantages over using a list. But that's a technicality. Perhaps another example would be better, and doubtless there is one.
My first reaction when I read the paper was: okay, but I wouldn't do it that way. If I were adding code to grow a stack in increments, I'd definitely add tests to make sure that I was handling the transition from one array size to another. But then I thought, well, there is an important point here. In the XP community, we often do tell people that the tests they've written so far protect them when refactoring. Anyone who doesn't wait for the qualification -- the statement that it isn't perfect -- might think that they can mechanically produce correct code without thinking about their test cases at all.
Good point, I said to myself. It wasn't the researcher's point, but it reminded me of a fact about testing and refactoring, and it will affect how I teach people in the future. But where does that leave us with regard to the paper? It definitely didn't prove the "failure of regression testing in eXtreme programming" (although the title did attract my attention). What it did do was set up an artificial condition and show how an XP guideline, applied mechanically, fails under that specific condition.
I don't know what we can expect from software engineering research. It's admittedly hard and expensive to do experiments with real teams. And, frankly, many of the more pernicious problems in the field are social rather than technical. I think the thing that bothers me most about much current research is that there are far more obvious questions to ask. Researchers could ask: is there an appreciable bug reduction when using TDD (my experience makes it obvious to me that there is, but it would be nice to have studies to point to), and if there is, what can it be attributed to? My own pet theory is that the quality improvement doesn't really come from coverage or the number of tests that we write directly as much as from the mode of thought that we have to get into to write them. Tests encourage more rigorous thought, and better thought in my opinion. It's thought that's grounded in immediate feedback. Am I right? I don't know, but I think it is an interesting question, and it leads to others. What is a "mode of thought"? How do we test such a thing? If we discovered that this mode of thought is the thing that leads to quality, would we do anything differently?
A few months ago, I was talking to Steve Freeman, and he mentioned that maybe the programming community should be reaching for some different academic connections. Maybe the research connections we need should come from social sciences, operations research, and psychology? It makes sense to me. I think there is still a role for research in software engineering, but the questions we need answers to now are a bit different, and the computer science toolkit only extends so far.
There are people out there doing empirical research in software engineering, but unfortunately not that many. It's hard for a researcher to collect any data on "real" (i.e. non-toy) projects, forget about controlled experiments. It's really hard to measure the variables of interest in software engineering (Measuring effort is surprisingly hard, and the community has no accepted definition of community). And the truth is that this kind of work just isn't very sexy. Computer science types are generally not interested in doing social-science style experiments and observing human subjects. Even if they were, they certainly aren't trained in it.
Maybe you should come to Australia and check us out!
Seems like an opportunity for practitioner organizations to reach a rough consensus on a list of the priority items that affect software development (people, organizational, technical, etc.). Companies and governments with lots of developers have an incentive to improve their development efficiency, and hence have an incentive to get some objective analysis of those issues. Academics offer relatively objective analysis. Companies and governments could sponsor such analysis. Organizations like the IEEE and ACM could help play matchmaker. Academics could propose such deals.
Maybe there are some missing participants in the conferences. Business people and entrepreneurs. Who has the biggest incentive to have software development done really well? Seems like owner managers.
At some point at university I realized that Software Engineering was not as technical as I had thought, but involved lots of social issues. I still took my degree, but, being more fond of the technical aspects, I made it almost as technical as I could...
(Side note: one of my friends, also with a degree in SE, changed to computer science for his Ph.D. studies in order to get out of the social and into the technical nitty-gritty... :-)
> There are people out there doing empirical research in software engineering, but unfortunately not that many. It's hard for a researcher to collect any data on "real" (i.e. non-toy) projects, forget about controlled experiments. It's really hard to measure the variables of interest in software engineering (Measuring effort is surprisingly hard, and the community has no accepted definition of community). And the truth is that this kind of work just isn't very sexy. Computer science types are generally not interested in doing social-science style experiments and observing human subjects. Even if they were, they certainly aren't trained in it.
I agree. Have you seen any attempts at connection with researchers in social sciences?
> Seems like an opportunity for practitioner organizations to reach a rough consensus on a list of the priority items that affect software development (people, organizational, technical, etc.). Companies and governments with lots of developers have an incentive to improve their development efficiency, and hence have an incentive to get some objective analysis of those issues. Academics offer relatively objective analysis. Companies and governments could sponsor such analysis. Organizations like the IEEE and ACM could help play matchmaker. Academics could propose such deals.
>
> Maybe there are some missing participants in the conferences. Business people and entrepreneurs. Who has the biggest incentive to have software development done really well? Seems like owner managers.
Thanks, Kelley. A week ago I was with Bob Martin at a conference, and he was approached by someone at a corporation who was trying to decide what research to sponsor. I think that is part of the answer: more sponsorship. I have an idea, though. I think that if, at conferences where researchers and practitioners were colocated, they surveyed the practitioners and had them rank the papers by relevance, it could give researchers a bit of feedback.
> I have an idea, though. I think that if, at conferences where researchers and practitioners were colocated, they surveyed the practitioners and had them rank the papers by relevance, it could give researchers a bit of feedback.

Great idea, Michael. Maybe practitioner orgs could administer the ranking. Maybe even an on-line ranking of papers, administered by conference promoters, IEEE, ACM, the Agile Alliance, etc. There may be some synergy in the ranking:
+ Feedback to researchers.
+ Helps practitioners find interesting papers.
+ Helps conference promoters decide what to do next year.
+ Helps publishers decide what to publish.
+ Helps practitioner orgs attract people to their websites.
Win-Win.
I believe that company-backed open source projects might be useful as "real world experiments". While not exactly meeting the "corporate standard", they're the best one can get for free. Especially considering it's probably the social aspect that researchers might be most interested in.
> I believe that company-backed open source projects might be useful as "real world experiments". While not exactly meeting the "corporate standard", they're the best one can get for free. Especially considering it's probably the social aspect that researchers might be most interested in.
Sounds like a much better kind of research than all those "open source survey" and "open source study" e-mails sent by students all over the world to my @sourceforge.net e-mail address!
> Maybe the research connections we need should come from social sciences, operations research, and psychology?
If people like you and Richard Gabriel don't stop this, Michael, we are in danger of approaching some genuine understanding and worthwhile learning here. There's progressive and there's radical, but you can take this stuff too far, you know! ;-)
Thanks for the continuous and continuing stream of good ideas. :-)
> > (Measuring effort is surprisingly hard, and the community has no accepted definition of community).
Whoops, I meant to say the community has no accepted definition of productivity.
> Have you seen any attempts at connection with researchers in social sciences?
As it happens, on the project I'm currently working on (as a lowly grad student), there are three social scientists that are also involved: a cognitive psychologist, a cognitive scientist, and a cultural anthropologist. (Two work for IBM, one works for Sun. Interestingly, IBM actually has a social computing group http://www.research.ibm.com/SocialComputing).
Otherwise, from what I've seen (and I can't really speak for the empirical community), there's been some borrowing of methods from experimental psychology to study individual programmers, but I haven't seen much software engineering research being done by or in collaboration with actual psychologists. (A notable exception that comes to mind is Janice Singer http://iit-iti.nrc-cnrc.gc.ca/personnel/singer_janice_e.html, who does research in software engineering and has a social-science background, if I recall correctly.)
I also haven't seen much sociology/anthropology style research that deals with issues like social interaction, culture, etc. Most of the social-science-like SE work I've seen is at the small-scale controlled-experiment level.
I think the human-computer interaction researchers are currently doing better than we are at collaborating with social scientists. For example, I know that Kent Norman (http://www.lap.umd.edu/LAPFolder/People/kent_norman/) at the University of Maryland is a psychology researcher that collaborates with the HCI group at the university.
I suspect that the gap (between the ideal of relevant, useful research and what we have) says more about the kind of people who are drawn to CS than anything else. I get the impression that the soft, warm, and fuzzy stuff that is needed to build good development teams makes a lot of CS people uncomfortable. I took a sojourn from software development in the early 90s and started a Masters degree in psychology. It was clear that, by and large, most research psychologists were happy to be research psychologists, and the field had well-established approaches to research, both qualitative and quantitative.
The CS software engineering research proposals I saw seemed, well, plain weird, as if they were studying some mutant idea of software development rather than the real activity as practiced. I couldn't grasp why this was, as scientists of all flavors typically have personal experience of software development, so this ungroundedness seemed surprising.
"Software engineering" could benefit from larger qualitative research projects that use real-world development projects and borrow social psychologists, sociologists, anthropologists, etc.
I thought I'd pipe in about empirical software engineering. In academia, there are groups of researchers that try to use empirical studies to explore and discover laws that could govern software. In many cases, people specialising in metrics have tried to justify why metrics are useful by using experiments. Consequently, there are multitudes of studies that try to find a relationship between metrics (like coupling or cohesion metrics) and software qualities (like fault-proneness).
The experimentation techniques are borrowed from behavioural studies. Personally, I believe that the main problem with academia is the lack of representative data. Heck, even in industry, interesting data (metrics) are rarely available. What SE academics can focus on are controlled experiments where all influencing factors can be controlled and varied (this is too expensive in most industrial contexts). The results of these controlled experiments could then be studied in a real industrial context (in a case study).
I think you're heading towards the area of Industrial Psychology. My understanding of that field is that it's concerned with the human interactions underlying the manufacturing process.
I think this gets to the point that true "computer science" is a very specific field that concerns provable algorithms, while the larger discipline of software development is (or should be) much closer to industrial psychology.
A specific example: the successful design of a Java development/runtime framework depends largely on how well people can understand it and modify software produced in that framework, not on the ability to verify requirements against produced code (although this is important too).
At a measurable level, this comes down to something close to the old minimizing of hand-eye movements in a manufacturing environment.
How many computer science theorems mention minimizing hand eye movements?