The term code smell was popularized by Kent Beck in his book on software refactoring. According to the extreme programming wiki:
A code smell is a hint that something has gone wrong somewhere in your code. Use the smell to track down the problem... Highly experienced and knowledgeable developers have a "feel" for good design... They routinely practice good design without thinking about it too much, they find that they can look at a design or the code and immediately get a "feel" for its quality, without getting bogged down in extensive "logically detailed arguments"...
CodeSmell is a hint that something might be wrong, not a certainty. A perfectly good idiom may be considered a CodeSmell because it's often misused, or because there's a simpler alternative that works in most cases. Calling something a CodeSmell is not an attack; it's simply a sign that a closer look is warranted...
In a recent Simple Design and Testing Conference, Corey Haines and Naresh Jain hosted a session on code smell, focusing on two questions:
As an experienced developer, looking back, what do you think is the stinkiest code smell that has hurt you the most? In other words, which is the single code smell if you go after eradicating, most of the design problems in your code would be solved?
There are so many different principles and guidelines to help you achieve a good design. For new developers where do they start? Which is the one code smell or principle that we can teach new developers that will help them the most as far as good design goes (other than years of experience)?
In a recent blog post, Biggest Stinkers, Naresh Jain summarizes some of the conclusions from the session.
What were the biggest code stinkers in your experience?
I found a macro in a colleague's code that contained a 'return' and was named 'BOGO'. When asked to sort it out he merely changed the name to 'OBONGO'. He would also name various features after himself.
I would definitely say Code Duplication or Copy-paste programing. This is also a practice that I think rarely can be excused by "lack of time", since in a modern IDE, breaking out a method or extracting a superclass is done in a matter of seconds. So in my opinion, this is often a sign that unprofessional and/or unskilled developers has been in touch with the code.
I think the worst practice is lack of proper error handling. in my experience bugs where something appears to work but did not are the most problematic. The tend to go undetected.
I agree and I've seen while while working with less experienced programmers in C++ (or Java, Python and other exception provided languages) especially when transitioning from C or FORTRAN: they tend use exception in the very same way as they use the return codes. Although code standards try to give advice on error handling there's no universal policy: it depend on a per case basis and normally it becomes clearer at much later stage if a mistake has been done (kind of a catch-22).
But my favorite code stinker I've seen so far was the use of this-> in C++ classes: the reason for this pedantic approach by the poor guy was manifold. The original classes designer used to put several hundred attributes and methods public: they weren't structure-like classes and the attributes were referenced within the very same methods. The poor guy (that left the company before me) had to make it clear about what belonged to the class itself and what was locally declared with often overlapping names. In turn this mess was created by the very same "designer" not too keen in putting the usual #pragma once/#ifdef,#define guard in the header files with the very "funny" results that there was no more distinction between headers and module definitions, compiling the code required ages and the IDE couldn't figure the variable name scoping correctly anymore. In the end I've left the company because this "designer" refused to talk (at all) to me for three years making my life in an otherwise nice place a misery.
I always been a proponent of putting generic functionality into a common library whenever it can be identified. I really hate having to modify code where there are at least three different versions of the same function, all with different APIs, some with error trapping, some without.
> But my favorite code stinker I've seen so far was the use > of this-> in C++ classes:
We had one of those, but it was done because he used it to trigger Visual Studios 'code completion' to list the class members. He was too lazy to remove them afterwards.
Come to think of it, I think one of my indicators of 'code smell' is when you can easily see that the coder didn't really pay attention to detail. Messy/hacked-together code should raise alarm bells.
Code duplication is the most obvious one. But besides that, anything that is large. A function with too many lines. An enumeration of too many items. A class with too many data members or too many methods. A file with too many functions. When something is BOTH large AND code-duplicated hell is occuring..... :-(
To me, cut and paste coding is the biggest stinker. Especially when the various copies slowly drift apart. I worked on a project where all the JDBC queries were copied all over the place, then each used a slightly different way to obtain their Connection, and I had no real idea if that was deliberate or not. And none of the code closed the Connection in a finally clause...
As for a class being "too big", I'm less concerned. IMO, a class should do everything the class *should* do. Behaviorally complete. If that gives it 50 methods, I'm o.k. with that. Obviously, agreement on what a class "should" do is problematic.
Copy-paste is evil, but I would say poor/bad identifiers are even more so, especially when a large proportion of code is cobbled together using smelly identifiers. When authors aren't skilled enough to invent proper, accurate names for their classes, methods, fields and parameters, then the resulting code is a real nightmare to work with. Good job we've got Rename to try and fix this, one name at a time. In my experience, when class or method names defy rescue via a Rename, you're looking at code that requires a complete rewrite, lest you allow the entropy to increase exponentially.
One that hasn't been mentioned: Using blacklists instead of whitelists ("making failures work" or "don't constrain access to data")
E.g, template languages that expand missing variables to "" instead of failing to display, or creating databases without setting up the proper constraints, or filtering out dangerous characters in something instead of allowing safe characters.
This is the kind of smell that has probably cost me the most time. I remember once using about a month of time to clean up data because a developer half a year earlier had not taken the three hours to set up DB constraints. I've had several other cases where this kind of thing has dragged out fixes that should have taken hours to be a drudging "test out what happens with two thousand templates" effort.