I’m doing a bit of work on XOM, trying to optimize and improve some of the Unicode normalization code. A lot of this is autogenerated from the Unicode data files, and I’m actually working on the meta-code that parses those files and then generates the actual shipping code. In this code, I’m setting up a switch statement like this one:
switch(i) {
case 0:
return result + "NOT_REORDERED";
case 1:
return result + "OVERLAY";
case 7:
return result + "NUKTA";
case 8:
return result + "KANA_VOICING";
case 9:
return result + "VIRAMA";
case 202:
return result + "ATTACHED_BELOW";
case 216:
return result + "ATTACHED_ABOVE_RIGHT";
case 218:
return result + "BELOW_LEFT";
case 220:
return result + "BELOW";
case 222:
return result + "BELOW_RIGHT";
case 224:
return result + "LEFT";
case 226:
return result + "RIGHT";
case 228:
return result + "ABOVE_LEFT";
case 230:
return result + "ABOVE";
case 232:
return result + "ABOVE_RIGHT";
case 233:
return result + "DOUBLE_BELOW";
case 234:
return result + "DOUBLE_ABOVE";
case 240:
return result + "IOTA_SUBSCRIPT";
default:
return result + "NOT_REORDERED";
}
And then I stop myself. Do you see the bug? Actually it’s a meta bug that leads to the true bug.
The problem I’ve noticed is that I really have no justification for supplying a default value. If I parse the Unicode data files, and the Unicode consortium has added a new combining class, this code should fail instead of merely assigning that new class to an existing class. So I change the default block like so to throw an exception instead; not that this will ever happen of course.
default:
throw new RuntimeException("New case! " + i);
}
You can probably guess what happens next: I run the code and the very first time, it throw the exception. It seems I forgot to handle the case of 560. OK. Simple enough to fix. I check the data files to see what case 560 means. Strange, it doesn’t seem to be there. Where’s 560 coming from? And that’s when I realize the problem is much, much worse than I thought.
The real bug is that I’ve been treating hexadecimal constants from the Unicode data as decimal constants in my Java code without conversion. Almost everything the code is doing is wrong. It should look like this:
switch(i) {
case 0x0:
return result + "NOT_REORDERED";
case 0x1:
return result + "OVERLAY";
case 0x7:
return result + "NUKTA";
case 0x8:
return result + "KANA_VOICING";
case 0x9:
return result + "VIRAMA";
case 0x202:
return result + "ATTACHED_BELOW";
case 0x216:
return result + "ATTACHED_ABOVE_RIGHT";
case 0x218:
return result + "BELOW_LEFT";
case 0x220:
return result + "BELOW";
case 0x222:
return result + "BELOW_RIGHT";
case 0x224:
return result + "LEFT";
case 0x226:
return result + "RIGHT";
case 0x228:
return result + "ABOVE_LEFT";
case 0x230:
return result + "ABOVE";
case 0x232:
return result + "ABOVE_RIGHT";
case 0x233:
return result + "DOUBLE_BELOW";
case 0x234:
return result + "DOUBLE_ABOVE";
case 0x240:
return result + "IOTA_SUBSCRIPT";
default:
throw new RuntimeException("New case! " + Integer.toHexString(i));
}
Now the code runs and no classes are missing. But then I get a sinking feeling in the pit of my stomach. What about the other XOM file where I’m using these values? Are those hexadecimal or decimal? Damn it, they’re decimal!
OK. Don’t panic yet. They’re private named constants. The code has been shipping in production for 3+ years now, and no one’s reported a bug here. Maybe it doesn’t matter? It looks like most of the uses of these constants just involve equality comparisons of named constants, so the bug tends to cancel itself out. There are a couple of places where I compare a class to zero, but 0×0 == 0 so that’s safe. And there’s one line that gives me the willies where I compare to classes for relative size, but again if 0×230 > 0×154, then it’s also true that 230 > 154. Whew, that was a close one. I probably would have noticed this years ago except that by freak accident I was working with over 70 different hexadecimal constants, none of which used the digits A-F.
Wait a minute. Is that really plausible? Either someone deliberately set up those constants that way, or the bug isn’t what I thought it was. Maybe they aren’t really hexadecimal constants after all? But then where was that 560 coming from when the data files have 230? Wouldn’t you know it? I was parsing two values out of a line, treating both as hexadecimal, when one is hexadecimal and one isn’t. Does it really make sense to include both hexadecimal and decimal constants in the same line? But there’s not a lot I can do about that, so I change Integer.parseInt(m.group(2), 16) to Integer.parseInt(m.group(2)) and run the program again. At least this means the existing code is safe with its decimal constants.
The problem turns out not to have been as bad as I thought it was. However, it was still a real problem, and one I would not have noticed or found had I simply filled in a default value. The moral of the story is, don’t try to account for code errors you don’t expect. If something “can’t happen”, make sure you throw a RuntimeException when it does. Don’t just silently ignore it. Your intuition about what a program will do is vastly inferior to the experimental observation of what it does do.
Some programmers hate adding uncaught, unhandled runtime exceptions that may surface at unexpected times. (This one, at least, surfaced immediately.) The usual claim is that it’s unnecessary because the problem “can’t happen”. However, if it’s really true that the problem can’t happen, then it won’t happen, and the exception won’t harm anyone. And if they’re wrong and the problem does happen anyway, well then, at least the debugging will be a lot easier. Silent failure is still failure, and is much harder to diagnose and cure than screaming program crash due to an unexpected runtime exception.
I did have one advantage in this case. My program was a one-off piece of throw-away code run only by me. It has to give me the answer once, and then I don’t have to bother with it again. The only user is me, and I know what to do if it breaks. Code that has clients other than the programmer who wrote it, and especially non-programmer end users, may need to be more careful than this. It’s often wise to put a generic catch (RuntimeException ex) block or even a catch (Throwable ex) block in your main() method to more carefully handle unexpected exceptions. Such a block can log and report the problem, while still shielding the end user from the issue.
However an exception caused by a case that “can’t happen” but does is no different in this respect than a library bug or a null pointer exception from your own code. And it is still preferable to have the application fail as early, as fast, and as close to the actual bug as possible rather than silently ignore the failure until some other block of code fails as a result further down the line. Worst of all, the code could well run but generate incorrect data that might even be passed through multiple systems in multiple continents in multiple time zones and organizations before someone notices. Try debugging that some time.

Read: In Praise of Draconian Error Handling, Part 1