One of my favorite stories from the early days of XP comes from some time I spent talking with Ward Cunningham at the original XP workshop in '99. Ward was describing a testing experience where he was writing a little Game of Life thing, and he was writing tests for it. He was writing a fixed-size game, so his tests were for a fixed-size grid. At one point, someone (Beck maybe?) convinced him to make the tests work for a generalized size, and this apparently had a positive effect on the process. Abstractions became apparent that made the game and tests easier to write, and made them run faster.
One thing that worries me about this story is that I might have myth-ized it as the years have moved on. I apologize up front for any misrepresentations.
Somewhere in there, I came to call one aspect of this the Test Tautology problem. "Test Tautology" is basically when your test code says the same thing, has the exact same algorithm, as the code it's testing. If there's an error in the algorithm, you tend to duplicate it in the test. I've come to the conclusion that you really want to find a different way to express the same thing if you can. If you can't, it's possible you're testing at the wrong point, or that you're testing meaningless stuff and wasting time that would be better spent on productive coding.
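To make the tautology concrete, here is a minimal sketch in Python. The `mean` function and the test names are hypothetical, invented just for illustration. The first test restates the implementation, so it would pass even if the formula were wrong in both places. The second says the same thing a different way: a hand-worked answer plus a couple of properties.

```python
def mean(values):
    """Hypothetical code under test: arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def test_mean_tautological():
    # Restates the implementation line for line, so a mistake in the
    # formula would be copied into the test and go unnoticed.
    values = [2, 4, 6]
    assert mean(values) == sum(values) / len(values)


def test_mean_independent():
    # A known answer worked out by hand.
    assert mean([2, 4, 6]) == 4
    # A property: the mean of identical values is that value.
    assert mean([7, 7, 7]) == 7
    # Another property: deviations from the mean sum to zero.
    values = [1, 2, 6]
    m = mean(values)
    assert sum(v - m for v in values) == 0


if __name__ == "__main__":
    test_mean_tautological()
    test_mean_independent()
```

Both tests pass here, but only the second one would have a chance of catching a wrong formula.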
One example I thought of in this regard recently was how one would write tests for, and then code, a widget that centers text. One might start with a test for left alignment. And then add a test for right alignment. And then the code. And then another test for centering, and of course then the code. Likely, the way this would go down is with some sort of symbol or other flag, and code that "branches" for each type of alignment.
If we back up a bit though, we can look at the "alignment" algorithm in abstract. It's really nothing more than
(containerWidth - textLineWidth) * someFraction
You can factor it the other way, but it works either way. Left is 0.0. Center is 0.5. Right is 1.0. Those just become helpers though. Your test is for an arbitrary fraction or two. And in the end, you've got something that can do the standard 3, but can do zany things like 0.4 too. Is that valuable? I don't know. The point is that this kind of test and code end up being less branchy and easier to trust in my experience. A "less is more" sort of thing. And a different way to express the center/left/right API.
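As a rough sketch of what that might look like in Python: the names `align_offset`, `LEFT`, `CENTER`, and `RIGHT` are my own placeholders, not from any real widget toolkit, and the function assumes widths are plain numbers and returns the offset of the text's left edge within the container.

```python
import math

# Helper constants for the standard three alignments (hypothetical names).
LEFT, CENTER, RIGHT = 0.0, 0.5, 1.0


def align_offset(container_width, text_line_width, fraction):
    """Offset of the text's left edge inside the container.

    No branching per alignment type: the fraction alone decides how the
    leftover space is split.
    """
    return (container_width - text_line_width) * fraction


def test_arbitrary_fractions():
    # The tests target arbitrary fractions, not the three named cases.
    assert align_offset(100, 20, 0.25) == 20
    assert align_offset(100, 20, 0.75) == 60
    # 0.4 is not exactly representable in binary, so compare with a tolerance.
    assert math.isclose(align_offset(100, 20, 0.4), 32.0)


def test_named_helpers():
    # The standard three fall out as special cases.
    assert align_offset(100, 20, LEFT) == 0
    assert align_offset(100, 20, CENTER) == 40
    assert align_offset(100, 20, RIGHT) == 80


if __name__ == "__main__":
    test_arbitrary_fractions()
    test_named_helpers()
```

The expected values in the tests were worked out by hand (80 units of leftover space times the fraction) and written as literals, so the tests don't just repeat the formula.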