In hindsight, it seems strange it has taken us so long to figure this out in the computing field. Applications elsewhere might be even more important.
In programming, the use of unit tests and acceptance tests is probably still overshadowed by "it works for me" approaches. It seems easier to get things to work once, declare them "done," and move on. However, once you understand how tightly coupled programs are, you also understand that any change to a program can cause unexpected effects to appear throughout the rest of the program. Upon seeing this, the only solution is relentless, automated testing. In fact, the rise of automated test suites shows the failure of decoupling. Decoupling is nonetheless still a good guideline -- it's just not a solution for program correctness and stability.
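To make the point concrete, here is a minimal sketch of what "relentless, automated testing" buys you; the function and its test are invented for illustration. The test pins down behavior so that a later, seemingly unrelated change can't silently break it:

```python
# Hypothetical example: a tiny pricing function and a regression test.
# All names here are invented for illustration.

def discounted_price(price, rate):
    """Apply a discount rate (0.0 to 1.0) to a price."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return round(price * (1 - rate), 2)

def test_discounted_price():
    # These assertions re-run automatically on every change; if an
    # edit elsewhere alters this behavior, the suite raises the red flag
    # instead of a user discovering it later.
    assert discounted_price(100.0, 0.25) == 75.0
    assert discounted_price(19.99, 0.0) == 19.99

test_discounted_price()
```

The point isn't the arithmetic; it's that "it works for me" is replaced by a check anyone can re-run at any time.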
The leaders in our community have seen that you can't just say something works a particular way. You have to test for it. The reason you need tests is that what you're testing can change over time, and this might in fact be why we tripped up. When you apply the scientific method to discover some principle of the universe, once that principle is uncovered it stays the same. By that reasoning, our code, once written, should also stay the same.
But the reason scientific theories are called theories is that, while your current tests might produce results that support the theory, someone might come up with a test in the future that requires a change to the theory. In evolution, for example, it was found that species are stable for periods, then undergo sudden changes -- this is called "punctuated equilibrium" and required changes to the way we look at the theory. In a stretch, you could think of it as "micro-creationism," but for people who don't like evolution this won't help, because all variations of evolutionary theory have things changing -- evolving -- and if you don't like the fact that things change, the only thing that will make you happy is a complete repeal of evolution; we'd have to pretend that we don't see species changing over time.
The reason I'm rehashing the basics of science is this: when people are involved, we can't assume that we're dealing with anything as relatively stable as scientific theories. People have the ability to be temporarily anti-entropic, to push things in directions that are ultimately "counter to the natural order of things" (I like to think that things like information and self-actualization are similar to entropy; that they somehow "want" to percolate through the world).
Gerald Weinberg says "Things are the way they are because they got that way ... one logical step at a time." I personally think that the word "logical" is a bit tongue-in-cheek; what I think he means is "logical based on the pressures at the moment," because this maxim is his way of explaining how companies and teams get themselves into completely contorted positions: one "logical" step at a time. If you look at each step, given the conditions and pressures at the moment that decision was made, the individual decision is "logical," even though the sum total of all the decisions may put the team or company into a completely crazy place.
There are two ways to help this situation. The most amazing solution would be to figure out the right set of pressures to prioritize so that the decisions are always going in the right direction to keep a company or team from going to crazy places. I don't think we yet have the tools to do this kind of social engineering; we've only begun to see the possibilities here.
So in the meantime, we need to fall back to the second approach, the one we discovered in our relatively newfound world of software development. Software tends to creep in bad directions, so we have to test it all the time. Human groups (companies and teams) also tend to creep (at least, when you don't know enough to socially engineer them) and so we need tests for those, too.
The example I'm working up to is the company philosophy and mission statement. If you've seen enough of these and the companies they are supposedly guiding, you know there's an almost universal disconnect between what a company says they're about and what they actually do. No one takes the company philosophy seriously because they know that the leaders of the company don't take it seriously. And that's because there's no way to know when it's been violated.
That's why we need tests. Things creep, and unless you know (A) when you need to raise the red flag and (B) that someone will pay attention when you do, a company philosophy is meaningless. Laughable, even.
I didn't realize this until I saw the Netflix company culture statement. Although this is in slide form, it was meant to be read rather than presented. The first thing you'll notice is that they didn't try to just come up with a one-line mission statement. Those are probably still important, but you can't convey the necessary philosophy in a single line. When you go through this, you'll see a lot of very well-crafted maxims that were obviously lifted from combing through many of the more insightful business books.
What's exceptional about this is that it doesn't just say things like "people are our most important resource." As you've no doubt seen, this is easy to say and just as easy to casually discard when the quarterly profit statements dip. The problem is that there's ordinarily no way to call someone out and show that they aren't walking the talk, because there's no clear line that's being crossed. Without a test, you can always begin creeping (taking Weinberg's "one logical step at a time") until you end up in a bad place.
Let's look at a critical example: team members. With most companies, it seems like you have a position to fill, you interview until you find someone who seems to fit, then you can check the box next to that position and move on. If the person leaves or otherwise needs to be replaced, then you have to go through all that hassle again. Thus, if the person you hire is toxic to the team in any form, you tend to resist the hassle of changing the team member. In fact, most companies require a long process before you can fire a toxic member, and the whole time they are dragging the team down ... which is not something a team quickly recovers from. Just discovering that you can be forced to work with a toxic person for so long, because someone got impatient and short-circuited the hiring process, causes significant damage.
Netflix attempts to change this scenario in two ways. First, they try to search for the right people all the time, rather than just when a box needs to be checked off. This changes the nature of hiring to something more like mushroom-hunting. Mushroom hunters are always keeping an eye open for mushrooms even when they're "just hiking." If they don't find any mushrooms on a particular hike, they don't bring back a bag of dirt just to bring something back. Other times they might discover an abundance of mushrooms when they're not particularly searching. Netflix appears to consider finding a team member a serendipitous event, not a task with a checkbox.
What's especially interesting is that they have a test for employees. This is not whether someone is toxic or not -- someone who gets through the process and turns out to be toxic is apparently removed at the earliest opportunity. The test is whether an employee is exceptional. The "Keeper Test" is clever because it relies partly on intuition: a manager thinks "Which of my people, if they told me they were leaving in two months for a similar job at a peer company, would I fight hard to keep at Netflix?" (I'm actually not clear on why "a similar job at a peer company" is important). Their goal, they say, is to have "stars in every position" so if you aren't someone that is worth fighting hard for, then you should get a generous severance now so that Netflix can try to find a star for that role (and, I observe, you don't bring down the energy of the team in the meantime -- just like someone who isn't engaged in an open-spaces discussion should go find a discussion where they can add energy).
The Netflix culture statement could be "Get and keep the best people, and enable them." And perhaps more important, "don't bog people down by accreting new rules every time something happens." They say that the best thing about working there is that you get to work with exceptional people -- which, if you think about it, is the most compelling reason to go to a job (assuming you can get enough money to get by). As a consultant, what I miss most is working with stimulating people on a day-to-day basis, but when I see the state of most companies I'm glad I mostly work on my own. However, there are some consulting clients, and also experiences like writing First Steps in Flex with James Ward, where I come away saying "I'd like to do more things like this!" Netflix's goal of creating an environment where you get to work with exceptional people is very compelling, and it gives me some insight on why my friend Carl left Google to go there.
Do yourself a favor, especially if you're starting a new company, and go read their culture statement. It's undoubtedly imperfect and it modifies existing company structure rather than trying to reinvent the company (as I'd like to do) but it is filled with excellent insights and ideas.
It's not surprising we have this belief in the unchanging nature of things, to the point where we think we can build a system that doesn't change. Up until about 200 years ago (and really, about 100 years ago), you could live your whole life without anything changing, from birth until death. Moving to a heliocentric view of the universe was a giant change, but nowadays we casually throw Pluto out as a planet and discover that the bulk of matter in the universe is "dark." Big companies rise and fall within a few years instead of surviving for decades. And the consequences of our decisions -- intentional and unintentional -- can sometimes be seen within seconds instead of years. Without tests, you can't know whether you've strayed from your desired path (and yes, good tests are hard to create).
>I'm actually not clear on why "a similar job at a peer company" is important
I don't know either, though if I had to wager a guess, I'd say that they're not interested in trying to entice people to stick around who want a complete change of pace into another field. The guy I replaced at my current company decided to stop being an engineer to go into marketing. Even if my boss could have thrown more money at him to stay, would it have been worth it to keep someone around to do a job they no longer really want?
I disagree with the premise that specs without tests are meaningless (though I may have misunderstood your article; it seems kinda messy, and I don't quite get what your point was, anyway).
In mathematics, proof is always superior to experiment. In computer science, a proof is equivalent to a correctly written program, and an experiment is equivalent to a test of such a program. Of course, correct specs don't equal a correctly written program, since there is some kind of translation, but it can be close. Just as a mathematical paper with a proof in it doesn't equal a correct proof.
So, from this perspective, correct specs are not meaningless at all.
Also, let me offer a different perspective on testing. Testing is basically a reimplementation of part of the program. To have an automated test means to have a program that gives the same results on some subset of inputs, and to compare the results. To have a program completely covered by automated tests means to have a program that gives the same results on all inputs. In other words, it means to have just another implementation of the same problem.
This means that the more you try to cover everything with your tests, the more you are: 1. trying to reimplement the program you are testing, and 2. facing the same problem with the correctness of the tests themselves.
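The commenter's point can be sketched in code: a thorough test suite often amounts to a second implementation used as an oracle. Here, as a hypothetical example, a closed-form function is checked against a slow, obviously-correct reference version; the names are invented for illustration.

```python
# "Testing as reimplementation": the fast version is checked against
# a naive reference implementation over a range of inputs.

def sum_of_squares_fast(n):
    # Closed-form formula for 1^2 + 2^2 + ... + n^2.
    return n * (n + 1) * (2 * n + 1) // 6

def sum_of_squares_reference(n):
    # The "second implementation" that serves as the oracle.
    # Its correctness is just as much in question as the original's.
    return sum(i * i for i in range(1, n + 1))

for n in range(100):
    assert sum_of_squares_fast(n) == sum_of_squares_reference(n)
```

Note how this illustrates both of the commenter's points: the reference version really is a reimplementation, and if it were wrong in the same way as the fast version, the test would pass anyway.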
Finally, I don't think tests are a silver bullet. In fact, I think there are superior approaches to creating correct programs than testing, such as better abstractions when writing programs (consider compilers or Lisp macros) and assertions (also called preconditions/postconditions, contracts, and so on).
I think this is based on the scientific principle that scientific theories must be testable. The worst thing a scientist can say about a theory is that it's unfalsifiable. In other words, it can't be disproved and therefore is not scientific. People often think that science is about proving things but it's not. It's about disproving things. The idea that theories can be proven has been long out of favor.
If you take that to the Netflix example, I think the idea is that if you propose an ethical rule, there must be some way to know if you've violated that rule. Otherwise, it's a pointless rule. That makes sense to me. For example, "don't be evil" sounds great, but what does it really mean? What's the test for evilness?
In terms of code specifications, I think it's possible to write a spec that is so detailed that tests can be derived from it by anyone. However, few people write such specs and it's often hard to realize that there are vague parts of the spec until you try to test it or build code around it.
> > disproving things. The idea that theories can be proven has been long out of favor.
>
> Proof still works in mathematics.
That's the difference between mathematics and science. In math, you choose some axioms and prove theorems based on those axioms. In science, the intent is to determine the 'axioms' or fundamental laws of nature.
Computer science is, in a lot of ways, more similar to mathematics than science. There are many programs, or subsets thereof, that can be proven correct in the mathematical sense. On the other hand, it has also been proven that not all programs can be proven correct.