The Artima Developer Community
Testivus - Testing For The Rest Of Us

51 replies on 4 pages. Most recent reply: Jul 25, 2007 2:43 PM by Ashwath Akirekadu

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Testivus - Testing For The Rest Of Us Posted: Feb 8, 2007 10:49 AM
> > Take your flocking-swarming example. How many unit tests
> > do you need to verify that the flocking-swarming works
> > properly under all operating conditions for all numbers of
> > entities from 2 to 2 million?
>
> This should be covered in our Manifesto. A TDD guru told
> me that in general, you test 0, 1, and "many". And that 2
> qualifies as "many". Seemed a bit strange to me at first,
> but it's a decent starting point.

Yeah, it's great, except that the number of possible (simple) relationships for 10 entities is 3,628,800 times greater than for 2 entities.

This is exactly the kind of thing that bothers me about TDD. This '2 is many' idea is an absurd assumption. Maybe it's OK for basic unit testing, but it just hammers home the point that this is only the most superficial kind of verification. The idea that you can have 'total coverage' is a myth for any non-trivial program. Going back to my real-life example of recursive groupings, things didn't get interesting until you had at least 4 groups, and the kind of testing that would relate to real-world usage was more like 25 groups. Not only was the number important, it was crucial that the result was deterministic without regard to the ordering of the groups.
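The ordering requirement described above can be checked mechanically. What follows is only a minimal sketch: `OrderIndependence` and `holdsFor` are hypothetical names, and the function passed in stands in for the real grouping code, which is not shown in this thread. It applies the function to many shuffled copies of the same input and requires the same result every time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.Random;
import java.util.function.Function;

// Hypothetical helper: checks that a function of a list gives the same
// result for every shuffled ordering that is tried.
final class OrderIndependence {

    static <T, R> boolean holdsFor(List<T> input, Function<List<T>, R> fn,
                                   int shuffles, long seed) {
        Random rand = new Random(seed);          // fixed seed keeps failures reproducible
        R expected = fn.apply(new ArrayList<>(input));
        for (int i = 0; i < shuffles; i++) {
            List<T> copy = new ArrayList<>(input);
            Collections.shuffle(copy, rand);     // try another ordering of the same groups
            if (!Objects.equals(expected, fn.apply(copy))) {
                return false;                    // some ordering changed the result
            }
        }
        return true;
    }
}
```

A passing run only shows that none of the sampled orderings changed the result. With 25 groups there are far too many permutations to try them all, so this is a sampling check, not a proof.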

Part of why I'm bringing this up is that I haven't found any open-source automation frameworks for supporting this kind of testing. Maybe they exist. If they do, please tell me about them. I'm especially interested in automated regression testing.

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Testivus - Testing For The Rest Of Us Posted: Feb 8, 2007 11:26 AM
> Yeah, it's great, except that the number of possible
> (simple) relationships for 10 entities is 3,628,800 times
> greater than for 2 entities.

My math is bad on this but the point should still be clear.

Morgan Conrad

Posts: 307
Nickname: miata71
Registered: Mar, 2006

Re: Testivus - Testing For The Rest Of Us Posted: Feb 8, 2007 12:01 PM
> My math is bad on this but the point should still be clear.

James, I sort of agree with your point. What the TDD guru would say is that a sophisticated human algorithm tester would develop test cases to cover the 25 groups and check for determinism, etc... And that this smart testing is not "unit testing".

I do a lot of algorithm work myself. My approach is to feed in N sets of known data, run the algorithm, then compare the results to N known answers. In practice, N tends to be 1 or 2, which doesn't thrill me, but it still catches some bugs. The possible N is huge, you clearly can't even begin to cover it, so you try to do something reasonable...

Also, in practice, the "known answer" usually comes from the previous algorithm. Of course, you stare at the results to see that they look right, but you are really testing that the algorithm is no different than it was before. :-(

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Testivus - Testing For The Rest Of Us Posted: Feb 8, 2007 12:39 PM
> > My math is bad on this but the point should still be clear.
>
> James, I sort of agree with your point. What the TDD guru
> would say is that a sophisticated human algorithm tester
> would develop test cases to cover the 25 groups and check
> for determinism, etc... And that this smart testing is
> not "unit testing".

Right. It's not unit testing. I'm not saying that everyone who follows TDD does this but I get a distinct feeling that a lot of people think that unit testing is the most important thing. To me it's not really that important. Most of the time, if you do really good high-level testing you will catch the low level bugs. I understand that unit-testing makes it easier to identify the source of the bug but that's not my primary concern. It's finding the bugs and testing against my requirements that I am mostly worried about.

> I do a lot of algorithm work myself. My approach is to
> feed in N sets of known data, run the algorithm, then
> compare the results to N known answers. In practice, N
> tends to be 1 or 2, which doesn't thrill me, but it still
> catches some bugs. The possible N is huge, you clearly
> can't even begin to cover it, so you try to do something
> reasonable...
>
> Also, in practice, the "known answer" usually comes from
> the previous algorithm. Of course, you stare at the
> results to see that they look right, but you are really
> testing that the algorithm is no different than it was
> before. :-(

Different job, different example. I had to deal with this kind of situation: data came in as flat files, went through umpteen transformations, and was then output as XML. I not only had to test that my changes were correct but also that nothing else was modified.

What I eventually ended up with was a library of input files and associated outputs. I would run the code with, say, 50 input files, capture the outputs, and compare them with the old results using a command-line diff. Any differences were compiled into an output file. After I had eyeballed the results, I added filters to ignore transient data, like timestamps, that was expected to change over time. Once the code moved to production, the new results were promoted to be the expected results. This didn't do everything, but it freed me from the mechanics of comparing the results so that I could focus on crafting better tests. If a problem slipped through, a new test (usually based on the production data that caused the issue) would be added to the suite.
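The compare-and-filter step can be sketched in a few lines. This is an illustration under assumptions, not the setup described above: `GoldenCompare` is a hypothetical name, the timestamp pattern is an assumed format, and the comparison is a naive line-by-line check standing in for a real command-line diff (it does not realign after inserted or deleted lines).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of comparing new output against a stored expected
// output while ignoring values that are supposed to change between runs.
final class GoldenCompare {

    // Assumed timestamp shape, e.g. 2007-02-08T12:39:00; real data may differ.
    private static final Pattern TIMESTAMP =
        Pattern.compile("\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}");

    static String mask(String line) {
        return TIMESTAMP.matcher(line).replaceAll("<TIMESTAMP>");
    }

    // Returns one message per differing line; an empty list means no change.
    static List<String> diff(String expected, String actual) {
        String[] exp = expected.split("\n", -1);
        String[] act = actual.split("\n", -1);
        List<String> differences = new ArrayList<>();
        int max = Math.max(exp.length, act.length);
        for (int i = 0; i < max; i++) {
            String e = i < exp.length ? mask(exp[i]) : "<missing>";
            String a = i < act.length ? mask(act[i]) : "<missing>";
            if (!e.equals(a)) {
                differences.add("line " + (i + 1) + ": expected [" + e
                    + "] but got [" + a + "]");
            }
        }
        return differences;
    }
}
```

Running `diff` once per input file and collecting the non-empty results gives the kind of summary file described above. Promoting new results then amounts to overwriting the stored expected output.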

Creating the tests from scratch is time consuming. Coming up with things that will push the code to its limits is hard. But this is precisely the kind of thing developers should be doing. That's why I would like to have a really good regression test tool, so that I could stop puttering around with running the tests and comparing the results and get on with finding new ways to break my code.

Morgan Conrad

Posts: 307
Nickname: miata71
Registered: Mar, 2006

Re: Testivus - Testing For The Rest Of Us Posted: Feb 8, 2007 5:14 PM
Well, Testivus should try to address the issues of testing complicated algorithms. I'm not sure of the answer...


One other "testing for the rest of us" pattern I use a lot, though I think it is somewhat contrary to TDD Dogma. Let's say I have code to read and write my object. Don't care if it's to a database, XML, String, tab-delimited file, whatever. Let's use XML in this example. For simplicity, assume a helper object does the writing.

In general, I do not do


String theXML = myFooXMLHelper.writeToXML(myFoo);
// then test various tokens within theXML.


nor do I even do


String theXML = myFooXMLHelper.writeToXML(myFoo);
assertEquals("someLongString", theXML);



Why? The exact format of the XML (or database, etc...) is seldom a required spec for the object. Even if it were, I'd trust the XML parsers to make sure it matches the DTD, etc. They know far more about XML format than I do. But, to repeat, unless the functional spec says "the XML shall follow this format because the users require that..." it is not worth testing the format.


What I do is


String theXML = myFooXMLHelper.writeToXML(myFoo);
MyFoo readItBack = myFooXMLHelper.readFromXML(theXML);
// JUnit convention: expected value first, actual value second
assertEquals(myFoo, readItBack);


The real requirement is that you get the same thing back. IMO, this approach is simpler to write, tests much more of the code, and far more robust. When you add a new field oopsWeForgotThis to MyFoo, you don't need to rewrite the test. You do need to rewrite equals() - which is a good side-effect.

Alberto Savoia

Posts: 95
Nickname: agitator
Registered: Aug, 2004

Re: Testivus - Testing For The Rest Of Us Posted: Feb 8, 2007 5:48 PM
>
> String theXML = myFooXMLHelper.writeToXML(myFoo);
> MyFoo readItBack = myFooXMLHelper.readFromXML(theXML);
> assertEquals(readItBack , myFoo);
> 

>
> The real requirement is that you get the same thing
> back. IMO, this approach is simpler to write, tests much
> more of the code, and far more robust. When you add a new
> field oopsWeForgotThis to MyFoo, you don't need to rewrite
> the test. You do need to rewrite equals() - which is a
> good side-effect.

Morgan,

You just described a very useful type of test that can be easily parameterized and run with a lot of different inputs.

I call them "forall" type tests.

If the operation one is testing has an inverse operation, you can easily create such a test, as you've done in your example.

Generally speaking:

If an operation O(x) has an inverse operation I(x) such that, for all values of x, I(O(x)) == x, it makes sense to put that equality inside a loop and execute it with a lot of different values for x.

But I have found that, in most cases, you still need some specific test examples to make sure that both operations are not broken in a way that, when combined, they cancel each other out.

Here's a specific example:

If you have to test a square root and a square method:

class MyMath {
    ...
    public static double sqrt(double n) { ... }
    public static double square(double n) { ... }
    ...
}


You can use the forall technique and write a test like this:

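public void testSqrtAndSquare() {
    Random rand = new Random();
    for (int i = 0; i < 1000; i++) {
        double aNum = rand.nextDouble();
        // Call MyMath.sqrt (not java.lang.Math.sqrt) so both methods under
        // test are exercised; the delta allows for floating-point rounding.
        assertEquals(aNum, MyMath.sqrt(MyMath.square(aNum)), 1e-9);
    }
}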


This test is a nice confidence builder. It's not a test, it's one thousand tests :-) ... until you realize that both 'sqrt' and 'square' could be implemented to simply return 'n' and the tests would still pass.

So you should also have some more specific/traditional tests, such as:

public void testSqrtAgain() {
    // Comparing doubles needs an explicit tolerance
    assertEquals(2.0, MyMath.sqrt(4.0), 1e-9);
    assertEquals(16.0, MyMath.square(4.0), 1e-9);
    ...
}


To cover all your bases.

Alberto

disney

Posts: 35
Nickname: juggler
Registered: Jan, 2003

Re: Testivus - Testing For The Rest Of Us Posted: Feb 9, 2007 2:02 AM
> > While it's true "you can only write trivial tests in a
> > matter of minutes", it's sometimes surprising how well you
> > can test your code with quite a few such tests. Just as
> > you can generate flocking/swarming behaviour with two
> > or three simple rules.
>
> I think a lot of code can be tested this way and that it's
> a great way to test that kind of code. My point is that
> this one size-fits-all model for testing doesn't make a
> lot of sense to me.

Agreed, but I don't think anyone has claimed that a particular test, or way of testing, produces *guaranteed* correct code. In general: tested code has fewer bugs than untested code, therefore testing makes sense.

> Take your flocking-swarming example. How many unit tests
> do you need to verify that the flocking-swarming works
> properly under all operating conditions for all numbers of
> entities from 2 to 2 million?
>
> Again, I'm not arguing that unit-testing isn't helpful,
> I'm arguing that it isn't sufficient and given that it
> isn't sufficient, questioning whether putting the effort
> in to do it for things where it isn't sufficient is an
> efficient use of time?

Aren't you just saying that "testing falls short of perfection, so why bother?"? Isn't the deciding factor whether or not testing offers an improvement over no testing? And if you're going to consider time efficiency, surely you need to compare testing with some alternative, and compare the improvements to the code that each makes, given the same investment of time?

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Testivus - Testing For The Rest Of Us Posted: Feb 9, 2007 6:28 AM
> Aren't you just saying that "testing falls short of
> perfection, so why bother?"? Isn't the deciding factor
> whether or not testing offers an improvement over no
> testing? And if you're going to consider time efficiency,
> surely you need to compare testing with some alternative,
> and compare the improvements to the code that each makes,
> given the same investment of time?

Let me make this absolutely clear, I am not advocating that you do no testing. If a developer is calling untested code (or poorly tested code) complete, that developer is (in my eyes) either a rookie or incompetent.

What I am saying is that I feel that there is a dogma around unit tests that says everything must have unit tests. I see things written like 'you need about 4 unit tests for each line of code'. There is a lot of code that you can test really well with unit tests. But there are also a lot of things that you cannot test well with unit tests. It doesn't make sense to waste time writing unit tests for these things. A different approach is needed.

There is so much focus on unit testing. Where are the test harnesses for automated regression/high-level testing? I'm seriously hoping someone will prove me wrong and point me to a good open-source tool because I really need one right now.

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Testivus - Testing For The Rest Of Us Posted: Feb 9, 2007 6:46 AM
> You can use the forall technique and write a test like
> this:
>
> (snipped code)
>
> This test is a nice confidence builder. It's not a
> test, it's one thousand tests :-) ... until you realize
> that both 'sqrt' and 'square' could be implemented to
> simply return 'n' and the tests would still pass.
>
> So you should also have some more specific/traditional
> tests, such as:
>
> (snipped code)
>
> To cover all your bases.
>
> Alberto

See this is the kind of thing that unit tests are terrible for. I would do something like create a tab delimited file that contains values and their squares. In addition you can have a random tester.

The problem with this example is that it's too simplistic. The kind of thing I have had to test would produce outputs that were many KB long. Some of the elements were expected to change (e.g. the current time and date) some were not. In addition, you have to deal with things like outputs that are unordered like in XML where the order of siblings of the same type is meaningless.
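The tab-delimited table of values and their squares suggested above could be driven by something like the following. It is a minimal sketch with hypothetical names (`TabDelimitedCheck`, `failures`), and it assumes each row holds an input and its expected result separated by a single tab. The rows are passed in as strings, so they can come from a file or be written inline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleUnaryOperator;

// Hypothetical data-driven check: each row is "input<TAB>expected".
final class TabDelimitedCheck {

    // Returns a message for every row where fn(input) is not within
    // tolerance of the expected value; an empty list means all rows passed.
    static List<String> failures(List<String> rows, DoubleUnaryOperator fn,
                                 double tolerance) {
        List<String> problems = new ArrayList<>();
        for (String row : rows) {
            if (row.trim().isEmpty()) {
                continue;                        // skip blank rows
            }
            String[] cols = row.split("\t");
            double input = Double.parseDouble(cols[0]);
            double expected = Double.parseDouble(cols[1]);
            double actual = fn.applyAsDouble(input);
            if (Math.abs(actual - expected) > tolerance) {
                problems.add(input + ": expected " + expected
                    + " but got " + actual);
            }
        }
        return problems;
    }
}
```

Unlike the round-trip check, a table of known answers catches an implementation that simply returns its argument. Adding a case means adding a row, not writing a new test method.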

Morgan Conrad

Posts: 307
Nickname: miata71
Registered: Mar, 2006

Re: Testivus - Testing For The Rest Of Us Posted: Feb 9, 2007 9:03 AM
@James

You make several good points. I have some responses, not all are perfect solutions but they might be relevant.

> The problem with this example is that it's too simplistic.

Most of the XML I work with is relatively simple, so you are probably right here.


> Some of the elements
> were expected to change (e.g. the current time and date)
> some were not.

Arguably, equals() would ignore these. Or you could have a method equalsExceptForTheVolatileStuff() that would be useful to the program. I often have a method that compares complex data structures and returns flags corresponding to which sections have changed. You could test that only the time and date have changed.

One other trick with times - sometimes I add an argument where I can pass in a time (null means use new Date() to get the current time). In production code, you usually pass in null, but in test code, it's traditional to pass in something like a birthday. This might be something Spring could do for you too with e.g. a DateFactory. Yes, this causes minor changes to the "real" code, so it isn't perfect.
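One way to express the pass-in-a-time trick in current Java is to inject a `java.time.Clock`, which was added to the standard library in Java 8, after this thread was written. The sketch below is an assumption about how that might look, not anything from the original posts; `ReportStamper` and `header` are hypothetical names.

```java
import java.time.Clock;
import java.time.Instant;

// Hypothetical class that stamps its output with the current time.
final class ReportStamper {
    private final Clock clock;

    // Production path: use the real system clock.
    ReportStamper() {
        this(Clock.systemUTC());
    }

    // Test path: the caller supplies a clock, e.g. Clock.fixed(...).
    ReportStamper(Clock clock) {
        this.clock = clock;
    }

    String header() {
        return "generated=" + Instant.now(clock);
    }
}
```

A test can construct it with `Clock.fixed(Instant.parse("2007-02-09T09:03:00Z"), ZoneOffset.UTC)` and get the same header on every run, while production code keeps using the no-argument constructor.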


> In addition, you have to deal with things
> like outputs that are unordered like in XML where the
> order of siblings of the same type is meaningless.

I had a case like this (it wasn't XML, but it was a bunch of children with numeric values). In the test code I explicitly sorted them so that the comparison worked.

J. B. Rainsberger

Posts: 12
Nickname: jbrains
Registered: Jan, 2004

Re: Testivus - Testing For The Rest Of Us Posted: Feb 9, 2007 9:28 AM
Alberto: in general, I agree with your sentiment; however, I do want to point to one part of it that I find a little troubling.

> <h2>Less testing dogma, more testing karma</h2>

The one problem I have with this statement is the people who confuse "dogma" with "suggestion". Quite often, when I espoused TDD as a design technique, or defended JUnit's "limitations" as having good influence on my design, people accused me of being dogmatic. It seems to be the "silver bullet" argument designed to shut down any suggestion that difficulty testing indicates a design problem worth exploring. While I agree that suggestions made with little experience and with little understanding of the other person's situation can be dogmatic, suggestions made with thorough experience and careful consideration of the other person's situation are most certainly not dogmatic. I suppose it's not always easy to tell the difference.

So while I agree with the idea of less dogma, more karma, I'd like to add a footnote: "Be sure you can tell the difference between advice and dogma."

Take care.

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Testivus - Testing For The Rest Of Us Posted: Feb 9, 2007 9:57 AM
> Arguably, equals() would ignore these. Or you could have
> a method equalsExceptForTheVolatileStuff() that would be
> useful to the program. I often have a method that
> compares complex data structures and returns flags
> corresponding to which sections have changed. You could
> test that only the time and date have changed.

Well, Java code was only one part of this. Really I had to verify the output from the perspective of an external entity, so in effect I was capturing data from a stream. Some of the components were written by us, some were off the shelf. I had to verify the whole thing because all that mattered was that the data the external entities received was correct.

I could have written a bunch of Java code to do checks but I would never have finished. I was working with dozens of schemas that often had multiple versions.

What I did instead was use pretty-printers to create output files in a consistent format (indentation, rendering xml-encoded xml etc.), and eventually I created one that would sort siblings deterministically. I'd run the process with canned inputs (usually XML or flat files), write the outputs in this canonical form, and use a script to run a command-line diff of all the outputs against all the expected outputs, writing the results to an output file. I'd then filter out the diff program graffiti to get just the changed data. I could run hundreds of inputs this way, and if nothing had changed, I could know in a minute or so. If there were supposed to be changes, I could verify they were correct much more quickly.

It worked but it wasn't completely generic. Oh yeah, the other thing that the TDD movement ignores is people who are maintaining huge codebases with no existing tests. 'Test-first' is only going to work in this situation when the time-machine is invented.
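The sibling-sorting pretty-printer mentioned above can be approximated with the JDK's built-in DOM parser. This is a rough sketch, not the tool described in the post: `XmlCanonicalizer` is a hypothetical name, and it deliberately ignores attributes, namespaces, CDATA sections and escaping. It also sorts all siblings, which is only appropriate where sibling order carries no meaning.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: renders XML in a form where sibling order and
// whitespace between elements no longer affect a text comparison.
final class XmlCanonicalizer {

    static String canonical(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            return render(root);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    private static String render(Element element) {
        List<String> children = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        NodeList nodes = element.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node instanceof Element) {
                children.add(render((Element) node));   // canonicalize bottom-up
            } else if (node.getNodeType() == Node.TEXT_NODE) {
                text.append(node.getNodeValue().trim());
            }
        }
        Collections.sort(children);   // deterministic sibling order
        return "<" + element.getTagName() + ">" + text
            + String.join("", children) + "</" + element.getTagName() + ">";
    }
}
```

Two documents that differ only in sibling order or indentation produce the same string, so an ordinary text diff reports only real changes.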

> One other trick with times - sometimes I add an argument
> where I can pass in a time (null means use new Date() to
> get the current time). In production code, you usually
> pass in null, but in test code, it's traditional to pass
> in something like a birthday. This might be something
> Spring could do for you too with e.g. a DateFactory. Yes,
> this causes minor changes to the "real" code, so it isn't
> perfect.

The timestamps were part of the requirements and I'm adamantly against modifications or additions that will not be part of the production code for testing. I don't mean mock objects and stuff like that. If you have to modify your code after the test, your test is almost meaningless.

> > In addition, you have to deal with things
> > like outputs that are unordered like in XML where the
> > order of siblings of the same type is meaningless.
>
> I had a case like this (it wasn't XML, but it was a bunch
> of children with numeric values). In the test code I
> explicitly sorted them so that the comparison worked.

Right, that's the kind of thing I'd like to have in a testing harness.

Bill Burris

Posts: 24
Nickname: billburris
Registered: Aug, 2003

Re: Testivus - Testing For The Rest Of Us Posted: Feb 9, 2007 10:19 AM
A while back when I was doing some thinking about how to test my code I wrote up my ideas on Ward's Wiki:
http://c2.com/cgi/wiki?BillBurris

The main idea here was that I had two types of tests, unit tests and functional tests. The unit tests are your typical simple tests that run in NUnit etc. The functional tests were complete programs written to test various libraries that are being developed as part of my application. So unit tests typically test a single function. Functional tests are for everything else that is not easy to test in your unit test framework.

All the tests, unit and functional, get compiled every time the application is compiled. The unit tests are run with each recompile. The functional tests are just run occasionally.

When trying to add multi-threaded testing into the mix I posted the results on my website at:
http://www.componentsnotebook.com/notebooks/cpp/default.aspx

All those ideas got put on the shelf, when I switched to doing circuit board layouts and FPGA code for a while. Now that I am back to software development I reverted to my old way of doing things and just started writing code with run and crash testing.

Bill

Cedric Beust

Posts: 140
Nickname: cbeust
Registered: Feb, 2004

Re: Testivus - Testing For The Rest Of Us Posted: Feb 9, 2007 11:06 AM
> > So you should also have some more specific/traditional
> > tests, such as:
> >
> > (snipped code)
> >
> > To cover all your bases.
> >
> > Alberto
>
> See this is the kind of thing that unit tests are terrible
> for. I would do something like create a tab delimited
> file that contains values and their squares. In addition
> you can have a random tester.

I would rephrase this: it's not something that unit tests are terrible for, it's something that data-driven tests are perfect for.

And data-driven tests can be either unit or functional tests (see @DataProvider at http://testng.org).

--
Cedric
http://testng.org

Alberto Savoia

Posts: 95
Nickname: agitator
Registered: Aug, 2004

Re: Testivus - Testing For The Rest Of Us Posted: Feb 9, 2007 12:54 PM
> Alberto: in general, I agree with your sentiment; however,
> I do want to point to one part of it that I find a little
> troubling.
>
> > <h2>Less testing dogma, more testing karma</h2>
>
> The one problem I have with this statement is the people
> who confuse "dogma" with "suggestion". Quite often, when I
> espoused TDD as a design technique, or defended JUnit's
> "limitations" as having good influence on my design,
> people accused me of being dogmatic. It seems to be the
> "silver bullet" argument designed to shut down any
> suggestion that difficulty testing indicates a design
> problem worth exploring. While I agree that suggestions
> made with little experience and with little understanding
> of the other person's situation can be dogmatic,
> suggestions made with thorough experience and careful
> consideration of the other person's situation are most
> certainly not dogmatic. I suppose it's not always easy to
> tell the difference.
>
> So while I agree with the idea of less dogma, more karma,
> I'd like to add a footnote: "Be sure you can tell the
> difference between advice and dogma."
>
> Take care.

Hi J.B.,

Great to hear from you. I "sort-of" see your point. I say "sort-of" because I believe that most people can differentiate between dogma and well-informed recommendations/suggestions.

Taking your book "JUnit Recipes" and your various posts as an example, it's very clear to me (and I believe to everyone who reads them) that you are offering options and suggestions. The advice is in the form: "X works best for Y" or "Consider Z", or "If you do A, it will make testing easier." I don't think I've ever heard you say things like "this is the only way".

But, and I won't name names, there are a lot of people out there who claim that the only way to achieve "salvation" through testing is to practice X, only X, and all of X - all the time. That's what I consider dogma and that's what turns a lot of people off the idea of developer testing.

I am actually surprised that most people did not object to the phrasing of "Less testing dogma, more testing karma".

After writing it I realized that, perhaps, less should have been replaced by no: "No testing dogma, just testing karma."

Should there be any dogma in Testivus?

I believe so; a movement without at least one core tenet is a movement about nothing - to stay with the Seinfeld theme.

Perhaps the only dogma is that "developers should take responsibility for testing their own code."

Thoughts?


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use