I mostly agree, especially about Beizer’s Pesticide Paradox and the false sense of thoroughness. (At a previous employer, I twice saw large test suites for connected products which would mostly pass even when disconnected.)
But I do disagree with making no assumptions about what the developers did. To paraphrase and extend what you’ve said elsewhere, “You’ve got to know the territory.” The most important part of the territory is the customers’ needs and habits, of course; but it’s worth knowing developers’ strengths and weaknesses as well, at least when deciding what to test first.
To take one sadly rare example, I once had the privilege of working with a developer who was very good at the hard part of his job. So I focussed on the easy, obvious stuff, in this case going down the feature list and turning it into high-level test cases. Sure enough, he’d let a couple of obscure missing features slip through. Obviously I was obliged to have some test cases for the hard stuff as well, but in retrospect I should have spent less time on them and more on the easy stuff.
More mundanely, the Eighty/Twenty Rule applies to developers with a vengeance: Areas owned by your worst developers deserve a lot more attention than areas covered by the best. (One of static analysis’s many advantages is that it can make this painfully apparent.)
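A minimal sketch of how static analysis can surface that skew, assuming findings can be attributed to owners (the names and finding types below are invented; in practice they would come from your analyzer's report joined with version-control blame or an ownership map):

    from collections import Counter

    # Hypothetical static-analysis findings as (owner, finding) pairs.
    findings = [
        ("alice", "unused-variable"),
        ("bob", "possible-null-dereference"),
        ("bob", "resource-leak"),
        ("bob", "unchecked-return-value"),
    ]

    # Tally findings per owner; the Eighty/Twenty skew jumps out.
    per_owner = Counter(owner for owner, _ in findings)
    for owner, count in per_owner.most_common():
        print(f"{owner}: {count} finding(s)")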
Obviously you shouldn’t go overboard in following this advice. Even the best developers and the most solid areas of code need a reasonable amount of coverage, and your best developer may be replaced by somebody who needs careful scrutiny. But testing in a fallen world is all about prioritizing, and one (but only one) of the factors in deciding what to test soonest and most is whatever you can learn about your developers. Asking them their opinions is always worthwhile, but it’s not the only means of communication. I’ve gotten some of my best test cases from internal chat rooms, for instance. And watching their check-in comments can be informative as well as amusing.
Hi,
Agree with the pesticide principle. This is a real danger where test suites are built up over time, extended or modified without some kind of objective re-analysis.
It's true that shaking up the order and input data will make a few bugs fall out of the tree, but more is needed.
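For instance, here is a minimal pytest-style sketch of varying the input data on each run while keeping failures reproducible; save_name is a hypothetical stand-in for the real code under test:

    import random
    import string

    def save_name(name: str) -> str:
        # Stand-in for the real code under test (hypothetical).
        return name

    def test_name_round_trips_with_varied_input():
        # Draw fresh input on every run instead of replaying one canned
        # value, and report the seed so any failure can be reproduced.
        seed = random.randrange(2**32)
        rng = random.Random(seed)
        name = "".join(rng.choice(string.ascii_letters + " '-")
                       for _ in range(rng.randint(1, 64)))
        assert save_name(name) == name, f"reproduce with seed={seed}"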
A periodic re-assessment is required. These are costs that need to be taken into account when the longevity of a test case is considered (during the test design phase): if this test case is going to be useful for several projects (versions of the product/solution), then what budget should be assigned for its periodic review?
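One possible shape for such a record (the fields and interval are assumptions for illustration, not a standard):

    from dataclasses import dataclass
    from datetime import date, timedelta

    @dataclass
    class TestCaseRecord:
        name: str
        last_reviewed: date
        review_interval_days: int  # the review "budget" set at design time

        def review_overdue(self, today: date) -> bool:
            due = self.last_reviewed + timedelta(days=self.review_interval_days)
            return today > due

    tc = TestCaseRecord("login-happy-path", date(2010, 6, 1), 180)
    print(tc.review_overdue(date(2011, 3, 1)))  # True - a review is due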
Agree with some other comments that going over some territory that the developers have covered is necessary.
But the bigger picture is that the test phases should be complementary - so you can (up-front) estimate/guesstimate how much overlap the project needs (and what that will cost...)
"Even worse, all that so-called successful testing will give us a false sense of thoroughness and make our completeness metrics a pack of very dangerous lies."
Well said!
James,
Very good post.
It brings to mind an analogy I like even more than "pesticide" regarding the dangers of testing the same things in the same way, time after time:
“Highly repeatable testing can actually minimize the chance of discovering all the important problems, for the same reason stepping in someone else’s footprints minimizes the chance of being blown up by a land mine.” - James Bach
(Shared with a tip of the hat to Matt Archer, who recently included this memorable quote in his blog.)
- Justin
__________________
Justin Hunter
Founder and CEO
Hexawise
http://www.hexawise.com
"More coverage. Fewer Tests."
One of the main issues with repetitive testing is not realizing that tests expire. It comes back to test design. Well-designed tests will have a better chance at being useful in the long run; poorly designed tests should never see the light of day. An iterative process is needed to reevaluate existing test cases and retire those not providing value.
An approach to test automation implemented by a team I worked with was to automate high-priority bugs. They had a robust automation suite after a couple of years. Each time a test failed, it was reevaluated to verify its validity.
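A minimal sketch of that idea, assuming pytest; the bug ID, marker name, and function under test are invented for illustration:

    import pytest

    def apply_discount(total: float, pct: float) -> float:
        # Stand-in for the code that originally contained the bug.
        return round(total * (1 - pct), 2)

    # A custom marker ties the test back to the bug that motivated it, so
    # each failure (or a periodic review) can ask whether the test is
    # still valid or should be retired.
    @pytest.mark.regression(bug_id="BUG-1234", automated="2011-01-15")
    def test_discount_is_applied_only_once():
        assert apply_discount(100.0, 0.10) == 90.0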
I agree with the pesticide paradox. However, on most projects, when the application is in the maintenance phase, not many resources are going to be put into testing. Developing new tests takes resources; the priority is testing bug fixes and new features. In risk-based testing, testers focus on high-risk areas. In the Six Sigma strategy, we create Risk Priority Numbers for tasks, then work on the important tasks and pay little attention to the low-priority ones. Nevertheless, IMHO, convincing people to devote more resources to testing in the maintenance phase of a project is a tough row to hoe.
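For context, a Risk Priority Number is conventionally the product of severity, occurrence, and detection-difficulty ratings, each on a 1-10 scale, as in FMEA. A toy calculation with made-up ratings:

    def rpn(severity: int, occurrence: int, detection: int) -> int:
        # Risk Priority Number: each factor rated 1-10, where a higher
        # detection score means the failure is harder to catch.
        return severity * occurrence * detection

    areas = {
        "payment processing": rpn(severity=9, occurrence=6, detection=7),  # 378
        "report formatting": rpn(severity=3, occurrence=4, detection=2),   # 24
    }
    # Test the high-RPN areas first and most; the low ones get little attention.
    for name, score in sorted(areas.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}: RPN = {score}")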
I agree with your comments: we need to evaluate our test cases from time to time, otherwise they will grow and become difficult to maintain. In our case we have thousands of cases that we run and maintain daily. Testers may also assume that developers' unit tests cover an area well enough and ignore it, but that brings up another popular maxim: don't assume anything. In the end the blame falls on the tester, so it is difficult to leave any area untested.
Hmmm - I would also venture to add to the pesticide example that there's a problem with testers using pesticides that don't target the bugs they are looking to find and kill.
I can't tell you how many times I've seen people running tests that do not test what they are claiming they will.
It's not that the bugs grow immune but that the test technique has already cleared the area to which the technique has been applied.
What now needs to happen is to continue to clear the other areas in the product.
If a technique is not exposing new bugs, then one needs to dig into the quiver to pull out and try the next technique. Better yet, if you have the resources, use a multitude of techniques concurrently. You can clear wider swaths of area that way.
And hey, if you've tried everything in your repertoire and are still not finding bugs, then maybe it's time to ship? :)
Hi James, nice to hear from you again :-)
Even though I completely agree with keeping test suites updated, whenever I start making an effort in this direction (especially with quite old suites) I run into a contrasting thought. Say the test suites have been updated, and the 'new' test cases find 'new' bugs in the current product under validation. But these so-called 'new' bugs are really 'base code' bugs: conceptually they were present in previous products (which have shipped) as well, since the previous product's code is taken as the base for the current version, onto which new features are added, and no customer has ever complained about them. So should we fix them now (Dev team), or even worry about finding them now (Test team)? This situation is probably more common in a product-based company/work environment...
Given that every 'fix' is an expense, do we take the economic view and not look for such bugs (in other words, not update the test suites once they become 'stable'), or do we take the 'purist' approach and find all the bugs we can, whenever we can, irrespective of whether they get fixed? Any thoughts, please?
You can measure the ratio of bugs found by fixed and variant testing. I guess you do.
When I have measured the fixed:variant bug-finding effectiveness, I have always found variant testing to be better at finding bugs on mature codebases.
That seemed natural. The code paths and system states exercised by the fixed tests get debugged over time.
You know all that. So why the metaphor?
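To make the fixed-versus-variant comparison concrete, a tiny sketch with invented counts ('fixed' being the unchanged scripted suite, 'variant' the runs with shuffled order and input data):

    def bugs_per_run(bugs_found: int, runs: int) -> float:
        # Crude bug-finding effectiveness: bugs found per test run.
        return bugs_found / runs if runs else 0.0

    fixed = bugs_per_run(bugs_found=3, runs=400)     # the unchanged suite
    variant = bugs_per_run(bugs_found=17, runs=400)  # varied order/data
    print(f"fixed : variant = {fixed:.4f} : {variant:.4f}")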