By James A. Whittaker

Let me be clear that these plague posts are specifically aimed at pointing out problems in testing. I'll get around to the lore and the solutions afterward. But keep up the debate in the comments; that's what this blog is all about.

The Plague of Repetitiveness

If aimlessness is the result of ‘just doing it’ then repetitiveness is the result of ‘just doing it some more.’ Pass after pass, build after build, sprint after sprint, version after version we test our product. Developers perform reviews, write unit tests and run static analyzers. But we have little insight into this effort and can't take credit for it. Developers test but then we retest. We can’t assume anything about what they did so we retest everything. As our product grows in features and bug fixes get applied, we continue our testing. It isn’t long until new tests become old tests and all of them eventually become stale.


This is Boris Beizer’s pesticide paradox. Pesticide will kill bugs, but spray the same field enough times with the same poison and the remaining bugs will grow immune. Rinse and repeat is a process for cleaning one’s hair, not testing software. The last thing we need is a build full of super-bugs that resist our ‘testicide.’ Even worse, all that so-called successful testing will give us a false sense of thoroughness and make our completeness metrics a pack of very dangerous lies. When you aren’t finding bugs, it’s not because there are no bugs; it’s the repetition that’s creating the pesticide paradox.


Farmers know to adjust their pesticide formula from time to time and also to adjust it for the specific type of pest they expect in their fields. They do this because they understand the history of the pesticides they have used and know they can’t get by with brute-force repetition of the same old poison. Testers must pay attention to their test results too and watch for automation that isn’t adding value. A healthy injection of variation into automation is called for. Change the order of tests, change their data source, find new environments, modify input values; do something the bugs aren’t prepared for.
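
One concrete way to inject that kind of variation - and this is only an illustrative sketch, not a prescription from the post - is to drive a run from a seeded random generator: the order and the inputs change every time, but logging the seed keeps any failure reproducible. The test names below are hypothetical.

import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch: vary the test order and input data on every run,
// but log the seed so any failing run can be replayed exactly.
public class VariedRun {
  public static void main(String[] args) {
    long seed = System.currentTimeMillis();
    System.out.println("Variation seed: " + seed);
    Random random = new Random(seed);

    // Hypothetical test names; in practice these would be real test cases.
    List<String> tests = Arrays.asList("loginTest", "searchTest", "checkoutTest");
    Collections.shuffle(tests, random);   // vary the order

    for (String test : tests) {
      int inputLength = 1 + random.nextInt(64);          // vary the input size
      StringBuilder input = new StringBuilder();
      for (int i = 0; i < inputLength; i++) {
        input.append((char) (' ' + random.nextInt(95))); // printable ASCII
      }
      System.out.println("Running " + test + " with input: " + input);
      // ... invoke the real test with this input here ...
    }
  }
}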

By James A. Whittaker

Yes I am going to be speaking at GTAC, thanks for asking. Frankly, I can't wait. I spoke at a Swiss testing conference and at the University of Zurich a couple years ago and I enjoyed the area and the people a lot. Excellent food, good beer and lots of cool European history and quaint back streets to wallow in. I hope to see you there.

Speaking of speaking, I just finished giving my first internal tech talk this past week. I spoke about the 7 Plagues of Software Testing and received a lot of input from Googlers about them. I'm encouraged that Googlers found them thought-provoking, so I've decided to broaden the conversation by posting them here as well. My plan at GTAC is to give you details on how Google is addressing these plagues in our own testing, and hopefully you'll be willing to share yours as well. One plague per post lest this blog be quarantined ...


The Plague of Aimlessness


Lore. It’s more than just a cool word. It conjures up a sorcerous image in one’s mind of ancient spell books and learned wizards with arcane and perilously attained knowledge.


And it’s exactly what we lack in software testing. Testing lore? Are you kidding me? Where is it? Who’s bogarting it? Can I have a hit?


The software testing industry is infected with the plague of aimlessness. We lack lore; we lack a body of knowledge that is passed from wizard to apprentice and written down in spell books for the industrious to study. Our apprentices are without their masters. We must all reinvent the wheel in the privacy of our offices only to have other testers the world over reinvent it in theirs.


I suggest we stop this nonsense. Testing is far too aimless. We test because we must or our manager tells us to do so. We automate because we can or because we know how to, not because it is part of some specific and proven strategy and certainly not because our lore dictates it. Is there a plan or some documented wisdom that guides our testing or are we just banging on the keyboard hoping something will fail? Where are the testing spell books? Surely the perilously attained knowledge of our tester forebears is something that we can access in this age of readily available information?


When a hunter makes a kill, they remember the terrain and circumstances. They pass this knowledge on to their successors. Over time they understand the habits of their prey and the collective knowledge of many hunters makes the job of future hunters far easier. When you see this terrain, you can expect game to behave in this manner. Can we say the same of testing? How well do we learn from each other? Do our ‘eureka moments’ get codified so that future testers will not have to suffer the aimless thrashing that we suffered? Can we say when you see functionality like that, the best way to test it is like this?


The plague of aimlessness is widespread. The need for testing lore is acute. Nike tells us to ‘just do it’ but what applies to exercise is not good advice for software testing. The next time you find yourself ‘just doing’ testing, pause for a moment and ask yourself ‘what is my goal?’ and ‘is there a purpose to this test?’ If the answer doesn’t immediately come to mind, you’re aimless, just doing it, and relying on luck and the sheer force of effort to find your quarry.


Luck has no place in sorcery or hunting and it has no place in testing. Luck is a nice happenstance, but it cannot be our plan A. Watch for the plague of aimlessness. Document your successes, scrutinize your failures and make sure you pass on what you learn from this introspection to your colleagues.


Be their wizard. Build a testing spell book and share it with others on your team. Over time you’ll banish the plague of aimlessness.


Juergen Allgayer - GTAC Conference Chair

GTAC 2009: Testing for the Web

The 4th Google Test Automation Conference brings together a selected set of industry practitioners around the topic of software testing and automation. This annual conference provides a forum for presentations and connects professionals with each other. To increase outreach, presentations are published online for everybody to see.

This year's theme is Testing for the Web; topics may include:

  • Testing the UI of modern web applications (HTML5, Ajax)
  • Testing applications on mobile devices
  • Testing in the cloud
  • Web testing tools (Selenium, Webdriver and co)
  • Testing distributed asynchronous applications
  • Testing for web browser compatibility
  • Testing large storage systems
  • Load and performance testing
  • Finding and reproducing failures that matter
  • It seemed like a good idea (things you expected to work, but that didn't)

Presentations are targeted at experienced engineers actively working on problems of quality, test automation and techniques, but the audience also includes students and academics. We encourage innovative ideas, controversial experiences, problems, and solutions that further the discussion of software engineering and testing. Presentations are 45 minutes in length and speakers should be prepared for an active question-and-answer session following their presentation. While ideas are good, ideas refined by experience are even more interesting to participants at GTAC.

The conference is a two-day event consisting of a single track of talks. Our philosophy is to engage a small set of active participants who all experience the same talks and carry the discussions into lightning talks, speaker Q&A, and topical discussion groups. Each year we have worked to identify a location that has a unique profile of technology professionals. This year the conference will be held at the Google office in Zurich, Switzerland on October 21 and 22, 2009.


Submission of Proposals

Please email a detailed and extended abstract (one page at most) to
gtac-2009-cfp@google.com. Your submission must include the topic name, author(s), affiliation, and an outline of the proposed talk. We strongly recommend also submitting one or two highlight slides of the talk. Submit your proposal before August 1, 2009. We will acknowledge receipt within one business day. Where employer or disclosure authorization is needed, authors need to obtain it prior to submitting. The program committee will evaluate proposals based on quality and relevance. All submissions will be held confidentially prior to contacting the selected presenters.


Notification of Acceptance
Notification of acceptance will be sent out on or before August 8, 2009. Authors of accepted proposals will present at the conference and their talk will be made available to the public on YouTube.


Copyright

GTAC requires authors to present at the conference and permit their presentation to be made available on YouTube.

Attendees
To ensure active participation and provide a variety of technical perspectives, we select attendees from among those who apply. Further information will be published via a call for participation at a later time.


Important Dates
August 1 - Deadline for presentation proposals

August 8 - Notification of acceptance

October 21+22 - GTAC conference (Zurich, Switzerland)


Questions

If you have questions regarding the submission process or potential topics, please email us at:
gtac-2009@google.com

We will add more information to the Google Testing Blog as we get closer to the dates.



We have already received several inquiries about this year's GTAC - thanks for your enthusiasm. Here's the news you've been waiting for: we will host GTAC 2009 on October 21 and 22 at the Google offices in Zurich, Switzerland.

As with previous years, the focus of the conference will be on solving software engineering challenges using tools and automation. This year will have a special focus on "Testing for the Web". We are looking forward to getting together to share lessons learned and practical experience testing web apps, services, and systems. We are also encouraging a discussion on effectively testing apps and services for mobile devices.

We will have a call for proposals coming out very soon - watch this space!


One of the strengths of the conference is that it's driven by a peer group and vocal participation. As in previous years, GTAC is an invitation-only conference to share great ideas and to have your thoughts challenged and refined. When you apply, we want you to tell us what ideas and questions you'll bring to the conference, and how you can further the discussion. We will open the application process in late July 2009.


Please send suggestions, questions and recommendations to: gtac-2009@google.com
or post your comments here to this blog.

By James Whittaker

One of the best parts about change is meeting new people and I've met a lot of Googlers in Mountain View and Kirkland over the past two weeks. There are many burning questions we've discussed but one has surfaced so much that it has to take top billing: manual vs. automated testing.

Google, I've learned, has come full circle on this issue. When the company was new and Search was king, most testing was manual, and, like a lot of startups, there was little focus on actual QA. In recent years the pendulum has swung to automation, with developers writing a lot of unit tests and testers creating automation frameworks prolifically.

And now, with my recent work on manual testing making the rounds, what will throwing me into this mix produce?

Actually, I'd like to put the manual vs. automation debate to bed. Instead, I think the issue is test design versus doing something because we can. How much automation is written not because there is a clear need but because we can? Or, perhaps, because we think we must? Hmm.

Before you cry bias, how much manual testing is seat of the pants versus intentional, purposeful testing?

See the point? Whether you are talking about manual or automated testing, it's test design ... identifying what needs testing and what is the best way to test it ... that has to take center stage. Too often we are solution focused and marry ourselves to our automation rather than to the product we are testing. Whenever I see testers who seem more invested in their automation than in the product they are shipping, I worry the pendulum has swung too far. Automation suffers from this because there is a product - the tool - that can become the object of obsession. Manual testers don't really produce such a physical baby they can fuss over, or they probably would fuss just as much as the automators. Ignore the tool; focus on the problem it should solve!

Besides, I think fundamentally that manual and automated testing are irreversibly linked. All good automation begins its life as manual test cases. And to create really good automation, you have to be good at manual testing. It's during manual testing that a tester gets good at test design, identifying important testing problems and crafting the solution approach. Any tester who isn't good at test design should think twice before they automate.

Let's shift the debate to good test design. Whether those tests ultimately end up being executed manually or via a tool is a moot point; let's just make them good ones that solve real testing problems.

How many octomoms does the testing world need anyway?

By James Whittaker

Here I am. Thanks for all the inquiries.

Why the change to Google? I’ve been asked that over and over again. As I reflect on the whole process, I have to admit that I like change. I like the challenge it brings, the creativity it sparks and the potential that I might fail at some new endeavor is simply intoxicating. Change, I think, is good. After all, if Robert Plant and Jimmy Page had never ventured beyond their comfortable British borders, they would have never written Kashmir and the planet is far better off for having that song.


My first week at Google has been a whirlwind of activity. I had the distinction of being (at the ripe old age of 43) the oldest person at new employee orientation. I passed a billionaire in the hallway. I sat in a room with some of the best testing minds in Silicon Valley and walked across campus with a young engineer whose biggest problem is that she can’t learn enough fast enough. I’ve signed dozens of books.


There’s much to learn and much to do. I’ll catalog the results here if anyone is interested in following it. Coming from a company like Microsoft, I am used to mind-bogglingly complex problems and comfortable with partial solutions that point toward a better but still imperfect future. My role at Google will be to continue to thwart the impossible. Innovation as a main course is what brought me here. I hope to continue my work on testing tours and envisioning the future, but I am even more excited about the things I can’t yet see. Given the team that I am working with here, I think it is safe to expect big things.


In case you are interested, I am located in the Kirkland WA office and not yet assigned to a project. If I am lucky I will manage to get my hands into everything. I’ll try and be careful not to spill the secret sauce over my nicest shirt…

By Patrick Copeland

I'm excited to announce that James Whittaker has joined us as our newest Test Director at Google.

James comes to us most recently from Microsoft. He has spent his career focusing on testing, building high-quality products, and designing tools and processes at industrial scale. In the not so distant past, he was a professor of computer science at Florida Tech, where he taught an entire software testing curriculum and issued computer science degrees with a minor in testing (something we need more schools to do). Following that, he started a consulting practice that spanned 33 countries. Apparently, fashion is not high on his list, as he has collected soccer jerseys from many of these countries and wears them during major tournaments. At Microsoft he wrote a popular blog, and in the near future you can expect him to start contributing here.

He has trained thousands of testers worldwide. He's also written a set of books in the How to Break Software series; they have won awards and achieved best-seller status. His most recent book, on exploratory testing, is coming out this summer. It is not a stretch to say that he is one of the most recognizable names in the industry and has had a deep impact on the field of testing. If you have a chance, strike up a conversation with James about the future of testing. His vision for what we'll be doing and how our profession will change is interesting, compelling and not just a little bit scary.

Join me in welcoming James to Google!

By Simon Stewart

It's a complaint that I've heard too many times to ignore: "My Selenium tests are unstable!" The tests are flaky. Sometimes they work, sometimes they don't. How deeply, deeply frustrating! After the tests have been like this for a while, people start to ignore them, meaning all the hard work and potential value that they could offer a team in catching bugs and regressions is lost. It's a shameful waste, but it doesn't have to be.

Firstly, let's state clearly: Selenium is not unstable, and your Selenium tests don't need to be flaky. The same applies for your WebDriver tests.

Of course, this raises the obvious question as to why so many Selenium tests fail to do what you intended them to. There are a number of common causes for flaky tests that I've observed, and I'd like to share these with you. If your (least) favourite bugbear isn't here, please tell us about it, and how you would like to approach fixing it, in a comment to this post!

Problem: Poor test isolation.
Example: Tests log in as the same user and make use of the same fixed set of data.
Symptoms: The tests work fine when run alone, but fail "randomly" during the build.
Solution: Isolate resources as much as makes sense. Set up data within the tests to avoid relying on a "set up the datastores" step in your build (possibly using the Builder pattern). You may want to think about setting up a database per developer (or using something like Hypersonic or SQLite as a light-weight, in-memory, private database). If your application requires users to log in, create several user accounts that are reserved for just your tests, and provide a locking mechanism to ensure that only one test at a time is using a particular user account.
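
A minimal sketch of such an account-locking mechanism, in Java, might look like the following. It assumes all the tests run inside a single JVM, and the account names are placeholders for users you have reserved for testing; suites spread across machines would need an external lock (a database row or a lock service) instead.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Minimal sketch of a test-account pool: each test checks an account out,
// uses it exclusively, and returns it when it finishes. The account names
// are placeholders for whatever users you have reserved for testing.
public final class TestAccountPool {
  private static final BlockingQueue<String> ACCOUNTS =
      new ArrayBlockingQueue<String>(3);

  static {
    ACCOUNTS.add("selenium-test-user-1");
    ACCOUNTS.add("selenium-test-user-2");
    ACCOUNTS.add("selenium-test-user-3");
  }

  private TestAccountPool() {}

  /** Blocks until an account is free, so two tests never share one. */
  public static String acquire() throws InterruptedException {
    return ACCOUNTS.take();
  }

  /** Call this from tearDown so the next test can reuse the account. */
  public static void release(String account) {
    ACCOUNTS.offer(account);
  }
}

Acquire an account in setUp and release it in tearDown, and two tests can no longer trample each other's session state, even when the suite runs in parallel within one process.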

Problem: Relying on flaky external services.
Example: Using production backends, or relying on infrastructure outside of your team's control
Symptom: All tests fail due to the same underlying cause.
Solution: Don't rely on external services that your team don't control. This may be easier said than done, because of the risk of blowing out build times and the difficulty of setting up an environment that models reality closely enough to make the tests worthwhile. Sometimes it makes sense to start servers in-process, using something like Jetty in the Java world, or webrick in Ruby.
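
As an illustration of the in-process idea, here is a rough sketch of a stub backend using Jetty's embedded API (the class names are from Jetty 9's org.eclipse.jetty.server packages; other Jetty versions differ slightly). The canned JSON response is a stand-in for whatever the real service would return.

import java.io.IOException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.eclipse.jetty.server.Request;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.handler.AbstractHandler;

// Sketch of a stub backend started inside the test process, so the
// end-to-end tests never depend on infrastructure the team doesn't control.
public class InProcessStubServer {
  private final Server server;

  public InProcessStubServer(int port) {
    server = new Server(port);
    server.setHandler(new AbstractHandler() {
      @Override
      public void handle(String target, Request baseRequest,
          HttpServletRequest request, HttpServletResponse response)
          throws IOException {
        // Serve a canned response instead of calling the real service.
        response.setContentType("application/json");
        response.setStatus(HttpServletResponse.SC_OK);
        response.getWriter().println("{\"status\": \"ok\"}");
        baseRequest.setHandled(true);
      }
    });
  }

  public void start() throws Exception { server.start(); }

  public void stop() throws Exception { server.stop(); }
}

Start it before the tests run, point the application under test at localhost on the chosen port, and stop it when the suite finishes.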

Watching the tests run is a great way to spot these external services. For example, on one project the tests were periodically timing out, though the content was being served to the browser. Watching the tests run showed the underlying problem: we were serving "fluff" --- extra content from an external service in an iframe. This content was sometimes not loading in time, and even though it wasn't necessary for our tests the fact it hadn't finished loading was causing the problem. The solution was to simply block the unnecessary fluff by modifying the firewall rules on the Continuous Build machine. Suddenly, everything ran that little bit more smoothly!

Another way to minimize the flakiness of these tests is to perform a "health check" before running the tests. Are all the services your tests rely on running properly? Given that end-to-end tests tend to run for a long time, and may place an unusual load on a system, this isn't a fool-proof approach, but it's better to not run the tests at all rather than give a team "false negatives".
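
Such a health check doesn't need to be elaborate. Here is a minimal sketch; the status URL is a placeholder for whatever endpoints your dependencies actually expose.

import java.net.HttpURLConnection;
import java.net.URL;

// Minimal pre-flight health check: if a dependency is already down,
// skip the end-to-end run instead of reporting dozens of false failures.
public final class HealthCheck {

  /** Returns true if the given status endpoint answers with HTTP 200. */
  public static boolean isHealthy(String statusUrl) {
    try {
      HttpURLConnection connection =
          (HttpURLConnection) new URL(statusUrl).openConnection();
      connection.setConnectTimeout(5000);
      connection.setReadTimeout(5000);
      return connection.getResponseCode() == 200;
    } catch (Exception e) {
      return false;  // Unreachable counts as unhealthy.
    }
  }

  public static void main(String[] args) {
    // Placeholder URL; substitute the status pages your tests depend on.
    if (!isHealthy("http://backend.example.com/status")) {
      System.err.println("Dependencies unhealthy; skipping end-to-end run.");
      System.exit(1);
    }
  }
}

The same check can be combined with JUnit's Assume.assumeTrue so that the tests are reported as skipped rather than failed when a dependency is down.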

Problem: Timeouts are not long enough
Example: You wait 10 seconds for an AJAX request that takes 15 to complete
Symptom: Most of the time the tests run fine, but under load or exceptional circumstances they fail.
Solution: The underlying problem here is that we're attempting to determine how long something that lasts a non-deterministic amount of time will take. It's just not possible to know this in advance. The most sensible thing to do is not to use timeouts. Or rather, do use them, but set them generously and use them in conjunction with a notification from the UI under test that actions have finished so that the test can continue as soon as possible.

For example, it's not hard to change the production code to set a flag on the global Javascript "window" object when an XmlHttpRequest returns, and that could form the basis of a simple latch. Rather than polling the UI, you can then just wait for the flag to be set. Alternatively, if your UI gives an unambiguous "I'm done" signal, poll for that. Frameworks such as Selenium RC and WebDriver provide helper classes that make this significantly easier.
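
A sketch of such a latch, written against the current Selenium WebDriver Java bindings rather than the 2009-era Selenium RC API, could look like this; window.ajaxDone is a hypothetical flag that the page is assumed to set when its XmlHttpRequest completes.

import java.time.Duration;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.support.ui.WebDriverWait;

// Wait on an application-provided latch instead of sleeping for a fixed
// time: the timeout is a generous ceiling, but the test continues the
// moment the page reports that its AJAX work is done.
public class AjaxLatch {

  /** Assumes the page sets window.ajaxDone = true when its XHR returns. */
  public static void waitForAjax(WebDriver driver) {
    new WebDriverWait(driver, Duration.ofSeconds(30)).until(
        d -> Boolean.TRUE.equals(((JavascriptExecutor) d)
            .executeScript("return window.ajaxDone === true;")));
  }
}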

Problem: Timeouts are too long
Example: Waiting for a page to load by polling for a piece of text, only to have the server throw an exception and give a 500 or 404 error and for the text to never appear.
Symptom: Your tests keep timing out, probably taking your Continuous Build with them.
Solution: Don't just poll for your desired end-condition, also think of polling for well-known error conditions. Fail the test with an informative error message when you see the error condition. WebDriver's SlowLoadableComponent has an "isError" method for exactly this reason. You can push the additional checks into a normal Wait for Selenium RC too.
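
Here is a rough sketch of pushing that additional check into a wait, again using the modern WebDriver Java bindings and hypothetical element ids: the wait polls for the success condition but fails immediately, with a useful message, when a known error page shows up.

import java.time.Duration;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.support.ui.WebDriverWait;

// Poll for the desired end-condition, but also watch for a well-known
// error condition so the test fails fast with a useful message instead
// of timing out. Both locators are placeholders for your own markup.
public class WaitOrFailFast {

  public static void waitForResults(WebDriver driver) {
    new WebDriverWait(driver, Duration.ofSeconds(30)).until(d -> {
      if (!d.findElements(By.id("error-page")).isEmpty()) {
        throw new AssertionError(
            "Server returned an error page while waiting for results");
      }
      return !d.findElements(By.id("results")).isEmpty();
    });
  }
}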

The underlying message: When your tests are flaky, do some root cause analysis to understand why they're flaky. It's very seldom because you've uncovered a bug in the test framework. In order for this sort of analysis and test-stability improvement work to be done effectively, you may well need support and help from your team. If you're working on your own, or in a small team, this may not be too hard. On a large project, it may be harder. I've had some success when a person or two is set aside from delivering functionality to work on making the tests more stable. The short-term pain of not having that extra pair of hands focusing on writing production code is more than made up for by the long-term benefit of a stable and effective suite of end-to-end tests that only fail when there's a real issue to be addressed.