How Google Tests Software - Part Five
Wednesday, March 23, 2011
By James Whittaker
Instead of distinguishing between code, integration and system testing, Google uses the language of small, medium and large tests, emphasizing scope over form: small tests cover small amounts of code, medium tests cover more, and so on. Each of the three engineering roles may execute any of these types of tests, and they may be performed as automated or manual tests.
Small Tests are mostly (but not always) automated and exercise the code within a single function or module. They are most likely written by a SWE or an SET and may require mocks and faked environments to run, but TEs often pick these tests up when they are trying to diagnose a particular failure. For small tests the focus is on typical functional issues such as data corruption, error conditions and off-by-one errors. The question a small test attempts to answer is: does this code do what it is supposed to do?
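To make the scope concrete, here is a minimal sketch of what a small test might look like, assuming JUnit 4 and a hypothetical ShoppingCart class; it illustrates the scope of a small test, not Google's actual test code.

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

// A small test: exercises one class in isolation. No servers, no
// databases, no neighboring features involved.
public class ShoppingCartTest {

  @Test
  public void applyDiscountReducesTotal() {
    ShoppingCart cart = new ShoppingCart();   // hypothetical class under test
    cart.addItem("book", 10.00);
    cart.applyDiscount(0.10);                 // 10% off
    assertEquals(9.00, cart.total(), 0.001);  // typical functional check
  }

  @Test(expected = IllegalArgumentException.class)
  public void applyDiscountRejectsNegativeRates() {
    new ShoppingCart().applyDiscount(-0.10);  // error-condition / boundary check
  }
}
```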
Medium Tests can be automated or manual and involve two or more features, specifically covering the interaction between those features. I've heard any number of SETs describe this as "testing a function and its nearest neighbors." SETs drive the development of these tests early in the product cycle as individual features are completed, and SWEs are heavily involved in writing, debugging and maintaining the actual tests; if a test fails or breaks, the developer takes care of it autonomously. Later in the development cycle, TEs may perform medium tests either manually (in the event the test is difficult or prohibitively expensive to automate) or with automation. The question a medium test attempts to answer is: does a set of near-neighbor functions interoperate with each other the way they are supposed to?
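A rough sketch of a medium test under the same assumptions (JUnit 4, plus Mockito for the fake): two hypothetical neighboring modules, CartService and InventoryClient, exercised together, with only the slow remote dependency mocked out. The names are illustrative, not Google's code.

```java
import static org.junit.Assert.assertFalse;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;
import org.junit.Test;

// A medium test: a function and its nearest neighbor. The real CartService
// code runs against a mocked remote inventory system, so the test covers
// the interaction between the two features without needing live servers.
public class CartInventoryInteractionTest {

  @Test
  public void checkoutFailsWhenInventoryIsEmpty() {
    InventoryClient inventory = mock(InventoryClient.class);  // fake the remote call
    when(inventory.unitsAvailable("book")).thenReturn(0);

    CartService cart = new CartService(inventory);            // real neighboring code
    cart.addItem("book", 1);

    assertFalse("checkout should fail with no stock", cart.checkout().succeeded());
  }
}
```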
Large Tests cover three or more (usually more) features and represent real user scenarios to the extent possible. There is some concern with the overall integration of the features, but large tests tend to be more results-driven, i.e., did the software do what the user expects? All three roles are involved in writing large tests, and everything from automation to exploratory testing can be the vehicle to accomplish them. The question a large test attempts to answer is: does the product operate the way a user would expect?
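Large tests are often driven end to end through the product's user interface. Here is a hedged sketch using Selenium WebDriver, one common open-source way to automate a browser scenario (not necessarily what any Google product team uses); the URL and element names are placeholders.

```java
import static org.junit.Assert.assertTrue;
import org.junit.After;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

// A large test: a real user scenario spanning several features (search,
// results, navigation), judged by the result the user sees rather than
// by any single internal integration point.
public class SearchUserScenarioTest {

  private final WebDriver driver = new FirefoxDriver();

  @Test
  public void searchingForADocumentShowsItInResults() {
    driver.get("http://example.com");                          // placeholder product URL
    driver.findElement(By.name("q")).sendKeys("quarterly report");
    driver.findElement(By.name("search")).submit();
    assertTrue("results page should mention the query",
        driver.getPageSource().contains("quarterly report"));
  }

  @After
  public void tearDown() {
    driver.quit();
  }
}
```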
The actual language of small, medium and large isn't important. Call them whatever you want. The important thing is that Google testers share a common language to talk about what is getting tested and how those tests are scoped. When some enterprising testers began talking about a fourth class they dubbed "enormous," every other tester in the company could imagine a system-wide test covering nearly every feature and running for a very long time. No additional explanation was necessary.
What gets tested, and how much, is a very dynamic decision that varies wildly from product to product. Google prefers to release often and leans toward getting a product out to users quickly so we can get feedback and iterate. The general idea is that if we have developed some new product or a new feature of an existing product, we want to get it out to users as early as possible so they may benefit from it. This requires that we involve users and external developers early in the process so we have a good handle on whether what we are delivering is hitting the mark.
Finally, the mix between automated and manual testing definitely favors the former for all three sizes of tests. If it can be automated and the problem doesn't require human cleverness and intuition, then it should be automated. Only those problems, in any of the above categories, that specifically require human judgment, such as the beauty of a user interface or whether exposing some piece of data constitutes a privacy concern, should remain in the realm of manual testing.
Having said that, it is important to note that Google performs a great deal of manual testing, both scripted and exploratory, but even this testing is done under the watchful eye of automation. Industry-leading recording technology converts manual tests into automated tests that are re-executed build after build, to minimize regressions and to keep manual testers focused on new issues. We also automate the submission of bug reports and the routing of manual testing tasks. For example, if an automated test breaks, the system determines the last code change that is the most likely culprit, sends email to its authors and files a bug. The ongoing effort to automate to within the "last inch of the human mind" is currently the design spec for the next generation of test engineering tools Google is building.
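As a rough illustration of the triage flow just described (not Google's actual infrastructure), the logic amounts to: find the newest change that touches code the broken test depends on, notify its authors, and file a bug. All of the types here (TestResult, CodeChange, Mailer, BugTracker) are hypothetical.

```java
import java.util.List;

// Hypothetical sketch of automated breakage triage: pick the most likely
// culprit change, email its authors, and file a bug automatically.
public class BreakageTriage {

  private final Mailer mailer;          // hypothetical notification service
  private final BugTracker bugTracker;  // hypothetical bug-filing service

  public BreakageTriage(Mailer mailer, BugTracker bugTracker) {
    this.mailer = mailer;
    this.bugTracker = bugTracker;
  }

  public void onTestBroken(TestResult result, List<CodeChange> recentChanges) {
    // Heuristic: the newest change that touches files the broken test
    // depends on is the most likely culprit.
    CodeChange culprit = null;
    for (CodeChange change : recentChanges) {
      if (change.touchesAnyOf(result.filesUnderTest())
          && (culprit == null || change.submitTime() > culprit.submitTime())) {
        culprit = change;
      }
    }

    if (culprit != null) {
      mailer.send(culprit.authors(), "Your change likely broke " + result.testName());
      bugTracker.fileBug(result.testName(), culprit.id(), result.failureLog());
    }
  }
}
```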
Those tools will be highlighted in future posts. However, my next target is going to revolve around The Life of an SET. I hope you keep reading.
Yes, we are all reading and enjoying every bit of it. Waiting anxiously for upcoming posts.
Though, as you mentioned, tools will be covered in future posts, what is the "industry leading recording technology" that "converts manual tests to automated tests"?
This series is fantastic, thank you.
You mention "industry leading recording technology" ... can you tell us what that is?
Thank you!
I appreciate your constantly sharing so much valuable information with us.
This post mentions that "the ongoing effort to automate to within the 'last inch of the human mind' is currently the design spec for the next generation of test engineering tools Google is building." Is it possible to tell us more about Google's test engineering tools in future posts?
Great series of posts! I do a lot of manual testing and would love to hear more about the industry leading recording technology that converts manual tests to automated tests. I've used Selenium for this in the past but would love to explore alternatives. Thanks!
I really enjoy these posts; it is great to discover what other organizations are doing to improve testing. I was particularly interested in the statement involving "industry leading recording technology..." What does Google use for this, or which frameworks seem to be ahead of the rest? We have tried multiple packages, from commercial to OSS, but all of them seem to fall short at driving the browser and are usually fragile.
Doesn't Small-Medium-Large say more about the scope of a test than it does about the type or purpose of the test (e.g. a performance test can be "small" or "large")?
Ah man, another great post. Looking forward to future posts, especially on SETs & TEs (?).
Yes, all of this will be covered in future posts. Sorry I am slower to get this information out than I would like; this day job thing is really getting in the way of my writing. Our recording technology is called RPF, the Record Playback Framework. It's a Chrome extension that records to JavaScript and does some pretty innovative tricks to solve some of the persistent recording issues on the web. If it helps, we do plan on open-sourcing all this and working with other browser companies to make it more universal.
Thanks for the info, James. I have an offer for an SET role; I already work as an SDET at a rival company ;). This line in your third post concerns me: "SWEs are testers, SETs are testers and TEs are testers." Do SETs ever get to be devs? More importantly, do they get the same level of respect, or are they looked down upon as inferior? Say 5-6 years down the line I might want to move into development; would that be possible? I am genuinely interested in test but would definitely want the flexibility and environment, minus the ego games.
I do manual, performance and automation testing, working on Nokia projects.
From my point of view, automation testing is worthwhile only when its costs (development and maintenance) are not too big. The most important prerequisite for automation testing is that the application is not constantly changing (UI and code), so you don't need to maintain the automated tests all the time.
I guess a mix of automation and manual testing will always work in any company.
Hey James,
Really informative!! What is the career path of an SET at Google? Most SETs/SDETs switch to SDE roles because they are not quite sure they can build a rewarding technical career as an SDET. It would be really helpful if you could cover these aspects as part of your next post as well.
Thank you.
I'm looking forward to possibly test-driving the RPF tool. It's so refreshing to hear your posts, seminars, etc. emphasize the importance of the human element in testing. Whether it is automated testing or manual testing, the human eye and human common sense should never be removed from software quality assurance. I look forward to your next posts. Thanks again!
How do you test scalability, load, etc. at this enormous scale? Do you simulate everything?
Great post; also loved hearing that you guys plan on delivering this tool as open source.