A few questions:

1) Does Google have a manual testing team, or is everything automated? If Google does have a manual testing team, would it be possible to have a blog post on how manual testing is managed at Google?

2) Also, how do you measure an automation team's performance at Google? More specifically, how do you count the percentage of automated testing done for a project?
1) Google does have some degree of manual testing, though we try to focus human effort on finding interesting problems with our products. We try to avoid using manual testing as a way to catch regressions or verify basic functionality. You can probably imagine what kind of effort it would take to exhaustively test something like Google Search. For us, more automation is always the goal.

2) There are a lot of ways to measure how well automated your testing effort is, and which ones matter most varies from team to team. I should mention that we also try to avoid large 'test automation teams' - we treat testing as a shared responsibility and, wherever possible, ask the developers building the products to be involved in developing the associated automated tests. In any case, here are a few ways we look at automation performance (with a small sketch of the first two after this list):

* How often are you finding bugs in manual tests, if you have them? More bugs during manual testing suggests opportunities for improved automation.
* How many defect reports are you getting from the field? Same logic as the above measure.
* If you have identified critical use cases, to what degree is each use case covered by automation (unit, integration, or functional tests)? Some things make more sense to invest in heavily.
* How long does it take to release your product with a full test cycle? Long releases are often bottlenecked by slow manual testing; replacing it with automation will let you release faster.
* How quickly and confidently are you able to make large-scale changes? The more fear you have about change, the more likely you need additional automation!
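To make the first two measures concrete, here is a minimal sketch of tallying bugs by where they were found; the bug sources, field names, and data are illustrative assumptions, not a real Google tool:

```python
# Sketch: tally where bugs are discovered to spot automation gaps.
# The sources and example data below are hypothetical.
from collections import Counter

bugs = [
    {"id": 101, "found_by": "manual_test"},
    {"id": 102, "found_by": "automated_test"},
    {"id": 103, "found_by": "field_report"},
    {"id": 104, "found_by": "manual_test"},
]

counts = Counter(bug["found_by"] for bug in bugs)
total = sum(counts.values())
for source, n in counts.most_common():
    print(f"{source}: {n} ({n / total:.0%})")

# A high share of manual_test or field_report bugs suggests gaps
# worth closing with additional automated coverage.
```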
Good information. I have one question: how do you track how much of a critical use case is automated, and at which test level, versus what part is manual? And how are the results consolidated to gain confidence?
First, we expect all code to be thoroughly unit tested, so when it comes to small, fast tests we can hopefully assume that individual code paths have been covered completely. We can use coverage analysis to verify whether or not that is true. Unfortunately, at this level there is a bit of a 'forest-for-the-trees' problem, as many lines of code will be part of more than one feature, so we don't usually make an effort to connect unit tests to critical feature coverage.

When tests get larger and more distinct, we have some internal test-tracking tools that we can use to define user-level behaviors and then mark each one as tested or not, and automated or not.

When it comes to confidence, teams I have worked with tend to focus on things like defect rate, release failure rate, and release time as high-level metrics indicating overall code health.
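As a minimal sketch of the kind of bookkeeping such a test-tracking tool might do, consider the following; the data model and example records are assumptions for illustration, not Google's actual tooling:

```python
# Sketch: map each critical use case to its coverage at each test level,
# then report what fraction of tested behaviors is automated.
from dataclasses import dataclass

@dataclass
class Coverage:
    use_case: str
    level: str        # "unit", "integration", or "functional"
    tested: bool
    automated: bool

records = [
    Coverage("user login", "unit", tested=True, automated=True),
    Coverage("user login", "integration", tested=True, automated=True),
    Coverage("user login", "functional", tested=True, automated=False),  # manual
    Coverage("checkout", "functional", tested=False, automated=False),   # gap
]

tested = [r for r in records if r.tested]
automated = [r for r in tested if r.automated]
print(f"Automated: {len(automated)}/{len(tested)} tested behaviors "
      f"({len(automated) / len(tested):.0%})")
for r in records:
    if not r.tested:
        print(f"Untested: {r.use_case} at the {r.level} level")
```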
> Be prepared to allocate at least one week a quarter per test to keep your end-to-end tests stable

Could you please elaborate on this? If it takes a month of person-time per year per test, do you simply have very few (but very large?) E2E tests?
First, consider that at our scale, Google's products are the result of many, many services working together. Keeping an E2E test stable when it involves 50+ service dependencies, many of which are being modified on a daily basis, is quite a task. It can take real engineering effort, and often management support, to ensure everyone responsible for those services is committed to enabling automated testing. Even if everyone involved is aiming for the same goal, there is still going to be time spent making sure all this stuff actually works.

As a result, we do often recommend that teams employ full E2E tests sparingly. We encourage teams to use unit and smaller integration tests much more heavily than E2E tests. As the number of systems involved in a test goes down, tests become faster, more reliable, easier to debug, and less costly to maintain. If your products have fewer, or slower-changing, dependencies you may find that a week per quarter is too high. However, a complex test is still software and will require maintenance effort proportional to its complexity.
Good info. When do you run your E2E tests? Are they triggered by a deployment or by scheduled runs?
Each team will vary a little here based on the runtime of the tests. Ideally, we would like to run E2E tests before every commit; however, because E2E tests are often quite slow, we have to run them at a slightly lower frequency. Many teams run these tests triggered by a commit, but after the code lands in the repo, waiting no more than a couple of hours to trigger a run. Other teams run a scheduled job every N hours that runs all the known E2E tests. We tend to decouple testing from deployment and instead wait for a signal from our continuous integration tool letting us know that a particular commit is safe to deploy because all applicable tests have been run against it.
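A minimal sketch of that decoupling might look like the following; `get_latest_green_commit` and `deploy` are hypothetical stand-ins, not any real CI API:

```python
# Sketch: a deploy job polls the CI system for the newest commit that has
# passed all applicable tests, instead of deploying on every commit.
import time
from typing import Optional

def get_latest_green_commit() -> Optional[str]:
    """Ask the CI system for the newest commit with all tests passing."""
    ...  # e.g. query the CI tool's results store (stand-in)

def deploy(commit: str) -> None:
    """Push the given commit to production."""
    ...  # stand-in

def deploy_loop(poll_seconds: int = 600) -> None:
    last_deployed = None
    while True:
        green = get_latest_green_commit()
        if green and green != last_deployed:
            deploy(green)
            last_deployed = green
        time.sleep(poll_seconds)
```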
Could you provide more information about "documenting common test failure modes"? Please give some examples.
When a unit test fails, you can look at the failure and often hop right to the exact line of code causing the problem. Unfortunately, in an E2E test there are often multiple things conspiring to fail your test, so such precision in fault-finding is usually unachievable. We encourage those who write the tests to include human-readable documentation, either in test comments or on the internal team wiki, recording known reasons for a test to fail.

For example, in the system outlined above we are dependent on an authentication system to complete our test. In some cases the authentication system may report an error authenticating a user because of a timeout in one of its upstream dependencies. If this happens more than once, the engineer working on the test might write something like this on the internal team page:

"If you see error logs indicating a timeout in the auth system, you can assume this was down to quota issues in bigtable storage."

From then on, anyone tasked with fixing the tests can refer to this note when attempting to diagnose a failure.
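A minimal sketch of keeping such notes machine-searchable follows; the signatures and notes are illustrative, mirroring the auth-timeout example above, not a real runbook:

```python
# Sketch: a small lookup of documented failure modes so that whoever
# triages a failing E2E test gets the team's accumulated notes first.
KNOWN_FAILURE_MODES = {
    "timeout in the auth system": (
        "Likely quota issues in bigtable storage; check quota dashboards "
        "before debugging the test itself."
    ),
    "connection refused": (
        "A dependent service may not have finished starting; re-run once "
        "before investigating further."
    ),
}

def triage(failure_log: str) -> str:
    for signature, note in KNOWN_FAILURE_MODES.items():
        if signature in failure_log:
            return f"Known failure mode: {note}"
    return "No documented failure mode matched; investigate from scratch."

print(triage("ERROR: timeout in the auth system while fetching token"))
```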