Days Left | Pass % | Notes
1 | 5% | Everything is broken! Signing in to the service is broken. Almost all tests sign in a user, so almost all tests failed. |
0 | 4% | A partner team we rely on deployed a bad build to their testing environment yesterday. |
-1 | 54% | A dev broke the save scenario yesterday (or the day before?). Half the tests save a document at some point in time. Devs spent most of the day determining if it's a frontend bug or a backend bug. |
-2 | 54% | It's a frontend bug, devs spent half of today figuring out where. |
-3 | 54% | A bad fix was checked in yesterday. The mistake was pretty easy to spot, though, and a correct fix was checked in today. |
-4 | 1% | Hardware failures occurred in the lab for our testing environment. |
-5 | 84% | Many small bugs hiding behind the big bugs (e.g., sign-in broken, save broken). Still working on the small bugs. |
-6 | 87% | We should be above 90%, but are not for some reason. |
-7 | 89.54% | (Rounds up to 90%, close enough.) No fixes were checked in yesterday, so the tests must have been flaky yesterday. |
While I share the opinion, I have a problem with measuring the shape. Just out of curiosity, how do you suggest measuring the mix of unit/integration/E2E tests?
Comparing the coverage they generate, a few E2E tests can produce much higher coverage than several unit tests. Comparing counts, having thousands of unit tests and fewer than 100 E2E tests would still be presented as a pyramid (at least by the given percentages), but the E2E part may still cause so many problems (time, effort, test environment problems, and the value of the tests) that we can say: we have the pyramid, but the goal is not achieved.
It can be hard to directly measure the unit/integration/E2E ratio for several reasons. However, deviating from the test pyramid has byproducts you can measure, such as increased test runtime and more flaky tests.
Let me use sorting algorithms and running time as an analogy. Quicksort can take O(n^2) time in the worst case, but that worst case is rare enough that the expected runtime of quicksort is still O(n log n). However, if you use a sorting algorithm that always hits that O(n^2) worst case, for example selection sort, then the expected runtime inflates from O(n log n) to O(n^2).
Think of E2E tests as your worst case. If you have a small number of E2E tests, the overall runtime of all your tests will still be quite reasonable. However, if you mostly use E2E tests, then your test runtime (and the number of test flakes) will inflate significantly.
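To put rough numbers on the analogy (the timings below are made up; only the ratios matter):

```python
# Hypothetical per-test costs: unit tests in fractions of a second,
# E2E tests in minutes. Only the relative sizes matter here.
UNIT_SEC, INTEGRATION_SEC, E2E_SEC = 0.1, 5.0, 300.0

def expected_cost(unit, integration, e2e):
    """Expected seconds per test for a given mix, like an algorithm's expected runtime."""
    return unit * UNIT_SEC + integration * INTEGRATION_SEC + e2e * E2E_SEC

print(expected_cost(0.70, 0.20, 0.10))  # pyramid mix: ~31 s per test on average
print(expected_cost(0.10, 0.20, 0.70))  # inverted mix: ~211 s, the worst case dominates
```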
I agree with the main idea, but it's nothing new. Let's look at the V-model in testing.
I would add one thing: before unit tests it would be nice to perform a code desk check (static testing), the first step in the testing chain.
In my testing chain, TDD would come first rather than a code desk check. If you want to test your code, why not simply do it beforehand? It might be quicker than pen and paper, more reliable, easier to reproduce, easier to extend... and I certainly find it more rewarding, from a motivational perspective, to go from red to green than to write tests afterwards hoping that they turn green right away (and hoping that this 'green' is somewhat meaningful).
Hey Mike,
Thanks for the article. I think this sentence is worth highlighting: "The exact mix will be different for each team, but in general, it should retain that pyramid shape."
A typical path for a test automation engineer is the following: 1) we do everything as close as possible to the real user's experience; 2) oh well, those tests are too slow and unstable; 3) let's move to unit tests; 4) oh well, unit tests are good and green, but we miss some important bugs here; 5) both unit and end-to-end tests are important. I don't mention integration tests here, since it's too general a term, and they may differ in size and value even within one project, let alone across different projects and teams.
Also, sometimes end-to-end tests are built upon API tests, which may be considered unit tests to some extent. So when we talk about percentages, we should take that into account as well.
With all that in mind, here is my point: yes, the pyramid makes sense, but don't pay too much attention to 70/20/10 or anything like that. Think in terms of _your_ product, its specifics, its challenges, and build your strategy and tactics on that.
I tend to take the opposite approach, starting with unit tests and only using larger tests when unit tests clearly are not sufficient.
As a useful thought experiment, pretend that you could only write 10 E2E tests, and ask yourself where those tests would go. As you said, each product has its own unique specifics and challenges, so the answer will be different for each product.
The testing pyramid can generalize to any product, and the problems associated with too many E2E tests will affect all products, but what will be unique for each product is where unit tests become insufficient and larger tests are needed.
Mike, I got your point. And this thought experiment seems to be useful. Let me share some thoughts, though.
Suppose your product is fundamentally a web interface. In this product,
1) Some UI operations can be executed without the UI, via an API or a command-line interface. So you can write some basic unit tests for single operations, but are you going to verify the UI operations as well, to make sure, say, that not only are the core operations successful, but also that the changes made in the browser are delivered to the core functions? Also, some operations in the UI may require preliminary steps. Each step may be considered a test in itself, but the real value comes from the whole chain of steps, because each step can be successful while the chain is not. What if your guesses about the necessary E2E tests for your product are wrong, and with all your formal unit test coverage you miss important scenarios?
2) Other operations are intended to run in the UI by their nature. Say, your product opens an RDP session to some computer in your browser and runs some UI-based operations there. Will you be satisfied with some mocks/stubs imitating the remote computer's behavior, or will you try to handle real sessions as well?
You say that E2E tests are not fast, not reliable, and hard to debug. But what if you are able to make them sufficiently fast, reliable, easy to implement and change when necessary, and you can easily understand test failures from their results? Would you say yes in that case?
I still agree with the concept of the testing pyramid, though.
I agree with Mike's idea and I'd like to contribute one more argument.
Requirements can be categorized as concepts, facts, or routines (processes). "Customer" is a concept. Concepts might or might not be described by stating facts, among other things. "A purchase is made by a customer" is a fact. A fact links two or more concepts. "After a purchase is made, the customer data is updated" is a process. A modification to a concept is usually demanded by a disruptive change in the business scenario. A modification to a fact is usually demanded by a structural change in the business scenario. A modification to a process can be demanded by many sorts of things, including the weather forecast. It's possible to make minimal and gradual changes to an application flow (the process) with no negative impact on the user experience, but even these small changes are likely to break the E2E automated tests. Maybe we now need to start thinking about some adaptive feature for E2E implementations, if not available yet.
To handle that in a real scenario, I implemented an additional execution mode (Human Assisted) that, before failing, asks the human assistant to make a change to the test case in order for it to keep succeeding. By doing this we could reach 40% E2E automated test coverage for a mobile banking application in a way that our client accepted. It represented 200 of a total of 500 application flows covered by E2E automated tests.
Yes, we still do. If you're trying to sell the testing pyramid to someone, using small/medium/large instead of unit/integration/E2E may make it an easier sell.
You should mention the FIRST properties of unit tests.
FIRST (Fast, Independent, Repeatable, Self-validating, Timely) should be applied to all tests as much as possible, but the bigger the scope, the harder it gets.
Tests, as well as monitoring of all sorts (app-level monitoring, host, user, KPI), are all part of the immune system of your software, IMO.
I agree with the 70/20/10 approach, but on top of the pyramid I would add another pyramid of monitoring. I argue that well-thought-out monitoring is more effective than tests in many cases, particularly in CD (continuous deployment), where MTTR (mean time to recovery) is far more important than MTBF (mean time between failures).
I'd go with 50/50 between testing and monitoring, time-investment-wise, at least in a CD scenario.
BTW, having to wait for tests (any test) to run at night doesn't make sense in many cases anyway, CD included.
Coming from the CD (Continuous Deployment) perspective, I think things are a little different.
With CD, the complete "immune system" means that monitoring (different types of monitors) is part of the immune system alongside tests, and they complement the tests (other components in the immune system are code review, static code analysis, etc.).
Interestingly, monitoring resembles testing in many ways. You have application-level monitoring, which is usually similar in scope to unit tests (it usually monitors individual in-process components, e.g., the size of an internal memory buffer, operations/sec, etc.); you have host-level monitoring (CPU, disk, etc.), which is similar in concept to integration tests; and you have KPI monitoring (e.g., # of daily active users), which takes the user perspective and is similar to E2E tests.
The picture would not be whole without mentioning monitoring since, IMO, monitoring comes at the expense of testing: developers either invest time in tests or in monitoring (or split their efforts between the two).
I would argue that, at least in CD, where MTTR (Mean Time to Recovery) is far more important than MTBF (Mean Time Between Failures), monitoring takes precedence over tests. I would draw yet another pyramid, a monitoring pyramid, on top of the testing pyramid, such that 70% is application-level monitoring, 20% host monitoring, and 10% KPI. And the entire effort between tests and monitoring should be split 50/50 (or some other number that makes sense for your use case; in some cases it's 90/10).
Again, I'm speaking from the perspective of CD, which may or may not apply to some Google systems, but many dev organizations tend to like it.
BTW, speaking about putting the user in the center: delivering value fast and being able to verify the value with actual users in a matter of hours, which is the core value of CD (fast feedback, including the user in the feedback loop), *is* putting the user in the center.
BTW2, a feedback loop needs to be on the order of a few hours at most (minutes sometimes), *including actual users* in the loop, not just automated tests. As such, running E2E tests during the night simply makes no sense.
Monitoring was not in scope for this blog post, but I do agree that monitoring is important, and that good monitoring will catch bugs that even good tests miss. There is a trade-off at times between monitoring and testing as you said, but they're not always mutually exclusive.
Monitoring is not particularly useful, for example, if the code for your service doesn't even build. And if all your tests fail, you probably don't need monitoring to know that if you try to deploy that service in its current state, everything will break.
Your service doesn't need to be perfect before you deploy it, but it does need to meet some minimal quality bar before monitoring becomes useful. And tests are how you get it to that bar.
Hello,
I would like to translate the contents of this blog into Korean on my blog.
Is that possible?
Have a good day.
Sounds like test instability and timeliness are your biggest beefs (that addresses basically everything in 'What Went Wrong').
Just throw a thousand instances at the problem and have your results in (overhead + longest_test) time. I've done something similar, but with only 300 instances, some years ago, and we had E2E results in 12 minutes after EVERY commit.
Benefits:
+ You can isolate the test (a general cause of instability)
+ Results are quick and can be traced to a specific commit
+ Comparatively little waiting period for results
That said, if your labs can't keep themselves up, you have no business in the E2E testing space.
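Roughly, the scheduling idea in code (durations and counts invented for illustration):

```python
# Sketch of "one instance per test": wall-clock time collapses to roughly
# the fixed overhead plus the single longest test, not the sum of all tests.
from concurrent.futures import ThreadPoolExecutor
import time

def run_e2e_test(name, duration_sec):
    time.sleep(duration_sec)  # stand-in for running one E2E test on its own instance
    return name, "PASS"

tests = {f"test_{i}": 1 + (i % 5) for i in range(300)}  # invented durations, 1-5 s

start = time.time()
with ThreadPoolExecutor(max_workers=len(tests)) as pool:
    results = list(pool.map(lambda item: run_e2e_test(*item), tests.items()))
print(f"{len(results)} tests finished in {time.time() - start:.1f}s")  # ~5 s, not ~900 s
```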
Not everybody has the resources or funding to just throw a thousand instances at the problem, especially as they accumulate more and more E2E tests. And building and deploying your service is typically part of the process of running E2E tests. For you, that doesn't seem to take a long time, but I've worked on teams that couldn't even build their service in 12 minutes, much less build, deploy, and run tests in 12 minutes. In short, I have doubts about whether that approach can scale beyond your specific situation.
But even if you can get it down to 12 minutes (and your E2E tests are not flaky), that's still slow compared to < 1/10 of a second for each unit test. If you want developers regularly running some tests before they check in, unit tests are the way to go.
Sure enough, developers would prefer running unit tests rather than E2E ones. But should this criterion be the most important? Please consider two options:
1) Running E2E tests takes more time than unit tests, but you have the opportunity to run these checks because you consider them necessary;
2) You don't run E2E tests at all, or run only a small number of them (with a percentage in the testing pyramid you consider acceptable).
In both options, developers will only run unit tests, but in the first option, you will have deeper coverage and more certainty about product quality. Well, in the worst case, developers will be informed of the results after check-in, but better late than never. In the best case (the E2E tests are not too long, and developers are not hurrying to check in), you will kill two birds with one stone (better coverage and running tests before check-in).
To ignore the benefits of both end-to-end and unit testing is a mistake. That said, this article ignores some of the more difficult problems with unit tests. For one, they can create a barrier to refactoring, especially if that refactoring breaks many tests and it's now the tests that are wrong, not the code that has been refactored. Moreover, if you need a significant amount of state in order to complete the test, a unit test is unlikely to give you the results you need.
With end-to-end tests, it's likely they will not break when refactoring. If there is a failure in an end-to-end test, then of course you need to isolate it, and that is when (smaller, more focused) unit tests are immensely useful.
It's not just a question of coverage and quality - it's a tradeoff between quality and velocity.
I sometimes will run my unit tests 12 times in the course of minutes, not once every 12 minutes. If my team takes over an hour to build, waiting on E2E tests gets much worse.
In my composite sketch, the problem was never that the E2E tests had bad coverage. The problem was that relying on them delayed the release and forced developers to work overtime. Delayed releases and slow bug fixes are good for neither the user nor the developer.
Even if the testing pyramid only gets you a B in terms of quality and coverage while the E2E strategy gets you an A (I don't believe that's true, but will assume so to make a point), is going from a B to an A worth it if it takes you twice as long, for example?
Doesn't this stance contradict your insistence that Google is user focused? How does this approach reflect that mantra? It seems you think that users only care about getting that new feature as fast as possible. This has been shown not to be true in numerous studies. There is a balance, I will grant that. Customers will not wait forever for a new feature, especially when dates were given ahead of time that need to be pushed out, but they will be a far happier customer, one more likely to expand their relationship if what you deliver does what they want and does it without being interrupted by a string of minor defects that can be resolved quickly when found. Your ability to fix quickly is meaningless in a business model where Land and Expand is integral to success.
Your post paints a very negative picture of e2e tests, which I fear will be taken out of context by VPs everywhere and ruin product quality everywhere, because gosh-darn-it, Google says e2e testing is bad and doesn't help, so we aren't going to do it.
The decision of what to test, and to what degree, should be driven primarily by information. What sort of analysis do you do on escaped defects, and how does that drive the test efforts and test types? I have witnessed on more than one occasion a defect that was not caught by existing unit and integration tests and made it to the field because a decision was made that time constraints dictated that the e2e tests could not be run in full. Those e2e tests would have caught the defects in question. Defects that cost the company far more in operations, tech support, and ultimately dev time than it would have had they run the tests up front, and that is on top of the ruined reputation with the customer base and the negative costs associated with that. Maybe Google doesn't care as much because of the nature of the relationship it has with its users. We, after all, do not buy your software. Your money comes mostly from ads. I'll even concede that in your case you are probably right to have your outlook. Users of mobile apps, browsers, and internet-based apps EXPECT failure and so are more tolerant. My opinion is that people responsible for quality should be embarrassed by that. Instead, it looks like we embrace it and use it as an excuse to allow the continued release of shoddy code just to get it in the hands of customers a few days early.
All that being said, I agree that Unit Tests and Integration tests must be done and are the foundation for all tests going forward. Having Developers responsible for quality and equal partners in delivering on quality and tests is essential to success, but the best unit and integration tests in the world will let bugs out the door that good e2e tests would catch. The important thing is to continually observe your results and the impacts to the customer/end product and target test improvements based on that knowledge. Maybe you agree with that, maybe you don't but your post makes it seem like we should turn our back on e2e and that is just not a good idea. Please read all hostility as passion, not aggression.
Nice article Mike -
While I agree in principle about IT and software delivery, I am not sure I am on board with this statement: "Although end-to-end tests do a better job of simulating real user scenarios, this advantage quickly becomes outweighed by all the disadvantages of the end-to-end feedback loop"
Have we reached a maturity level wherein the software building process has become so standardized and defects so predictable?
I would argue there are a whole lot of systems which still put user feedback, via simulated end-user flows, on a higher pedestal than faster feedback.
One key reason for E2E tests is that simulating all aspects of user behavior (the fundamental purpose of the application) is too tedious at the unit level.
It is great to see most orgs adopting mature and faster dev practices, but jumping into it without setting the house right is, for me, the biggest risk :)
I do, however, subscribe to your thought that building a layered architecture is the need of the hour :)
Thanks for sharing your experience.
I'm new to TDD. I'm reading "Growing Object-Oriented Software, Guided by Tests" by Steve Freeman. The author has a very interesting argument for end-to-end tests:
"Running end-to-end tests tells us about the external quality of our system, and writing them tells us something about how well we (the whole team) understand the domain, but end-to-end tests don’t tell us how well we’ve written the code. Writing unit tests gives us a lot of feedback about the quality of our code, and running them tells us that we haven’t broken any classes—but, again, unit tests don’t give us enough confidence that the system as a whole works."
So I understand this statement as: end-to-end tests give us feedback and tell us whether we are moving in the right direction. After reading your post I got the feeling that end-to-end tests are a waste of time. Don't you think they play a vital role in the early stage of development?
You can have bad E2E tests that externally simulate things real users don't do; just because a test is E2E doesn't necessarily mean it represents the user. You can also have good unit tests driven by user scenarios, which test the specific task a unit would be given for a particular user scenario, as opposed to testing the unit as some abstract entity.
Quality and user impact are measured in both visible and invisible ways. A bug where an implementation of equals() is broken could easily break the entire system and have a severe user impact. However, it's obviously harder to visibly explain the impact of that bug in terms of a specific user scenario or a specific E2E test.
Unit tests in general will never give you feedback about external quality. You can have a wonderfully written class with unit tests which is wired wrongly into a system, and then you have nothing but a false positive:
E2E is the only place where you can phrase external constraints.
It can still be valid to decrease the number of E2E tests for specific reasons, but unit testing cannot be an alternative to E2E in any respect.
Got your point. I think for finance-domain applications, stakeholders give more importance to E2E automated tests, as they want to ensure the end-user experience or customer journeys meet the expected behavior. These tests don't necessarily serve that purpose when designed badly, and they generally concentrate on proving that something works. They are under the wrong pretext that you can replace manual tests with these E2E tests.
I feel like the title is misleading. I disagree with the title, but I agree completely with the article.
ReplyDeleteE2E tests are important - but you can't rely ONLY on them.
E2E tests are good for quality assurance; unit and integration tests are an aid to developers.
Completely agree. The title should be "E2E coverage vs. velocity".
What are your suggestions for legacy systems? There, the benefits of automated end-to-end tests are much larger than those of unit testing or acceptance testing. For new functional development, I completely agree with your approach.
At the moment, we are concentrating on automating manual end-to-end regression tests to cut down our release cycle. We plan to add integration/unit testing to identified problem areas. Could you suggest an alternative approach?
Buy a copy of Working Effectively with Legacy Code by Michael Feathers :) Beyond that, you should measure progress not by whether you have a pyramid or not, but relative to where you were before. Even if you won't have a proper pyramid for a long time, does it look more like a pyramid today than it did yesterday?
Will do. We were not targeting the pyramid, but we wish to achieve it during the journey. So for a legacy system which has no unit test coverage, what would be your suggestion?
1. Write E2E tests (we will use Robot Framework) - this would give us the most benefits.
2. Write unit tests - faster feedback, but with very low coverage at first.
3. Write tests for subsystems which encompass a lot of classes and represent a fairly big unit of work - using something like approval testing.
Please ignore these if the book already answers them. BTW, I did not understand "but does it look more like a pyramid today than yesterday?" Can you please elaborate?
On a legacy code project with initially no unit tests, we made it our practice to write unit tests for all new classes that were added (preferably with TDD). When we needed to change an existing class, we would do that TDD-style by adding tests for the changes. We would do the minimum required to that class to pull its dependencies out of the primary constructor, create a new constructor in which we could pass in mocks and stubs, and test against that. These refactorings were generally low-risk, as we only made the minimal changes required. We gradually added more and more coverage this way. And the classes that never changed were at lower risk of breaking and were OK not initially being unit tested.
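A minimal sketch of that pattern (hypothetical names, not the actual project):

```python
# Hypothetical sketch: keep the existing no-argument wiring for
# production, accept a test double in the constructor, and test against that.
class RealBillingGateway:
    def charge(self, amount):
        raise NotImplementedError("talks to an external service in production")

class InvoiceService:
    def __init__(self, gateway=None):
        # Default preserves the old wiring; tests inject a stub instead.
        self.gateway = gateway if gateway is not None else RealBillingGateway()

    def settle(self, amount):
        return self.gateway.charge(amount)

class StubGateway:
    def __init__(self):
        self.charged = []
    def charge(self, amount):
        self.charged.append(amount)
        return "ok"

def test_settle_charges_the_gateway():
    stub = StubGateway()
    assert InvoiceService(gateway=stub).settle(100) == "ok"
    assert stub.charged == [100]
```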
I too am a believer in the test pyramid. Just to add: I believe in not repeating tests, i.e., if something can be tested at a lower level, push it to the lower level and try not to have the same validation at a higher level. Also, we should aim for ~100% unit test code coverage, as unit tests are the first and strongest line of defence.
Can I post this article on my blog, giving you due credit? It is really an eye opener for QA managers.
ReplyDeleteThat's fine, as long as you both link to the original article and give due credit as you said.
What are your thoughts on acceptance tests? They are E2E in nature.
Do you think working in a dynamically typed language (such as Python or Ruby) changes the arguments here in some way?
The fundamental argument is the same. Additionally, you may need a few more unit tests to guard against things that would normally be caught at compile time with a statically typed language.
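For instance, here's a toy sketch of such a guard test (invented names):

```python
# Toy example: in Python, total_price("3", 2) silently returns "33" instead
# of failing at compile time, so a unit test has to guard against it.
import pytest

def total_price(quantity, unit_price):
    if not isinstance(quantity, int):
        raise TypeError("quantity must be an int")
    return quantity * unit_price

def test_rejects_string_quantity():
    with pytest.raises(TypeError):
        total_price("3", 2)  # a statically typed language would reject this at compile time
```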
I think a lot of posters are ignoring the importance of letting your tests drive your design. Thinking about how you are going to test your code encourages you to design good abstractions in your classes and services, and should allow you to test business processes at the unit or integration level. When the tests exist in close relation to the function or process, the tests are likely to stay relevant and up to date. Having worked on a team that had extensive (many thousands of) Cucumber E2E tests, we ended up in a situation where engineers were maintaining tests while being unsure if the tests were still actually relevant or simply legacy remains. Because they are E2E, by definition it is hard to define ownership of these tests in relation to any particular codebase, library, or service, and they end up as poorly maintained 'common' code that no individual feels they have the right to delete. Inevitably the tests continue to grow and build times get out of hand. If you are doing TDD using E2E tests, the results can be disastrous, with logic scattered all over the code base.
By all means, have E2E tests, but keep them broad and shallow - i.e., the 10% described in the article.
How about this? Stop blaming the E2E test methodology, and instead blame the developers for not doing a good job. I think developers are not capable if they break 10 things to fix 1 thing. Coming from a defense background, I see that developers in web technology don't take responsibility and accountability seriously. If you are likely to break features that already worked before, maybe it's time to go back to school.
I can't believe this was written by a Google engineer... It's like promoting the approach of "my module works": when you use unit tests only, everything can work fine on its own, but not in collaboration. How is that not obvious to a Google engineer? I'm sure it even sounds offensive to a lot of Google engineers, especially those who work on e2e-testing tools. See how many downloads one of them has on npm: https://www.npmjs.com/package/protractor
It's the most awful article that can be found on this blog.
Woah! Hold your horses and cool your engines! He is not saying throw away E2E tests. He is only talking about the right balance. His initial analogy using Big-O notation explains it very well. Fast turn-around time is very important. CD is very important. It not only helps to deliver new features but also faster bug fixes. The above pyramid could give you a B-grade quality, but an A-grade turnaround. However, E2E cannot guarantee A++ quality. What does that translate to? Well, imagine you are testing a passenger plane. Even if it is E2E tested, when you are at 30,000 feet and have a fault, you are going to crash before a fix is delivered to you. However, if you have a mechanism that identifies the problem quickly, and the fix is delivered to you while you are airborne, then you are in much better shape.
Thanks Mike for your fantastic article!
Couldn't agree more, and I would even aim higher than 70% unit tests; all that E2E testing is killing organizations with overcomplicated and failing tests.
The E2E tests should run a flow through the UI to see that the pieces are connected and nothing is broken.
And no matter what, keep the E2E code in the team's repo, not an external repo!
What about refactoring? Isn't it harder to continuously evolve an OO design when every class has a corresponding unit test? (Every time you throw a class away, you throw away the corresponding unit test and then write new tests for the new replacement class(es).)
With end-to-end tests, the core design of the software can be refactored (as often as it takes) without the need to refactor the tests (if the user-facing API stays the same).
This is not to say that end-to-end tests are better than unit tests, but I think that refactoring is a very frequent activity in agile software development and should be taken into account when comparing different testing approaches.
I agree with the pyramid in theory, but not always in practice. When working on large legacy systems with no automated tests, I recommend inverting the pyramid. No one has the budget or time to backfill unit tests. Transforming manual testing organizations means taking what they have and improving it incrementally. E2E automation, and later integration, shows fast ROI, which gets management to fund unit automation for new and modified features.
Hi Mark, I worked on projects with these characteristics: a large codebase but a lack of coverage. All attempts to bring quality and speed into development that used E2E tests as a starting point seemed to fail due to the still-poor quality of the code.
If the code isn't easy enough to write unit tests for, then I see some ideas to get good results in weeks:
- Start writing integration tests for the most important components.
- In parallel, start refactoring the code of these components / writing unit tests.
If developers don't write unit tests, it's because they don't know how. Once they know how, I'm sure they will enjoy it and be more productive, since they can verify their code within seconds rather than hours or days.
At the end of the day ROI is about product quality and speed of development. Management should ignore the rest and focus on the product itself.
The systems I work on are multi-million line code bases that have existed for decades. These systems have static well-defined interfaces with other systems in our enterprise. This means that we can write E2E tests against the interfaces without being coupled with the implementation (good or bad).
With such large code bases, hundreds of developers who don't understand anything about API design or automated unit tests, and a fixed budget, schedule, and feature set, using the pyramid as recommended requires years of multi-discipline cultural change. I believe that change starts with writing E2E tests: simulating the externals and asserting on results is a cheap way to verify the basic functionality of the systems from the perspective of those externals. One or two people are all you need to start the revolution and to show immediate ROI in terms management cares about (externally visible system function). Extending the revolution to unit tests as recommended is a huge investment for behavior below management's radar - a very hard sell.
My gist of the pyramid is: do not try to cover edge cases in end-to-end tests. For example: client-side validation, grids without data, the DB being down, the network being out. Ideally you can test edge cases at the unit level. When you can't, you may end up with an extra end-to-end test. However, for every feature, there should be a few cases where you mimic the user in a typical usage scenario, which makes sure that the unit-tested parts work when they come together.
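For instance, the "DB down" case above can be sketched at the unit level with a stub (invented names):

```python
# Invented example: the "database down" edge case covered with a stub,
# instead of taking a real database offline in an end-to-end test.
class DatabaseDown(Exception):
    pass

def load_dashboard(db):
    try:
        return {"rows": db.fetch_rows()}
    except DatabaseDown:
        return {"rows": [], "error": "Service temporarily unavailable"}

class StubDb:
    def fetch_rows(self):
        raise DatabaseDown()

def test_dashboard_degrades_gracefully_when_db_is_down():
    result = load_dashboard(StubDb())
    assert result == {"rows": [], "error": "Service temporarily unavailable"}
```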
ReplyDeleteI'm not 100% convinced.
If a developer introduces a bug in the login or save functionality, I definitely want most of the end-to-end tests to fail. Something is very very wrong!
But. There definitely needs to be a detailed suite of unit tests in existence around logging in and saving! So the bug should also break at least one or two unit tests.
So: focus on the unit tests first.
If you have a *lot* of e-2-e tests failing and no corresponding unit test failing, the problem is probably that you are missing some unit tests. If possible, write one or more unit tests that capture the issue. Code coverage tools can help a bit. More often than not, after adding the missing tests (which should initially fail, since they are meant to capture a bug that only surfaced in e-2-e) and then fixing the unit test failures, the majority or all of the e-2-e tests will pass again. There are obviously e-2-e test failures that cannot be reproduced in a unit test environment. When that's the case, you definitely want the failing e-2-e test in your suite!
Also, the idea of shipping if, say, 90% of e-2-e tests pass sounds ludicrous. If the failing tests are out of scope, take them out, or replace them with something that passes. Shipping with "10%" e-2-e test breakage means you don't have a good mental model of what you're shipping. So throw away the offending tests if you need to, but for every test you throw away, you should be able to determine whether it means that you are ditching some features, or need to prevent some edge cases, or that the tests were not (or are no longer) valid.
Automated e-2-e tests are a great thing. You don't necessarily have to apply them to every build if that slows you down. They are definitely more brittle than unit tests. That's because there are a lot more moving parts in an e-2-e test than in a unit test! Same as real life :-)
Good e-2-e tests can protect against tricky regressions, where a lot of moving parts are involved.
Also, in your scenario of doom, you have a list of things that happen and completely derail your planned release.
That the release gets derailed is a GOOD thing. I don't want to ship code if it was tested against moving targets / unstable environments.
I definitely want to delay the release if a developer bungled the login functionality.
Those are all valid reasons for stopping the show.
The e-2-e tests that stop the show when things like that happen are your lifeline :-)
Hmm. This article seems to be implying that e2e tests will cause your development cycle to explode unless you abandon them in favor of faster unit and integration tests. I'm all down for smaller tests, but I don't think having an e2e suite is going to kill you. It just shouldn't be the only line of defense you have.
In the fake scenario, the devs lost over three days because they were apparently helpless to see whether the changes they made to the code were good or not until they got the results back from the e2e test suite. Most devs I know have some kind of Docker or Vagrant sandbox where they can see their change in action and can run at least some kind of manual testing right at their desk. This doesn't catch everything, but it would mean the three days "wasted" because they didn't know their fix was bad is a little out of bounds. I also think the day lost to hardware failure in the test lab is exaggerated too. That maybe happens once every few years unless you have the most crappy and complicated test setup in the world.
Other than flaky tests, it seems that all the issues in this article stem less from having too many e2e tests and more from not having enough unit tests or a proper development environment. It's true that devs will still need to wait for their code to be deployed until after all the e2e tests have finished (and passed), but that doesn't mean developers can't get feedback from other sources before that and fix issues they find. Also, adding a little logging to your e2e tests makes it a billion times easier to track down why a test failed. Just sayin'.
What is your recommendation when using an Agile approach? In Agile, testing is unit by unit. How do we test the whole flow in a large project? Using unit testing won't let us know if things will work when everything is completed.
The article assumes a lot of things about the way development is done, and it does have valid points for a true agile development/testing organization, but this is not the case in many organizations around the world.
Google makes great products and can be seen as one of the trend setters in software development, but the world does not revolve around Google or other similar hi-tech companies, and I hope no one takes the views in this article as the single truth of how development/testing is or should be done... instead, it provides a very narrow and limited view!
I had the privilege as a consultant to witness a variety of different types of development organizations. Why things were done in a certain way was in many cases due to the nature of the developed application, or because of history (15 years ago it was not so mandatory to create unit tests, and a lot of products with this burden still exist), and in many cases the challenge was not in the feedback cycle, and e2e tests were extremely valuable.
I think this article has many valid points but some invalid ones. It treats E2E as evil and an avoidable task. In my experience, all tests are important in their timeframe in the development process: unit testing when writing the software, integration testing when a feature is ready and can be integrated with other components, and then E2E testing. E2E testing is very useful for detecting those intangible bugs: components can work perfectly on their own (and that's what unit testing helps to accomplish), but once they are delivered, the workflow of an application can be incomplete, not user-friendly, or simply wrong.
ReplyDeleteHey Mike I am working for Target and I am busy nowadays convincing my leadership that we should bring API testing in place specially for products where the UI is evolving and the UI is not stable.
ReplyDeleteAs we are centralized testing team and there are some other module specific testing teams also.
There are two questions from Leadership :
1) Is the bug detection count going to increase as result of API testing.
2) If the module teams are doing the API testing then when centralized testing team check the flow from startpoint to end point; how will that differentiate us from them in terms of testing differently and value addition.
As per you what are the answers for them .
Thanks in Advance.
Never say NO to more e2e tests. Everyone agrees we need e2e tests, because unit tests & integration tests are not reliable. THEY MISS BUGS which seriously impact the user. Why put so much time into them when we can put equal time into creating effective e2e tests which WILL CATCH bugs? This discussion will always continue as long as we allow developers to write/discuss testing. Developers only look out for themselves when discussing how bad e2e tests are.
ReplyDeleteWhat went wrong:
- The team did not have a hermetic environment for their integration tests.
- The team did not run their integration tests _before_ merging in their changesets.
- The team did not remember that they can actually _revert_ a changeset that broke the tests.
- The team failed to realize that debugging failed integration functionality takes even more time than debugging a test scenario (debugging sucks, testing rocks, remember? ;) ).
- The team failed to write sufficiently many unit tests _in addition_ to end-to-end tests.
- The team was using flaky end-to-end tests.
- The team was using end-to-end tests that took too long.
I read the whole blog post now, and although the title sounds provocative, a colleague of mine pointed out the "more" keyword in the title. I agree that there should be a balance between unit tests and e2e tests, but solid e2e tests must still exist.
Unfortunately the real world isn't that easy. The number of possible tests increases when I combine units into modules and modules into applications. And most often applications have interfaces to other applications, so the number of possible tests increases again. I agree that all types of tests in the pyramid are necessary. But it is not possible to give a ratio like 70/20/10 in general. Some people state they have 75% test coverage, for example. If you ask them how they measure this coverage, they refer to executed lines of code. But in reality their test coverage is much smaller, as complexity increases with integration. So the art is to find the right unit tests, the right integration tests, and the right e2e tests. You will always have to apply a risk-based approach to find the right tests.
ReplyDeleteAbsolutely misleading and damaging title !
ReplyDeleteYou should name it differently... "E2E coverage VS velocity" or "E2E trade offs" or something like that.
Article itself is a collection of materials from other blogs and articles ?
If you have a problem with execution speed, there are tons of ways to speed tests up:
- parallelize your tests;
- manage them properly with suites (execute only the tests that touch the area affected by the change);
- use a "hybrid" test framework approach, for example, making API calls for test preparation instead of doing it via the UI (a rough sketch follows).
If you have "flaky" tests, then, 95% of the time, it is lack of tester's skills on how to design robust tests.
UI(E2E) tests are as useful as any other tests if done properly, and must be used along with unit and API tests in the right proportion and preferably in "hybrid" framework.
Nice article for understanding the testing pyramid. Regarding JUnit vs. integration tests, I am really confused about the worth of integration tests. With JUnit you test only one unit at a time, and the second unit is fully mocked for all its behavior. Now, when I have mocked all the behavior of the second unit for the first unit, creating an integration test will not make a difference: the communication between the two objects is already tested by mocking all scenarios. So in that case, should we really opt for integration tests?
Thanks,
I think this article really deals with larger, enterprise projects. Smaller projects, particularly those with a great deal of success hinging on user interactions, benefit greatly from end-to-end interface driven testing. I can see how in a larger project they may lose value in many scenarios.
It's really hard to release a product without end-to-end tests when the code base is complex. Unit and integration tests are a great place to start, but E2E tests combine all those different components and test them together. We find more bugs with E2E than with unit and integration tests; imagine a phone release without E2E testing, how well would it work? There are lots of ways of speeding up the testing cycle, and better-designed tests can run in minutes rather than hours or days. The perfect strategy is unit and integration tests gating the master branch, with nightly automated system tests kicking in to find the rest of the issues.
This doesn't fit every product, as products which end up with real users and are complex need to be tested E2E. There is a lot of overhead in maintaining integration tests; on the other hand, with a good automation framework, E2E tests can be very simple to add while the test coverage can be great. E2E testing taking a long time is not an excuse not to do it, as it can be sped up to a matter of hours instead of days, or even minutes in some cases. The right balance in the testing pyramid is very much needed; I guess the pyramid structure is ideal for small projects but not for very complex software.
One thing end-to-end tests can do is help your manual testers identify areas that need their eyes. It's very true that E2E can be flaky and slow. Rather than having those tests hold up CI or cause a fire to fight in dev land, use them as a supplemental testing tool: data for your testers to test better, or areas to hit hard before release.
ReplyDeleteRemove the focus on the machine finding bugs and instead use it as a tool for the only users you have internally, your manual testers.
That said I do completely agree with the pyramid approach. Just some extra food for thought on how to deal with e2e test results.
Is the testing pyramid a good strategy for testing?
Every time I read about or discuss this thing, it seems to fuel more confusion, not less.
For example, why do the labels here differ from the original? Are unit, integration and e2e all the same classification?
Shouldn't we be thinking about testing in a pipeline instead?
Meanwhile, there are no performance tests for Google Docs, as scrolling performance is horribly slow (I use it on a MacBook Air 2013, 8GB of RAM, Intel Core i5). Same for the Google Play Games app, but worse (about 2fps when scrolling, as images are processed on the UI/rendering thread).
I agree with most of the arguments, but there is another point of view. If we treat E2E tests as pure functional tests, they give invaluable confidence in quality before pushing stuff to the QA environment. The QA team can have their own set of cases, but since you have already worn the hat of the QA guy, you are less likely to face bugs which cannot be caught in the 'partial' integration tests that you mentioned or in unit tests. Note that I am still for extensive unit test coverage, but not so much for integration tests. So basically, the hourglass shape is not as bad in some cases.
Things really change if the economics of tests change. What if E2E tests were as quick and simple to run as unit tests? Then the entire pyramid would flip! I call this changed pyramid the "testing trapezoid": https://glebbahmutov.com/blog/testing-trapezoid/
At Cypress.io (which I joined recently) we are working hard to make web browser tests fast, reliable, and repeatable. For us, it makes sense to write more E2E tests during development, because ultimately they reflect the user's behavior better.
You may have an incredibly fast and reliable web driver, but it won't make web pages load faster in the browser.
I know this blog post is a few years old, so I'm wondering if you have changed your stance on this at all?
From reading the post, it looks like there are (or were) other big problems that the post doesn't explicitly recognise:
- introducing broken code the day before release date
- blocked testing effort due to the bug being called a "failed test"
- there appears to be an acceptance of "flaky tests" being ok?
- the "automation triangle" has been mislabelled as a "testing triangle", but doesn't represent the full picture of testing (i.e. doesn't include investigative/exploratory testing at all). In fact, the whole post only talks about automation, which can only assert an explicit expectation. What about the rest of the testing activities that focus on exploration and investigation?
- The only types of risks being recognised here appear to be "integration" risks. No other types of risks that should be tested for are mentioned here.
I wonder if these problems have been picked up and resolved within google since this blog was written?
A more recent TotT from Sept 21, 2016 reiterates many of the same points but takes a softer stance than "just say no to e2e tests":
https://testing.googleblog.com/2016/09/testing-on-toilet-what-makes-good-end.html
Doesn't the whole end-to-end testing description fail to explain the full picture and downgrade the need for end-to-end testing?
I disagree here. What the article fails to describe is how all the different kinds of testing fit into one another like Lego blocks, not just a pyramid where you start with unit tests and work your way up to a smaller set of E2E tests.
One thing people need is a traceability matrix, showing first and foremost the functional and non-functional requirements. Each of those is then linked to all the unit tests, in a matrix per system/sub-system/sub-component.
Then for each of these there is integration testing and service-level testing.
Both unit tests and service tests can be automated. In most cases these are semi-automated, depending on the complexity of the content and the user data available.
Now end-to-end testing makes sure the integration tests are working.
This is the first step in true integration testing: not merely a service test, but a test that all the applications and components integrate correctly.
Only after this does the full end-to-end testing commence.
So you really have far more than just three layers:
1. Unit Tests
2. Service Tests
3. Integration Tests
4. End to End Functional Testing
5. End to End Non-Functional Testing
6. Regression Tests
7. Performance Tests
8. Automated Tests
and so on.....
So I totally disagree with the article doing away with end-to-end testing or downplaying it. Also, the percentages are proportionally incorrect, simply because for each unit test one can have a like-for-like functional end-to-end test and/or a non-functional end-to-end test.
So E2E testing is definitely not a risk; on the contrary, it minimizes the risk of the implementation not meeting the requirements.
How do you make sure you are not duplicating effort when integration and unit testing are done?
You totally failed to mention TDD and design feedback.
ReplyDeleteAgreed. End to end tests or system functional tests or whatever you want to call them are of value when you develop features test first (in my opinion).
Hi,
I am tasked with creating an automated tool for Android system events. Can you advise me which automated testing tool I can use for testing/generating system events? Can Robotium, Appium, or Espresso be used? In my understanding, Robotium and Appium are useful for UI testing, but can we use them for system event testing?
Hi Mike,
Here is my interpretation of the test pyramid, covering all aspects of testing from a risk perspective:
https://amtoya.com/blogs/test-pyramid-as-a-risk-filter/
Regards,
Amit
I worked on an application that had more than 12 hours of end-to-end tests (we later managed to distribute the tests across different machines and reduce the time, but that is another story). I can only agree with the author.
Even though it was a monolith application (which made it easier to get up and running for testing), it was a nightmare to maintain the tests. Most of the time we were maintaining the tests instead of catching bugs. Discovering the origin of a bug in an end-to-end test takes a lot of time. We also dealt with a lot of "false negative" tests and had little time to understand and correct the problems: Java applet loading problems, expected elements not found on the page (plus other problems with the automation's speed), maintaining query code used only for the in-memory database tests (because the original used database-specific code), etc.
In an ideal world I would agree with the pyramid of testing as proposed by Google a long time ago, but most companies do not see themselves as 'software' companies like Google. They should, but they don't. That brings you to the question: if you have limited time/budget, would you prefer unit tests or e2e tests? For the first you need developers, and in the end you still do not know if your application works; for the second you can have non-developers maintain them, and you actually know that your main features work. So it's all about taking a risk on how long things will need to be maintained. E2E tests are product insurance; unit tests are maintenance insurance. Short-term vs. long-term vision.
Ok, so how exactly does creating unit tests benefit a project in the case of regression? If the SOLID principles are met, then unit tests won't show regression. On the other hand, integration and E2E tests would. I see TDD as a tool to design software, not to test it. The tests force a developer to apply good practices, but if a piece of software is complete and follows good practices, the tests will never fail, because if we need to change some feature, we would remove this piece (along with its tests) and write it from scratch to meet the new requirements (and of course provide unit tests for the new piece).
TDD is an option, not a requirement, for creating a piece of good software; unit tests written after the code are useless, so without TDD there's no need for unit tests (of course I'm still assuming that the software is designed well).
So if we don't do TDD, we won't meet this funny "pyramid" and we shouldn't write ANY tests? That's some serious bullshit...
How would unit tests catch a front-end UI workflow error?
There are many frameworks which can mock the service calls and give you the exact payload (it can be a simple JSON file) so you can test all your UI screens and controls. Even the UI has unit test and integration test frameworks available.
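A rough framework-agnostic illustration (invented names): the service call is replaced with a canned JSON payload, and the UI logic is tested against it.

```python
# Invented example: UI logic takes the service call as a dependency,
# so a test can substitute a canned JSON payload for the real backend.
import json

def render_account_header(fetch_account):
    data = fetch_account()
    return f"{data['name']} ({data['unread']} unread)"

# The payload could just as well be loaded from a simple .json fixture file.
CANNED_PAYLOAD = json.loads('{"name": "Ada", "unread": 3}')

def test_header_renders_from_canned_payload():
    assert render_account_header(lambda: CANNED_PAYLOAD) == "Ada (3 unread)"
```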
This is a great read. It is difficult to convince your QM of this, as they feel a real-user-like test (end to end) is the only way to define the quality of software, but actually, as the pyramid shows, more tests at the lower levels make the quality of the software better.
ReplyDeleteIf your E2E Tests aren't fast and reliable. You aren't doing it right.
ReplyDeleteIt's not rocket science. but if you need a better strategy and approach reach out. I'll give you some ideas.