Google Testing Blog: Writing Testable Code

Writing Testable Code

Wednesday, August 06, 2008

Miško Hevery

Mixing object graph construction with application logic: In a test the thing you want to do is to instantiate a portion (ideally just the class under test) of your application and apply some stimulus to the class and assert that the expected behavior was observed. In order to instantiate the class in isolation we have to make sure that the class itself does not instantiate other objects (and those objects do not instantiate more objects and so on). Most developers freely mix the "new" operator with the application logic. In order to have a testable code-base your application should have two kinds of classes. The factories, these are full of the "new" operators and are responsible for building the object graph of your application, but don't do anything. And the application logic classes which are devoid of the "new" operator and are responsible for doing work. In test we want to test the application logic. And because the application logic is devoid of the "new" operator, we can easily construct an object graph useful for testing where we can strategically replace the real classes for test doubles. (see: How to Think About the “new” Operator with Respect to Unit Testing)

Ask for things, Don't look for things (aka Dependency Injection / Law of Demeter): OK, you got rid of your new operators in your application code. But how do I get a hold of the dependencies. Simple: Just ask for all of the collaborators you need in your constructor. If you are a House class then in your constructor you will ask for the Kitchen, LivingRoom, and BedRoom, you will not call the "new" operator on those classes (see 1). Only ask for things you directly need, If you are a CarEngine, don't ask for FuelTank, only ask for Fuel. Don't pass in a context/registry/service-locator. So if you are a LoginPage, don't ask for UserContext, instead ask for the User and the Athenticator. Finally don't mix the responsibility of work with configuration, If you are an Authenticator class don't pass in a path of the configuration information which you read inside the constructor to configure yourself, just ask for the configuration object and let some other class worry about reading the object from the disk. In your tests you will not want to write a configuration into a disk just so that your object can read it in again. (see: Breaking the Law of Demeter is Like Looking for a Needle in the Haystack )

Doing work in constructor: A class under tests can have tens of tests. Each test instantiates a slightly different object graph and than applies some stimulus and asserts a response. As you can see the most common operation you will do in tests is instantiation of object graphs, so make it easy on yourself and make the constructors do no work (other than assigning all of the dependencies into the fields). Any work you do in a constructor, you will have to successfully navigate through on every instantiation (read every test). This may be benign, or it may be something really complex like reading configuration information from the disk. But it is not just a direct test for the class which will have to pay this price, it will also be any related test which tries to instantiate your class indirectly as part of some larger object graph which the test is trying to create.

Global State: Global state is bad from theoretical, maintainability, and understandability points of view, but is tolerable at run-time as long as you have one instance of your application. However, each test is a small instantiation of your application in contrast to one instance of application in production. The global state persists from one test to the next and creates mass confusion. Tests run in isolation but not together. Worse yet, tests fail together but problems can not be reproduced in isolation. Order of the tests matters. The APIs are not clear about the order of initialization and object instantiation, and so on. I hope that by now most developers agree that global state should be treated like GOTO.

Singletons (global state in sheep's clothing): It amazes me that many developers will agree that global state is bad yet their code is full of singletons. (Singletons which enforce their own singletoness through private constructor and a global instance variable) The core of the issue is that the global instance variables have a transitive property! All of the internal objects of the singleton are global as well (and the internals of those objects are global as well... recursively). Singletons are by far the most subtle and insidious thing in unit-testing. I will post more blogs on this topic later as I am sure it will create comments from both sides.

Static methods: (or living in a procedural world): The key to testing is the presence of seams (places where you can divert the normal execution flow). Seams are essentially polymorphism (Polymorphism: at compile-time the method you are calling can not be determined). Seams are needed so that you can isolate the unit of test. If you build an application with nothing but static methods you have procedural application. Procedural code has no seams, at compile-time it is clear which method calls which other method. I don't know how to test application without seams. How much a static method will hurt from a testing point of view depends on where it is in you application call graph. A leaf method such as Math.abs() is not a problem since the execution call graph ends there. But if you pick a method in a core of your application logic than everything behind the method becomes hard to test, since there is no way to insert test doubles (and there are no seams). Additionally it is really easy for a leaf method to stop being a leaf and than a method which was OK as static no longer is. I don't know how to unit-test the main method!

Favor composition over inheritance: At run-time you can not chose a different inheritance, but you can chose a different composition, this is important for tests as we want to test thing in isolation. Many developers use inheritance as code reuse which is wrong. Whether or not inheritance is appropriate depends on whether polymorphism is going on. Inheriting from AuthenticatedServlet will make your sub-class very hard to test since every test will have to mock out the authentication. This will clutter the focus of the test, with the things we have to do to successfully navigate the super class. But what if AuthenticatedServlet inherits from DbTransactionServlet? (that gets so much harder)

Favor polymorphism over conditionals: If you see a switch statement you should think polymorphisms. If you see the same if condition repeated in many places in your class you should again think polymorphism. Polymorphism will break your complex class into several smaller simpler classes which clearly define which pieces of the code are related and execute together. This helps testing since a simpler/smaller class is easier to test.

Mixing Service Objects with Value Objects: There should be two kinds of objects in your application. (1) Value-objects, these tend to have lots of getters / setters and are very easy to construct are never mocked, and probably don't need an interface. (Example: LinkedList, Map, User, EmailAddress, Email, CreditCard, etc...). (2) Service-objects which do the interesting work, their constructors ask for lots of other objects for colaboration, are good candidates for mocking, tend to have an interface and tend to have multiple implementations (Example: MailServer, CreditCardProcessor, UserAthenticator, AddressValidator). A value-object should never take a service object in its constructor (since than it is not easy to construct). Value-objects are the leafs of your application graph and tend to be created freely with the "new" operator directly in line with your business logic (exception to point 1 since they are leafs). Service-objects are harder to construct and as a result are never constructed with a new operator in-line, (instead use factory / DI-framework) for the object graph construction. Service-objects don't take value-objects in their constructors since DI-frameworks tend to be unaware about the how to create a value-object. From a testing point of view we like value-objects since we can just create them on the fly and assert on their state. Service-objects are harder to test since their state is not clear and they are all about collaboration and as a result we are forced to use mocking, something which we want to minimize. Mixing the two creates a hybrid which has no advantages of value-objects and all the baggage of service-object.

Mixing of Concerns: If summing up what the class does includes the word "and", or the class would be challenging for new team members to read and quickly "get it", or the class has fields that are only used in some methods, or the class has static methods that only operate on parameters than you have a class which mixes concerns. These classes are hard to test since there are multiple objects hiding inside of them and as a result you are testing all of the objects at once.

32 comments :

Jonathan PryorAugust 6, 2008 at 10:16:00 AM PDT
I don't know how to unit-test the main method!

Obviously you have no imagination. You test the same method the same way you test static methods: provide known input, and test that the output is what you expect given the input.

Which is exactly the same way you'd test Math.abs() and numerous other static methods. (Which is exactly the way I test C# extension methods, e.g. http://anonsvn.mono-project.com/source/branches/rocks-playground/Tests/Mono.Rocks.Tests/IEnumerableTest.cs)

To make this "easy", you should attempt to mandate that static methods operate on no mutable state -- following the idioms of functional programming -- or provide some mechanism to reset the global state that the method operates on to a known value.

Remember: global state is also an input to the static method if the method uses the global state.

You can, of course, take this to a "higher level": console input and output is also global state, so if your main method reads from or writes to the terminal, you can have a shell-based unit test which invokes your program (with some set of arguments) and compares the programs output to what was actually generated.

This isn't difficult. Tedious, sure, but not difficult, and extremely useful to do when first starting to write unit tests for a pre-existing project (which may not have used any OO design methodology to begin with).

Just remember the first rule of testing any unit of code (method, library, program, operating system...): check the code's output against known input and compare against known-good output. The rest are details.
ReplyDelete
Replies
UnknownAugust 7, 2008 at 3:05:00 AM PDT
"[If your] class has static methods that only operate on parameters [then] you have a class which mixes concerns"

I have a class named Importer which has a static method:
Customer ReadCustomer(IDataRecord record)

This method only operates on its parameter. It creates a new Customer object from the IDataRecord.

Could you please describe how this class 'mixes concerns'? I'm not entirely sure how it is doing so.
ReplyDelete
Replies
John WebberAugust 7, 2008 at 5:17:00 AM PDT
Miško, in the paragraph about static methods you refer to "seems" - do you mean "seams?" Otherwise I'm not sure what you mean...
ReplyDelete
Replies
Michael RogerAugust 7, 2008 at 8:23:00 AM PDT
Sponge, I believe the point is that any pure method (that relies only on its parameters) can be separated into a pure utility class, leaving less code that depends on the class members.

"Separation of concerns" didn't put a fine point on it, but the goal is to eliminate uneccessary dependencies, much like the Fuel/FuelTank example earlier.
ReplyDelete
Replies
UnknownAugust 7, 2008 at 8:32:00 AM PDT
If you get past a number of typos and the non-working links, it looks like a hastily written, non-proofed copy.
ReplyDelete
Replies
UnknownAugust 7, 2008 at 8:43:00 AM PDT
Excellent article! I love seeing these types of topics from this blog.

Keep them coming.
ReplyDelete
Replies
Max CantorAugust 7, 2008 at 12:14:00 PM PDT
You wrote:

If you see a switch statement you should think polymorphisms. If you see the same if condition repeated in many places in your class you should again think polymorphism.

Could you perhaps give a little code snippet that exemplifies this? That is, what would a switch statement "look like" when turned into a polymorphic solution instead?
ReplyDelete
Replies
UnknownAugust 7, 2008 at 1:07:00 PM PDT
Max "Could you perhaps give a little code snippet that exemplifies this?"

Assume you have a class named "Message" with a method named "sendMessage()" that contains this fragment :

if (this.messageType = messageTypes.Fax) {
...
}

then you should probably have a FaxMessage class that inherits from Message and overrides sendMessage().
ReplyDelete
Replies
Imran FanaswalaAugust 7, 2008 at 1:38:00 PM PDT
@Max,

Let's say you are developing a chat service that combines various chat protocols, and you want to implement logout() functionality.

Switch version:
http://pastebin.com/f23bf0c6e

Polymorphic version:
http://pastebin.com/fae7a445

Admittedly, the above example is not the most apt but it is based of a real world refactoring I did a while ago. If you want a more typical example, try http://hanuska.blogspot.com/2006/08/swich-statement-code-smell-and.html

Regards,
Imran Fanaswala
ReplyDelete
Replies
GrobyAugust 7, 2008 at 1:45:00 PM PDT
Hm. You have me stumped with the "ask, don't look for" advice.

Who exactly would I be asking for the objects I need if I *don't* pass in a ServiceLocator? A global service locator instead?
ReplyDelete
Replies
heliosAugust 7, 2008 at 1:50:00 PM PDT
More about polymorphism:

maybe examples on how to use polymorphism can be found in the strategy pattern (Gamma 1994). It happens a lot of time (the most when your building some kind of tool/framework) you need to do a generic work (loading database field values into an object, by example), but this generic thing has it's specialties (converting types, converting special values, getting the value, etc). That's when you delegate that special thing (converter.getValueFromBD(...)) and use polymorphism to implement it in various ways (IntegerConverter, ClobToStringConverter, StringConverter, etc, etc, etc).

By the way: strategy is a pattern like template-method except that it preffers composition [using a strategy] than inheritance [overriding the template method]. It illustrate the point about composition vs. inheritance.
ReplyDelete
Replies
heliosAugust 7, 2008 at 2:01:00 PM PDT
groby:

asking:

There are two ways. None of them use a ServiceLocator.

1) receiving what you need directly in the constructor/setter. I mean: if you need something let your client code give it to you.

new Car(FuelTank)

2) receiving a Provider (NOT A SERVICELOCATOR).
It's a subttle diferrence:

A ServiceLocator is bad because it's a general/global entry-point. It's application-aware, not component-aware. So your code becomes less independent.

A Provider (as I propose it) is an interface defined by the very class you are writing and who implements it ONLY does that.

example:

new Engine(FuelProvider)

as I cannot give fuel to the engine directly on the constructor (it's going to ask for it when it runs) I have to give it when the class wants. So that class (Engine) defines the interface FuelProvider:

examples.car.engine.Engine
examples.car.engine.FuelProvider

and somebody implement it how he/she wants:

examples.car.NormalFuelProvider implements
examples.car.engine.FuelProvider
{
// uses FuelTank
}

NormalFuelProvider is more application-dependant that FuelProvider interface. It's fine because it's created for wiring things in an application (assume car is the application). At least, in the example is something that's made for creating a car with an Eingne and a FuelTank.

Getting back to the ServiceLocator what I mean is you can implement a Provider that's aware of the ServiceLocator but your class Engine not.

If you want to test it you can give an AllTheFuelYouWantProvider that do mock-like stuff so you can concentrate on the Engine. You don't have to worry about any ServiceLocator (or FuelTank), don't have to worry about application-like code, and your tests are happier.
ReplyDelete
Replies
ihathAugust 7, 2008 at 4:27:00 PM PDT
interesting article
ReplyDelete
Replies
Max CantorAugust 8, 2008 at 12:11:00 PM PDT
@Imran and @Michel:

Thanks! This seems like something I've done for a while without knowing what it was called. I appreciate the explanation!
ReplyDelete
Replies
Dave KirbyAugust 8, 2008 at 12:23:00 PM PDT
This is an interesting article, but the list is rather confusing. Some of the titles are things that you SHOULD do, and some are things that you should NOT do. It is only when you read the body of the list that it becomes clear (and sometimes even then it is not always obvious). I suggest prepending each list item with DO or DONT to make it clear, or rewriting them to make them consistent.

To be specific, all of the points are things you should NOT do, except for 2, 7 and 8, which you SHOULD do.
ReplyDelete
Replies
UnknownAugust 26, 2008 at 4:38:00 PM PDT
If you have a spring project and you want to test a dao - how do you do that *easily*.

Everyone just says, "use dbunit".

But I don't see how that makes it easy.
ReplyDelete
Replies
Ran TavoryJune 28, 2009 at 6:23:00 AM PDT
About Singletons, I completely agree. I posted this article on my blog: http://prettyprint.me/2009/06/24/beware-of-the-singleton/
ReplyDelete
Replies
Edward HuynhSeptember 2, 2009 at 11:38:00 PM PDT
Excellent topic and content. Keep them coming
ReplyDelete
Replies
Geshan ManandharMarch 22, 2010 at 7:34:00 AM PDT
If I could wrap my head around it, I"ll try :).
ReplyDelete
Replies
jenniferMay 14, 2010 at 5:03:00 AM PDT
it's good to see this information in your post, i was looking the same but there was not any proper resource, thanx now i have the link which i was looking for my research.
ReplyDelete
Replies
Kalle KarlbergMay 21, 2011 at 3:05:00 AM PDT
Great article. I tried for one hour+ to make this point to a colleage y-day. I'll just send her this article.
ReplyDelete
Replies
MikeDecember 2, 2011 at 4:03:00 PM PST
Good article, most points I agree with. However, there is a strong Java/C# bias in your explanations (or pure OOP bias), and it might be good to stress that fact as some points don't apply as strongly to other languages. Also, this statement made be jump out of my chair:

"Polymorphism: at compile-time the method you are calling can not be determined"

You need to review your basic software engineering knowledge. The definition you give pertains to "dynamic binding", not polymorphism. Polymorphism is essentially the concept of substitutability of types. In other words, being able to substitute one object-type for another and be able to reuse the same method that uses either types is the key aspect of polymorphism (i.e. the object types are polymorphic types and the functions using them are polymorphic functions). This does NOT require dynamic binding, it is only if you need to make this substitution at run-time that dynamic binding is required. Haven't you heard of something called "static polymorphism" (i.e. polymorphism that is realized at compile-time). This also exemplifies my point about the strong OOP bias in your article. I write an enormous amount of procedural code (in C++), heavily relying on static polymorphism, and thus, has a lot of "seams" making the construction of unit-tests very easy (and generally sticking to state-less functional programming is also a huge benefit in that regard). Certainly, in OOP, good use of dynamic polymorphism will make unit-testing a lot easier, but this does not mean that procedural programming (that doesn't use dynamic polymorphism) is hard to unit-test, it's just hard to do so if you suck at writing procedural code.

So overall, I would say that point 6 should be taken out (or left with a strong caveat that this applies only to highly restrictive programming languages that don't allow you to achieve static polymorphism). Then, points 4 and 5 are the same (do you really think anyone who knows what a Singleton is doesn't know that it's forms a global state). Point 3 only applies to languages with reference-semantics with non-deterministic construction models (Java/C# and the like), in a language like C++, this point is mute (unless you are emulating Java/C# style OOP in C++, which is just a very bad style of C++). The remaining points are good IMO.

Just my grain of salt.
ReplyDelete
Replies
UnknownJuly 20, 2012 at 5:25:00 AM PDT
Do you test the code that creates the object graph? If so, how?
ReplyDelete
Replies
MisrajiDecember 29, 2012 at 1:42:00 PM PST
How does one deal with dynamically created objects (say) like objects representing email?
ReplyDelete
Replies
UnknownAugust 21, 2013 at 7:57:00 AM PDT
I guess "branches" would be the correct term as opposed to "seams".
ReplyDelete
Replies
Robert PeszekOctober 8, 2014 at 6:20:00 PM PDT
Continuing on the OOP bias point (Mike December 2, 2011).
FP is becoming more and more popular and OO languages allow for more and more of a FP/OOP mix.

If you think about this list in therms of FP, all points are N/A!

In this context, I would like to suggest a shorter list to produce a much more testable code:
1) limit side-effects
2) isolate side-effects

Also, in the context of FP, I very much disagree with point 6. The term 'method' is somewhat redeeming but...
A true FP language like Haskell will have a very 'static' feel to it. In OOP/FP hybrid if you want to code with functions instead of methods these functions will need to have static presence too. For example: if 'reduce' is a function that accept an accumulator and a list (instead of being a method owned by the list), it has to exist outside of the list object. I took the approach of using lots of static definitions when designing fpiglet (FP library for Groovy) and I believe it to be some of the most testable code I ever done within the boundaries of an OO language.
ReplyDelete
Replies
UnknownOctober 22, 2016 at 12:31:00 PM PDT
RDM vs. ADM
----------------------------------------------------------------

If you separate logic from data in an Object Oriented language, you actually fall back to procedural programming. (A.K.A, by Fowler, "ADM")

Many companies do that for the sake of testing.

However, if you still want to write real Object Oriented and benefit a rich domain model:

1. Separate your project to modules.
2. Each module will have a thin service layer made of stateless logic.
3. Use "new" to instantiate business objects.
4. Use service locator or IoC container to resolve services provided by other modules.

ReplyDelete
Replies
MPursleyOctober 9, 2018 at 5:19:00 PM PDT
Thanks for the article. Awesome tips!

While reading through it, I found some things that sound like typos or minor grammatical errors? I'll post them here, in case you have time/interest to fix them. :) I'll post them into the copy on your blog too.

Thanks!
Matt

___
- "OK, you got rid of your new operators in you application code." ->
"OK, you got rid of your new operators in your application code."

- "Global state is bad from theoretical, maintainability, and understandability point of view," ->
"Global state is bad from theoretical, maintainability, and understandability points of view,"

- "The core of the issue is that the global instance variables have transitive property!" ->
"The core of the issue is that the global instance variables have a transitive property!"

- "Seams are needed so that you can isolate the unit of test." ->
"Seams are needed so that you can isolate the unit test."

- "If you build an application with nothing but static methods you have procedural application." ->
"If you build an application with nothing but static methods you have a procedural application."

- "I don't know how to test application without seams." ->
"I don't know how to test an application without seams."

- "How much a static method will hurt from a testing point of view depends on where it is in you application call graph." ->
"How much a static method will hurt from a testing point of view depends on where it is in your application call graph."

- "At run-time you can not chose a different inheritance, but you can chose a different composition, this is important for tests as we want to test thing in isolation." ->
"At run-time you can not choose a different inheritance, but you can choose a different composition, this is important for tests as we want to test a thing in isolation."

- "This will clutter the focus of test" ->
"This will clutter the focus of the test"

- "This helps testing since simpler/smaller class is easier to test." ->
"This helps testing since a simpler/smaller class is easier to test."

- "If summing up what the class does includes the word "and", or class would be challenging for new team members to read and quickly "get it", or class has fields that are only used in some methods, or class has static methods that only operate on parameters than you have a class which mixes concerns." ->
"If summing up what the class does includes the word "and", or the class would be challenging for new team members to read and quickly "get it", or the class has fields that are only used in some methods, or class has static methods that only operate on parameters than you have a class which mixes concerns."

- "These classes are hard to tests since there are multiple objects hiding inside of them and as a resulting you are testing all of the objects at once." ->
"These classes are hard to tests since there are multiple objects hiding inside of them and as a result, you are testing all of the objects at once."
___

ReplyDelete
Replies
Google Testing BloggersOctober 31, 2018 at 7:17:00 AM PDT
Fixed! Thanks!
ReplyDelete
Replies