Interesting technique. I think the article will be improved if it also mentioned that the technique comes with a cost of adding an extra unintuitive parameter into your code base. All subsequent maintainers of the code will probably find it very confusing to see that the path_checker is a parameter which makes the codebase less maintainable.
Is that really any better? Now the function signature is polluted with arguments for everything that I might want to use this technique for -- which could be considerable.
Think pretty much anything in this area quickly turns into an ugly hack but modifying Foo's parameters is probably worse than others.
A transparent solution can be done via inspecting the stack and checking who the caller is (via python's inspect module) - if it's the function you want to test, using the dummy os.path.exists, otherwise use the real one. Would provide an example but whitespace formatting not supported the the comments here...
That said surely DoSomething and DoSomethingElse should be stubbed out as well? It's only Foo being tested ;)
I don't use python much, but I support the technique of accepting stubbing as a worthwhile evil and designing it into the code. It's not really "polluting" the interface" because it IS the interface.
What about extracting the os.path.exists into a helper that is injected at construction of the class containing Foo? This way your not polluting the parameter list but still providing a separation of concerns.
I think this issue -- at least the particular one you give -- is a sign of a smell further up in the stack. If you really are doing lots of file operations, you should really write files. The stubbing and mocking you'll have to do to avoid it just isn't worth it, and automatically wiping your scratch test files before the test runs is easy enough as well.
Once you are comfortable with the testing of the file-related operations, you shouldn't keep testing them. Either stub those out, or just keep using scratch areas for the files.
Similarly, if you rely on some other external service (the service in this case being the filesystem), putting in a trivial service is often better than futzing with replacing functions that are otherwise quite sufficient. A good example of a tool that allows this is wsgi_intercept, which lets you attach pretend HTTP apps to arbitrary host names. Of course, that's basically the original technique that you wanted to avoid; that someone else did it in a library kind of makes it okay. If you were mocking something that wasn't yet mocked, I'd recommend something more like your technique.
Putting in a trivial service implementation should just be a matter of configuration. Configuring an address to http://localhost, or configuring the base file path, or pointing to a fake smtp server. You have to do that anyway regardless of testing, and that's a good place for your mocking.
I hope this goes without saying, but in production code the filesystem should be considered untrusted and an assertion about its contents would never be valid. Instead all possible failures should be handled. (Unless you *are* the filesystem.) I am assuming this is just for the sake of example though.
I often use class attributes instead of polluting function signatures. There's just one little trick: if you want to assign a function to a class attribute without turning it into a method, you need to wrap it in staticmethod():
class Whatever(object): # hook for unit tests path_checker = staticmethod(os.path.exists)
By the way, could you please refrain from inflicting the PEP-8-violating 2-space-indentation internal Google coding style on the rest of the world? Thanks!
Hey, there's a nicer way of setting a mock object without having it passed in as a parameter.
At least, it works in C++ and Java - I'm not sure how OO works in Python because I know nothing about it.
In the class you want to test, say you initialise path_tester = os.file.exists; in a constructor. We replace this with a protected factory method call: path_tester = getPathTester();
Then we implement our protected factory method: protected getPathTester() { return os.file.exists; }
But when we need to mock it, we create an anonymous inner class which inherits from the original, overriding getPathTester() to return our mock object that says yes or no or whatever (and perhaps, records that it was called x times). I found this technique on IBM's developerworks somewhere. Hopefully it makes sense in Python as well...
I avoid having to add code to my production implementation to support the mock objects and expand the mock object to return match paths and to use the underlying implementation as a fall-through case. As a bonus, you can chain multiple mock objects together to mock-out several paths.
It is pretty trivial to expand my example to take either a list of paths or a function to determine whether the provided path matches.
My example looks really ugly in the comments, but you can find it on pastebin.com here
It's interesting. Actually the same trick are among nearlly all dynamic langugages: JavaScript, ActionScript, Python, Ruby, etc. Just a mehtod/property overriding.
It's a python only technique. For those who don't use python, using optional parameters with default value is never a nice idea. For one parameter, it does a great job, but once you start using two unrelated optionnal value, it's start to be a mess, and optionnal parameters must be avoid.
With something like that, in C/C++/Java, you can call : Foo(path) Foo(path,my_foo_checker) Foo(path,my_foo_checker,my_bar_checker)
if you want to pass Foo(path,my_bar_checker), you can't with C/C++/Java, or you need to do : Foo(path,default_foo_ckecker,my_bar_checker)
Erk ? Why would I need to know what is the default foo_checker ? Why would I even need to know that there is a foo_checker argument for another obscur test I don't evn want to know about.
but you can also write : Foo(path,foo_checker=my_foo_checker) Foo(path,bar_checker=my_bar_checker) Foo(path,foo_checker=my_foo_checker,bar_checker=my_bar_checker)
And even this call works: Foo(path,bar_checker=my_bar_checker,foo_checker=my_foo_checker)
So for all those who don't know python and don't feel easy with optionnal parameters : You are right ! Optionnal parameters in C/C++/Java is not really something to abuse, it's not the same with python.
That said, I won't have done it that way. I still agree with william, it would still be confusing.
I would have done it that way :
First, I would have used an "IOObject" to do that kind of IO things, to be able to change IOComponent.
Then, I would have created that IO object once (per class or per instance, it depends), and finally, I would have changed that IOObject (with for exemple a child) for the test.
Note that it's more or less what marius is doing. If you still want to use your optionnal parameter, you can pass IOObject to the constructor.
To oisin : Yes, it would work the same with python. I still prefer Marius technique (the object is create as a static attribute, and only the test unit can change that attribute, rather than subclassing, because I'm not easy with testing a subclass of the class I really want to test. While I agree it's technically exactly the same)
Interesting technique. I think the article will be improved if it also mentioned that the technique comes with a cost of adding an extra unintuitive parameter into your code base. All subsequent maintainers of the code will probably find it very confusing to see that the path_checker is a parameter which makes the codebase less maintainable.
ReplyDeleteIs that really any better? Now the function signature is polluted with arguments for everything that I might want to use this technique for -- which could be considerable.
ReplyDeleteThink pretty much anything in this area quickly turns into an ugly hack but modifying Foo's parameters is probably worse than others.
ReplyDeleteA transparent solution can be done via inspecting the stack and checking who the caller is (via python's inspect module) - if it's the function you want to test, using the dummy os.path.exists, otherwise use the real one. Would provide an example but whitespace formatting not supported the the comments here...
That said surely DoSomething and DoSomethingElse should be stubbed out as well? It's only Foo being tested ;)
Another possibility is to make the stub check the argument, return a value for 'bar' and call the original function for anything else.
ReplyDeleteos.path.exists = lambda x: x == "bar" or old_exists(x)
Sorry if my code isn't much cop, I don't use Python much.
Cool!
ReplyDeleteI don't use python much, but I support the technique of accepting stubbing as a worthwhile evil and designing it into the code. It's not really "polluting" the interface" because it IS the interface.
Hey,
ReplyDeleteHow about posting an RSS feed for the PDFs? I could hook it up via a script to print them directly to my printer at the office :)
Cheers
Is Foo just a helper method for your unit test class? If so, I can see value in the new parameter but if not, it does seem like a hack.
ReplyDeleteFull disclosure: I'm pretty new to Python so maybe I'm missing something.
This comment has been removed by the author.
ReplyDeleteThis comment has been removed by the author.
ReplyDeleteWhat about extracting the os.path.exists into a helper that is injected at construction of the class containing Foo? This way your not polluting the parameter list but still providing a separation of concerns.
ReplyDeleteI think this issue -- at least the particular one you give -- is a sign of a smell further up in the stack. If you really are doing lots of file operations, you should really write files. The stubbing and mocking you'll have to do to avoid it just isn't worth it, and automatically wiping your scratch test files before the test runs is easy enough as well.
ReplyDeleteOnce you are comfortable with the testing of the file-related operations, you shouldn't keep testing them. Either stub those out, or just keep using scratch areas for the files.
Similarly, if you rely on some other external service (the service in this case being the filesystem), putting in a trivial service is often better than futzing with replacing functions that are otherwise quite sufficient. A good example of a tool that allows this is wsgi_intercept, which lets you attach pretend HTTP apps to arbitrary host names. Of course, that's basically the original technique that you wanted to avoid; that someone else did it in a library kind of makes it okay. If you were mocking something that wasn't yet mocked, I'd recommend something more like your technique.
Putting in a trivial service implementation should just be a matter of configuration. Configuring an address to http://localhost, or configuring the base file path, or pointing to a fake smtp server. You have to do that anyway regardless of testing, and that's a good place for your mocking.
I hope this goes without saying, but in production code the filesystem should be considered untrusted and an assertion about its contents would never be valid. Instead all possible failures should be handled. (Unless you *are* the filesystem.) I am assuming this is just for the sake of example though.
ReplyDeleteI often use class attributes instead of polluting function signatures. There's just one little trick: if you want to assign a function to a class attribute without turning it into a method, you need to wrap it in staticmethod():
ReplyDeleteclass Whatever(object):
# hook for unit tests
path_checker = staticmethod(os.path.exists)
def foo(self):
return self.path_checker(self.filename)
By the way, could you please refrain from inflicting the PEP-8-violating 2-space-indentation internal Google coding style on the rest of the world? Thanks!
Very cool and great idea. I'll print this and send to may dev people...
ReplyDeleteRegards!
Caution about the python lambda. Word on the street -- it is going away.
ReplyDeleteHey, there's a nicer way of setting a mock object without having it passed in as a parameter.
ReplyDeleteAt least, it works in C++ and Java - I'm not sure how OO works in Python because I know nothing about it.
In the class you want to test, say you initialise path_tester = os.file.exists; in a constructor. We replace this with a protected factory method call:
path_tester = getPathTester();
Then we implement our protected factory method:
protected getPathTester() {
return os.file.exists;
}
But when we need to mock it, we create an anonymous inner class which inherits from the original, overriding getPathTester() to return our mock object that says yes or no or whatever (and perhaps, records that it was called x times).
I found this technique on IBM's developerworks somewhere. Hopefully it makes sense in Python as well...
I use this technique a lot in my tests.
ReplyDeleteI avoid having to add code to my production implementation to support the mock objects and expand the mock object to return match paths and to use the underlying implementation as a fall-through case. As a bonus, you can chain multiple mock objects together to mock-out several paths.
It is pretty trivial to expand my example to take either a list of paths or a function to determine whether the provided path matches.
My example looks really ugly in the comments, but you can find it on pastebin.com here
That's interesting. Actually, every dynamic language(JavaScript, ActionScript, Ruby, python...) can do that, something like method overriding.
ReplyDeleteIt's interesting. Actually the same trick are among nearlly all dynamic langugages: JavaScript, ActionScript, Python, Ruby, etc. Just a mehtod/property overriding.
ReplyDeleteIt's a python only technique. For those who don't use python, using optional parameters with default value is never a nice idea. For one parameter, it does a great job, but once you start using two unrelated optionnal value, it's start to be a mess, and optionnal parameters must be avoid.
ReplyDeletedef Foo(path, foo_checker=default_foo_checker, bar_checker=default_bar_checker, ):
# ...
pass
With something like that, in C/C++/Java, you can call :
Foo(path)
Foo(path,my_foo_checker)
Foo(path,my_foo_checker,my_bar_checker)
if you want to pass Foo(path,my_bar_checker), you can't with C/C++/Java, or you need to do :
Foo(path,default_foo_ckecker,my_bar_checker)
Erk ? Why would I need to know what is the default foo_checker ? Why would I even need to know that there is a foo_checker argument for another obscur test I don't evn want to know about.
With pythonic code, you can still write :
Foo(path)
Foo(path,my_foo_checker)
Foo(path,my_foo_checker,my_bar_checker)
but you can also write :
Foo(path,foo_checker=my_foo_checker)
Foo(path,bar_checker=my_bar_checker)
Foo(path,foo_checker=my_foo_checker,bar_checker=my_bar_checker)
And even this call works:
Foo(path,bar_checker=my_bar_checker,foo_checker=my_foo_checker)
So for all those who don't know python and don't feel easy with optionnal parameters : You are right ! Optionnal parameters in C/C++/Java is not really something to abuse, it's not the same with python.
That said, I won't have done it that way. I still agree with william, it would still be confusing.
I would have done it that way :
First, I would have used an "IOObject" to do that kind of IO things, to be able to change IOComponent.
Then, I would have created that IO object once (per class or per instance, it depends), and finally, I would have changed that IOObject (with for exemple a child) for the test.
Note that it's more or less what marius is doing. If you still want to use your optionnal parameter, you can pass IOObject to the constructor.
To oisin : Yes, it would work the same with python. I still prefer Marius technique (the object is create as a static attribute, and only the test unit can change that attribute, rather than subclassing, because I'm not easy with testing a subclass of the class I really want to test. While I agree it's technically exactly the same)
Why isn't dosomething stubbed out?
ReplyDeletePlease link pre-req and sequel episodes/articles, so that beginners like me can first get to know the terms and concepts discussed in suh episodes.
ReplyDelete