A good post on the importance of maintainable tests!
This is an excellent post. It's too easy to use data the wrong way. Just had this discussion with someone yesterday, and I'm putting this on their desk.
I totally agree based on the example here. I wonder what you would propose as an alternative example. Where I work we keep 'models' in separate files from the application source that uses them, and they all exist in a single directory. We have a rule that all model files must pass tests x, y, z (I believe there are 12-15 of them). To accomplish this, there's a single test that iterates over each file in the directory and performs the same checks on each of them.
Note that this is a simplified explanation, but this does allow us to add new 'models' over time without having to add tests for each one.
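[Editor's note: to make the comparison concrete, here is a minimal sketch of that setup, assuming GoogleTest and C++17's <filesystem>; the "models" directory and the IsNonEmpty check are hypothetical stand-ins for the real files and the 12-15 real checks.]

```cpp
#include <filesystem>
#include <gtest/gtest.h>

namespace fs = std::filesystem;

// Hypothetical stand-in for one of the 12-15 real checks.
bool IsNonEmpty(const fs::path& p) { return fs::file_size(p) > 0; }

TEST(ModelFiles, EveryModelPassesAllChecks) {
  const fs::path kModelDir = "models";  // hypothetical location
  ASSERT_TRUE(fs::exists(kModelDir)) << "model directory not found";
  for (const auto& entry : fs::directory_iterator(kModelDir)) {
    // SCOPED_TRACE names the offending file in any failure below, which
    // keeps the one-loop-over-many-files approach debuggable.
    SCOPED_TRACE(entry.path().string());
    EXPECT_TRUE(IsNonEmpty(entry.path()));
    // ...checks x, y, z would follow here...
  }
}
```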
How can Google, which claims to have the best testing team, miss basic security vulnerabilities like this? Check out here
BUG: LOCALES[i] should be LOCALES[j]
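[Editor's note: for readers without the post's code in front of them, a guess at the kind of nested data-driven loop this comment refers to; LOCALES and the loop body are hypothetical, not the post's actual code.]

```cpp
#include <cstdio>

int main() {
  // Hypothetical stand-ins; the post's actual code is not reproduced here.
  const char* const LOCALES[] = {"en_US", "fr_FR", "ja_JP"};
  const int NUM_LOCALES = sizeof(LOCALES) / sizeof(LOCALES[0]);

  for (int i = 0; i < NUM_LOCALES; ++i) {
    for (int j = 0; j < NUM_LOCALES; ++j) {
      // The bug class being flagged: writing LOCALES[i] here instead of
      // LOCALES[j] compiles and "passes", but never exercises
      // cross-locale pairs.
      std::printf("checking %s -> %s\n", LOCALES[i], LOCALES[j]);
    }
  }
}
```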
So... you've picked a test where the test logic is a single assert, and shown that if the data becomes complex and you don't have any diagnostic information, you end up with an unmaintainable test. Or if you shove together two different types of tests (a series of positive tests and one negative one that gets repeated if you stuff it in the loop), you're doing useless work. OK, fair enough. But you go from there to the conclusion that data-driven testing is bad, and I don't buy it.
What about a situation where your test logic is several lines? How large does it have to be before the cutting and pasting is a more likely source of bugs than the data structure? Why not be sensible about defining the data and not mix positive and negative tests? Why not be sensible about error reporting and include an informative message with each line of data (as in the sketch below)?
Yes, unit tests in particular tend towards minimalist logic, and in such cases pulling out the data doesn't buy you much, and can obscure things as you add cruft to compensate for the code's generality. But just because you can construct a test case that is a poor fit for the data-driven methodology does not prove that that methodology is universally unsound.
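[Editor's note: a minimal sketch of both suggestions in GoogleTest; Add, Case, and the rows are hypothetical. The table holds positive cases only, and every row carries a description so a failure identifies its data.]

```cpp
#include <string>
#include <vector>
#include <gtest/gtest.h>

// Hypothetical function under test.
int Add(int a, int b) { return a + b; }

// Each data row carries its own human-readable description; negative
// cases would live in a separate table and test, not mixed in here.
struct Case {
  std::string description;
  int a;
  int b;
  int expected;
};

TEST(AddTest, TableOfPositiveCases) {
  const std::vector<Case> cases = {
      {"zero plus zero", 0, 0, 0},
      {"two small positives", 2, 3, 5},
      {"mixed signs", -2, 3, 1},
  };
  for (const auto& c : cases) {
    // The streamed description pinpoints the failing row in the output.
    EXPECT_EQ(Add(c.a, c.b), c.expected) << c.description;
  }
}
```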
Maybe this is true with GTest, where rolling your own data-driven testing structures is necessary. Other testing frameworks build support for data-driven tests into the testing API in ways that are useful and easy to debug. I've had a lot of success writing, maintaining, and catching/fixing bugs using Qt's data-driven testing framework:
http://qt-project.org/doc/qt-4.8/qtestlib-tutorial2.html
When a test fails, it prints a human-readable description of the failing test case that is easy to correlate to the inputs, along with the expected and actual values that triggered the failure. Their implementation is easy to write, maintain, extend, and use. Don't knock the concept, improve the implementation ;-)
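[Editor's note: a condensed sketch in the style of the linked tutorial's data-driven QString::toUpper example; the class and row names follow the tutorial's pattern rather than any production code.]

```cpp
#include <QtTest/QtTest>

class TestQString : public QObject {
  Q_OBJECT
 private slots:
  // The _data() slot defines the table; each named row is one test case.
  void toUpper_data() {
    QTest::addColumn<QString>("string");
    QTest::addColumn<QString>("result");
    QTest::newRow("all lower") << "hello" << "HELLO";
    QTest::newRow("mixed case") << "Hello" << "HELLO";
    QTest::newRow("all upper") << "HELLO" << "HELLO";
  }
  // The test slot runs once per row; QFETCH pulls in that row's columns,
  // and a failing QCOMPARE reports the row name plus expected and actual
  // values, which is the human-readable output described above.
  void toUpper() {
    QFETCH(QString, string);
    QFETCH(QString, result);
    QCOMPARE(string.toUpper(), result);
  }
};

QTEST_MAIN(TestQString)
#include "testqstring.moc"
```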