[personal profile] jducoeur
By and large, I'm fonder of user-level scenario tests than I am of unit tests -- I think they give you far more bang for the effort buck.

That said, unit tests are often helpful and sometimes crucial. A really good example is the explanation for why 30GB Microsoft Zunes all failed on Wednesday. Summary: it was a very dumb code bug that turned into an infinite loop on the last day of any leap year, and just the sort of thing that a well-written unit test suite might well have caught.
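For anyone who hasn't followed the link, here is a rough Python paraphrase of the shape of that loop. It is a sketch, not the firmware source: the function name, the 1-based day count, and the `max_steps` guard are all assumptions added here so the hang shows up as an exception instead of a frozen interpreter.

```python
from typing import Tuple

ORIGIN_YEAR = 1980  # assumption: day 1 is 1 January 1980


def is_leap_year(year: int) -> bool:
    """Gregorian leap-year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def year_from_days_buggy(days: int, max_steps: int = 1000) -> Tuple[int, int]:
    """Convert a 1-based day count since the origin into (year, day_of_year).

    Mirrors the shape of the faulty loop. max_steps is a hypothetical guard,
    not part of the original logic.
    """
    year = ORIGIN_YEAR
    steps = 0
    while days > 365:
        steps += 1
        if steps > max_steps:
            raise RuntimeError(f"stuck in {year} with days={days}")
        if is_leap_year(year):
            if days > 366:
                days -= 366
                year += 1
            # days == 366: nothing changes, so the loop never exits
        else:
            days -= 365
            year += 1
    return year, days
```

Day 365 and day 367 both come back fine; only day 366 of a leap year (for instance day 10593, which is 31 December 2008 under this numbering) gets stuck.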

Closely related to this: Microsoft is pushing automated testing tools rather hard right now, especially new tools that will automatically write many of the easy and obvious tests. I'd be curious to know whether the Zune was exposed to these tools, and whether they would have caught the bug if so. It's the sort of thing that a reasonably disciplined QE programmer might well consider fairly obvious (test the first, last and somewhere-in-the-middle days for a variety of years), but I could easily see an automated system not getting. It would be useful to know whether the automated tools are good enough to catch this kind of thing...
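To make the "first, last and somewhere-in-the-middle" idea concrete, here is a minimal sketch of what I have in mind. Everything in it is hypothetical: `boundary_cases`, `failures`, and the `year_and_day` function under test are illustrative names, the epoch of 1 January 1980 is an assumption, and Python's `datetime` stands in as the trusted oracle.

```python
from datetime import date
from typing import Callable, Iterable, List, Tuple

ORIGIN = date(1980, 1, 1)  # assumed epoch; day 1 is 1 January 1980


def boundary_cases(years: Iterable[int]) -> List[Tuple[int, Tuple[int, int]]]:
    """First, middle, and last day of each year as (day_count, (year, day_of_year))."""
    cases = []
    for year in years:
        for d in (date(year, 1, 1), date(year, 7, 1), date(year, 12, 31)):
            day_count = (d - ORIGIN).days + 1
            cases.append((day_count, (year, d.timetuple().tm_yday)))
    return cases


def failures(convert: Callable[[int], Tuple[int, int]], cases) -> list:
    """Return every (day_count, expected, actual) where convert disagrees."""
    return [(n, want, convert(n)) for n, want in cases if convert(n) != want]


def year_and_day(days: int) -> Tuple[int, int]:
    """Hypothetical function under test: a loop that always makes progress."""
    year = 1980
    while True:
        leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
        length = 366 if leap else 365
        if days <= length:
            return year, days
        days -= length
        year += 1
```

Calling `failures(year_and_day, boundary_cases(range(1980, 2101)))` checks three days in each of 121 years, and the last-day case for every leap year is exactly the input that matters here. One caveat: a function that hangs rather than returning a wrong answer needs a timeout in the test runner, or the suite just sits there.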

(no subject)

Date: 2009-01-02 07:48 pm (UTC)
From: [identity profile] metahacker.livejournal.com
Testing's nice. But that is dumb code (a loop instead of division??); and even if writing like that were tolerated, a code review should catch that. And why are they reinventing date handling, anyway?

It's like "reduce, reuse....recycle." 'Test' shouldn't be the first resort; it should be the last.

(no subject)

Date: 2009-01-03 01:46 am (UTC)
From: [identity profile] learnedax.livejournal.com
Interesting to me that no one mentioned just using a formula, which seems vastly simpler.

For that matter, the loop's poorly written, even ignoring the bug. Everything this code suggests about the project as a whole is why Microsoft continues to keep such a bad reputation.

And, while I wholly agree that testing is a weaker protection in many places than analysis, this is actually one of the few places where even a dumb automated system could have easily run the function with practically the entire range of inputs, given the constraints on date values.
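As a sketch of how blunt such a tool could be: assuming a 1-based day count starting at 1 January 1980 and a device lifetime that ends in 2099 (both assumptions on my part), the entire input space is under 44,000 values. The names below are hypothetical, `datetime` serves as the oracle, and `naive_convert` is a deliberately wrong stand-in, not the Zune code.

```python
from datetime import date, timedelta
from typing import Callable, Optional, Tuple

ORIGIN = date(1980, 1, 1)  # assumed epoch; day 1 is 1 January 1980


def first_disagreement(
    convert: Callable[[int], Tuple[int, int]],
    last_day: date = date(2099, 12, 31),
) -> Optional[int]:
    """Try every day count up to last_day; return the first one convert gets wrong."""
    total = (last_day - ORIGIN).days + 1
    for n in range(1, total + 1):
        d = ORIGIN + timedelta(days=n - 1)
        if convert(n) != (d.year, d.timetuple().tm_yday):
            return n
    return None


def naive_convert(days: int) -> Tuple[int, int]:
    """Deliberately wrong stand-in: pretends every year has 365 days."""
    years, rest = divmod(days - 1, 365)
    return 1980 + years, rest + 1
```

`first_disagreement(naive_convert)` returns 366, the last day of the first leap year, with no human having thought about leap years at all. A function that loops forever on some input would need a per-call timeout for the sweep to report it rather than hang.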

(no subject)

Date: 2009-01-03 02:20 am (UTC)
From: [identity profile] metahacker.livejournal.com
"No one" meaning no one in the comment thread at the above forum? (...'cause that's what I'm suggesting above.)

Now, the formula is a bit complicated, so it would be trickier to write. But the linked-to code could easily have been on the Daily WTF. A loop with nested exit conditions, plenty o' constants, an arbitrary convention (1980??) to hack around a solved problem? Ya.
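For what it's worth, here is roughly what I mean by a formula, as a sketch under stated assumptions: a 1-based day count from 1 January 1980, and dates no later than 2099, so the century exception in the leap-year rule never comes up. The function name is made up for illustration.

```python
from typing import Tuple

DAYS_PER_CYCLE = 4 * 365 + 1  # 1461 days: one leap year plus three common years
LAST_SUPPORTED_DAY = 43830    # 31 December 2099 under the assumed numbering


def year_and_day_closed_form(days: int) -> Tuple[int, int]:
    """1-based day count since 1 January 1980 -> (year, day_of_year).

    Valid for 1980 through 2099 only: 2100 is not a leap year, which
    breaks the simple four-year cycle used here.
    """
    if not 1 <= days <= LAST_SUPPORTED_DAY:
        raise ValueError("day count outside 1980-2099")
    cycle, rem = divmod(days - 1, DAYS_PER_CYCLE)
    if rem < 366:
        # Still inside the leap year that starts each cycle.
        year_in_cycle, day_index = 0, rem
    else:
        # One of the three 365-day years that follow it.
        year_in_cycle, day_index = divmod(rem - 366, 365)
        year_in_cycle += 1
    return 1980 + 4 * cycle + year_in_cycle, day_index + 1
```

No loop, so nothing to get stuck in: day 366 comes out as (1980, 366) and day 10593 as (2008, 366). The cost is the explicit range check and the special case for the leap year at the head of each cycle, which is the "bit complicated" part.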

On the subject of testing, I'm not sure tests *would* be written to correctly test this code, given a presumption of the coding culture that created it. Tests *could* be written that would do so, but I keep seeing unit tests that interrogate a few points for correct behavior, and then pack up and go home. This is actually why I'm not a huge fan of blindly unit testing, or mandating it -- it's more code that someone has to write and maintain, and it's oh-so-easy to take shortcuts in your tests if you're a grunt programmer. Meanwhile, because the tests pass, people get a false sense of security about the code's function. For all we know this snippet was tested a thousand times, with inputs that weren't the 1-in-1461 'chance' that would cause problems.

Anyway. [/rant] on 'testing as panacea', something I keep hearing (not really from you two), and would like to lodge a complaint against the promulgation of!

(no subject)

Date: 2009-01-03 02:23 am (UTC)
From: [identity profile] metahacker.livejournal.com
Alright, that's amusing. Top story on WTF was, in fact, this same sort of bug...!

(no subject)

Date: 2009-01-03 03:18 am (UTC)
From: [identity profile] learnedax.livejournal.com
Yes, I meant "I'm surprised nobody over there suggested that thing you said, which I also thought."

My argument was actually that automated test generation could have, in principle, produced a test that would have caught this based solely on inspection of the function's inputs. It would be a dumb blunt tool, but it would still have worked here even without specific human inspection of the function.

Proving your algorithms is an interesting alternative, though I have never done this in a mathematical sense. At some point proofs fall down because the person writing the proof has the same brain as the person writing the code, and so is likely to commit the same mistakes. You can show your proofs to someone else, but they don't understand what you're doing as well as you, as a rule, so probably won't catch subtle problems.

Tests, on the other hand, fall down in three ways: first, they are always incomplete in practice,* because testing every potentially relevant input variation and exception case is a huge body of work, and when you add in restructuring your code and building a harness such that you can simulate arbitrary behaviors from external dependencies, it becomes unthinkable - simulating system exceptions being a classically painful case. I think this is the most commonly overlooked flaw; the goal is impractical because exhaustive testing is at its base a brute force solution.

Second, not everything can ever be rigorously tested. I always point to concurrency here, but I think a fine case can be made for the impossibility of duplicating many runtime conditions as well.

Third, tests are also code written by humans. If they're written by the same person who wrote the code, they're prone to the same mistakes; if they're written by an independent tester, they're prone to mistakes of insufficient understanding of the implementation's sticky points. And whoever writes them, as they grow in complexity with the code they're prone to just as many corner cases as the code they're trying to test.

So, I think you need both analysis and testing. (You can decide the handler for SocketException is alright without simulating it, and you can test the behavior of the search algorithm without proving it.) I think shared ownership of code helps ameliorate some of the problems of both methods. But most of all I think we should give up the pretense that there is any absolute protection if we simply do not code right.

*(I always chuckle when I see a test suite commented with something like "TODO: beef these up".)
