Thursday, May 02, 2024

Retrying failing tests?

Doing the same thing repeatedly (and expecting--or hoping for--different results) is apparently a definition of madness.

That might be why trying to debug tests that sometimes fail can feel like it drives you mad.

It's a common problem: how do you know how much logging to have by default, particularly when running automated tests or as part of a CI process?

Here's my answer:

Run with minimal logging and automatically retry with maximum logging verbosity if something fails.
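As a rough illustration of that idea, here's a minimal Python sketch. It assumes your test (or build) tool is a command-line program that takes different arguments for quiet and verbose output. The function name `run_with_retry` and its parameters are made up for this example and don't come from any particular tool.

```python
import subprocess
from typing import Callable, Sequence


def run_with_retry(
    base_cmd: Sequence[str],
    quiet_args: Sequence[str],
    verbose_args: Sequence[str],
    run: Callable = subprocess.run,
) -> dict:
    """Run a command quietly; if it fails, rerun it once with verbose logging.

    All names here are hypothetical. `run` is injectable so the logic can be
    exercised without launching a real process.
    """
    # First attempt: minimal logging.
    first = run([*base_cmd, *quiet_args])
    if first.returncode == 0:
        return {"passed": True, "attempts": 1, "passed_on_retry": False}

    # Second attempt: same command, maximum logging verbosity.
    second = run([*base_cmd, *verbose_args])
    passed = second.returncode == 0
    return {"passed": passed, "attempts": 2, "passed_on_retry": passed}
```

With pytest, for example, you might call `run_with_retry(["pytest"], ["-q"], ["-vv"])`, though which arguments actually control logging will depend on your tooling. The `passed_on_retry` value is kept separate from `passed` so that a pass on the second attempt doesn't get treated the same as a clean pass.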

It's not always easy to configure, but it has many benefits.

Not only does this help with transient issues, but it also helps provide more details to identify the cause of issues that aren't transient.

An issue is probably transient if it fails on the first run but passes on the second (with extra logging enabled). This approach also helps identify issues that occur when no logging is configured. It can happen; ask me how I know. ;)

Be careful not to hide recurring transient errors. If errors can occur intermittently during testing and the reason is not known, what's to stop them from happening intermittently for end users, too?

Record that a test only passed on the second attempt, and raise a follow-up task to investigate why. Ideally, you want no transient errors or things that don't work when no logging is configured.
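One way to keep those second-attempt passes visible is to classify each result and generate a follow-up note for anything that needed a retry. This is only a sketch in Python: the `Outcome` type, the `classify` and `follow_ups` functions, and the label strings are all invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Outcome:
    """Result of one test. All field names are hypothetical."""
    name: str
    passed_first: bool
    passed_retry: Optional[bool] = None  # None when no retry was needed


def classify(outcome: Outcome) -> str:
    """Label a result so second-attempt passes aren't counted as clean passes."""
    if outcome.passed_first:
        return "passed"
    if outcome.passed_retry:
        return "passed-on-retry"
    return "failed"


def follow_ups(outcomes: List[Outcome]) -> List[str]:
    """Build one follow-up note per test that only passed on the second attempt."""
    return [
        f"Investigate why '{o.name}' only passed on the second attempt"
        for o in outcomes
        if classify(o) == "passed-on-retry"
    ]
```

How the notes become tasks (an issue tracker, a report attached to the CI run, and so on) is left open here. The point is that a retry which passes still produces something a person has to look at.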

This doesn't just apply to tests. 

It can also be applied to build logging verbosity. 

You only really want (or rather need) verbose log output when something goes wrong, such as a build failing.

