Once upon a time, I was working with a codebase that contained significant and detailed logging. It supported registering multiple loggers and recording varying levels of detail, depending on the required "LogLevel".
While looking at how this code is (and can be) tested, I made a discovery that I hope to bring to many other projects in the future.
Taking a step back, there are a few broad questions to think about in terms of testing and logging:
- Ignoring logging, does the code do everything it is supposed to?
- Are the correct things logged when they should be?
- Does the code do everything it should when there are no loggers registered?
- Does the code do everything it should when there are one or more loggers registered?
- Are we sure that exceptions in the code aren't simply logged and then ignored?
- How does logging code affect the code coverage of tests?
I only started thinking about this when I had to deal with a "weird" bug: functionality appeared to be working, but the lack of a registered logger meant a code path failed silently, as it was configured to quietly log the exception and then do nothing. :(
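Something like this sketch shows how that kind of silent failure can happen (the names are illustrative, not from the original codebase, and I'm using Python's standard logging module here rather than whatever that project used): the exception is routed to a logger, and if nothing is listening, it simply disappears.

```python
import logging

log = logging.getLogger("importer")  # illustrative logger name


def import_records(records, save):
    """Saves each record, "quietly logging" any failure and carrying on."""
    for record in records:
        try:
            save(record)
        except Exception:
            # With no logger/handler registered, this failure vanishes without
            # a trace and the caller believes every record was imported.
            log.debug("failed to import %r", record, exc_info=True)
```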
My solution: add a wrapper around all tests so that they're run multiple times, with different test-specific loggers attached (a sketch of one way to do this follows the list below).
- One test execution with no loggers attached. (To find code that was unexpectedly dependent upon always having at least one logger.)
- One test execution with a logger that listened at all levels but did nothing. (To verify that the presence of a logger had no impact on correct functionality and to ensure that all logging code paths were followed.)
- One test execution with a logger that would cause the test to fail if an error or exception was logged. (Because we shouldn't log errors as part of normal execution.)
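The post doesn't show the original implementation (and it wasn't necessarily Python), so here's a minimal sketch of the idea using Python's standard logging module and pytest; all names in it are illustrative. A parametrised fixture runs every test that uses it three times, once per logging scenario.

```python
import logging

import pytest


class SilentHandler(logging.Handler):
    """A logger that listens at every level but does nothing."""

    def emit(self, record):
        pass  # receive the record, deliberately discard it


class FailOnErrorHandler(logging.Handler):
    """Fails the current test if anything at ERROR level or above is logged."""

    def emit(self, record):
        if record.levelno >= logging.ERROR:
            pytest.fail(f"Unexpected error logged: {record.getMessage()}")


@pytest.fixture(params=["no_loggers", "silent_logger", "fail_on_error"])
def logging_scenario(request):
    """Run each test that uses this fixture once per logging scenario."""
    root = logging.getLogger()
    original_handlers = root.handlers[:]
    original_level = root.level
    root.handlers = []  # every scenario starts with nothing registered
    root.setLevel(logging.DEBUG)

    if request.param == "silent_logger":
        root.addHandler(SilentHandler(level=logging.DEBUG))
    elif request.param == "fail_on_error":
        root.addHandler(FailOnErrorHandler(level=logging.DEBUG))

    yield request.param

    root.handlers = original_handlers  # put things back for other tests
    root.setLevel(original_level)


# The code under test and its assertions stay the same; only the logging
# environment changes between the three runs.
def add_and_log(a, b):
    logging.getLogger(__name__).debug("adding %s and %s", a, b)
    return a + b


def test_add_and_log(logging_scenario):
    assert add_and_log(2, 3) == 5
```

In this sketch the scenarios are opt-in (a test has to request the `logging_scenario` fixture), which also makes the exclusion mentioned below trivial: tests about the loggers themselves simply don't use the fixture.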
This was all on a codebase whose hundreds of tests ran in a couple of seconds, so the impact of running them multiple times was negligible.
The fact that the total number of tests went up considerably was nice. (It also helped people who were reluctant to write tests feel good, as every test they wrote now counted as three, so they felt more productive.)
This gave us more realistic code coverage reports, as the logging paths were now included in the coverage. (Previously, most of the logging code wasn't covered, which made it harder to identify the areas with particularly low coverage.)
Tests for logic that was specifically about the loggers, or that relied on specific logging behaviour, went into classes that were excluded from the above.
Adding those different test scenarios uncovered more issues than just the one that prompted the original investigation.
I'll definitely do this again.
It's also good to have all these different test and logging combinations configured early in development. Hopefully, by the time you think development is finished, no exceptions should be thrown at all, and it's hard to verify that nothing bad happens when an exception is thrown (and logged). In particular, you don't want the handling (logging) of an exception to cause another exception and end up in an infinite loop. These things are hard to check for, but having all the different logging combinations available (without me really having to think about them) made it possible to catch some logging issues I might otherwise have missed.
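As an illustration of that last point (again a sketch with hypothetical names, not code from the project): a handler that reports its own failures through the same logging pipeline will re-enter itself as soon as its sink breaks, assuming it's registered on the same logger hierarchy it logs to.

```python
import logging


def write_to_db(message):
    # Stand-in for a sink that has started failing (disk full, DB down, ...).
    raise IOError("database unavailable")


class DatabaseLogHandler(logging.Handler):
    def emit(self, record):
        try:
            write_to_db(self.format(record))
        except Exception:
            # BAD: if this handler is attached to the root logger, this call
            # comes straight back into emit(), which fails again, and so on
            # until something gives out.
            logging.getLogger(__name__).error("could not write log record")
```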