Oddly enough, I've spent some time working on "white-box" tests that check the log. It takes taste to do well: you have to be careful to log what your program figures out about the domain, rather than incidental implementation details. Done right, I find it actually helps refactoring. My unit tests now usually call some top-level function and check the log for what the sub-component under test figured out, rather than calling that sub-component directly. As a result I can make radical changes, like moving to client-server or from sync to async, without having to change any tests.
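A rough sketch of the idea in Python, to make it concrete. This is not the code from the linked post; every name here (trace, trace_contains, parse_order, total_price, checkout) is made up for illustration, and the log is assumed to be a plain in-memory list.

```python
# Hypothetical example: all names below are invented for illustration.
trace_log = []  # in-memory log of domain-level facts

PRICES = {"apple": 3, "pear": 2}


def trace(label, message):
    """Record a fact the program figured out, tagged by the layer that found it."""
    trace_log.append(f"{label}: {message}")


def trace_contains(label, message):
    """Check whether a given fact was logged during the run."""
    return f"{label}: {message}" in trace_log


def parse_order(text):
    """Sub-component: turn '2 apple, 3 pear' into [('apple', 2), ('pear', 3)]."""
    order = []
    for part in text.split(","):
        qty, name = part.split()
        order.append((name, int(qty)))
        trace("parse", f"{qty} {name}")  # a domain fact, not an internal detail
    return order


def total_price(order):
    """Sub-component: price a parsed order."""
    total = sum(PRICES[name] * qty for name, qty in order)
    trace("price", f"total {total}")
    return total


def checkout(text):
    """Top-level entry point; the tests call this and nothing else."""
    return total_price(parse_order(text))


def test_parser_handles_multiple_items():
    # Exercises the parser, but only through the top-level function.
    trace_log.clear()
    checkout("2 apple, 3 pear")
    assert trace_contains("parse", "2 apple")
    assert trace_contains("parse", "3 pear")
```

The test never mentions parse_order, so in this sketch the parser could be renamed, merged into checkout, or moved behind some other boundary, and the test would stay as-is so long as the same facts end up in the log.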
More details: http://akkartik.name/post/tracing-tests