How to unit test code that depends on time (playfulprogramming.blogspot.com)
145 points by ingve on Dec 20, 2023 | 139 comments


Time is a dependency like any other, so use dependency inversion. (Which I think is advocated by TFA’s approaches 4 and 5.)

Define an interface for timers, which is injected into the object that depends on time. Use a wrapper implementation over the usual timer for the real code; and a simple implementation, which can be controlled parametrically, in the tests.
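A minimal sketch of that in Python (all names here are illustrative, not from TFA):

    import time

    class Clock:
        def now(self) -> float:
            raise NotImplementedError

    class SystemClock(Clock):
        # Thin wrapper over the real timer, used in production.
        def now(self) -> float:
            return time.time()

    class FakeClock(Clock):
        # Parametrically controlled implementation for tests.
        def __init__(self, start: float = 0.0):
            self._now = start

        def now(self) -> float:
            return self._now

        def advance(self, seconds: float) -> None:
            self._now += seconds

    class SessionTracker:
        # The object that depends on time; the clock is injected.
        def __init__(self, clock: Clock, timeout_s: float = 30.0):
            self._clock = clock
            self._timeout_s = timeout_s
            self._last_seen = clock.now()

        def is_expired(self) -> bool:
            return self._clock.now() - self._last_seen > self._timeout_s

A test then just does clock = FakeClock(); tracker = SessionTracker(clock); clock.advance(31); assert tracker.is_expired(). No sleeping, no real clock.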


Dependency injection of just the function you need is good, but is still the second best approach.

The best approach is separating the call to the external dependency (getting the time) into its own function, and passing the result of that call to your main function. Then test your main function, i.e. approach 5. This eliminates the need for mocks and creates better-factored code.
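For example (a sketch of that idea; the names are hypothetical):

    from datetime import datetime, timedelta

    def is_overdue(due: datetime, now: datetime) -> bool:
        # Pure function: every time value is a parameter.
        return now > due + timedelta(days=30)

    # The thin, impure shell in production code:
    #   is_overdue(invoice_due, datetime.utcnow())
    # Tests just pass fixed datetimes; no mocks needed.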


That works until you have a function that implements timeout and so needs to query time multiple times per invocation.


Agreed, so the preference order is:

1. Pass in the data you need.

2. Pass in a function that gives you the data you need.

3a./3b. Pass in an object (or a function like a factory) that gives you the function that gets you the data you need.


> Pass in a function that gives you the data you need

That's basically dependency injection, but on a per-method basis. I'd see this as an anti-pattern because it makes function calling _way_ more complex and error-prone.


> it makes function calling

First: no, not necessarily. Second: when you limit yourself to function calling, then yes, the problem is more pronounced (but not always there, see 1)

In OOP, you can have factories, or constructors, or even entire design patterns to help solve this. In FP, there are closures that can solve this exact problem for you: the dependency (e.g. a timekeeper) is captured in the closure.
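For illustration, a sketch of that closure idea in Python (the rate limiter is a made-up example):

    import time

    def make_rate_limiter(interval: float, clock=time.monotonic):
        # The timekeeper is captured in the closure; tests pass a
        # controllable callable instead of time.monotonic.
        last = clock() - interval

        def allowed() -> bool:
            nonlocal last
            now = clock()
            if now - last >= interval:
                last = now
                return True
            return False

        return allowed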

Dependency injection is, by no means, limited to `do_the_thing(variable, dependency1, dependency2, dependency3)`.

I see this argument used too often to counter the idea of DI, and it is silly: it shows above all that the person countering the idea has little experience with the surrounding concepts that support DI.

And that brings me back to 1: there's so much more that can be DI'd: objects can be instantiated with dependencies passed in; factories can do this. There are Actors, Workers, Decorators, Factories. Hell, even a superglobal `config.get_timekeeper()` might work in some situations.


The function you pass in can keep track of how many times it’s been called, and provide different values on each call.
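For example (a hypothetical sketch):

    def fake_clock(*readings: float):
        # A zero-argument "clock" that returns each canned reading
        # in turn, so code that reads the time several times per
        # invocation (e.g. a timeout loop) sees exactly the values
        # you chose.
        it = iter(readings)
        return lambda: next(it)

    # clock = fake_clock(0.0, 4.9, 5.1)  # third read crosses a 5s timeout
    # wait_for(condition, timeout=5.0, clock=clock)  # hypothetical caller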


This is way way worse than passing in a timer and destroys your ability to write multi threaded code (among other defects).


This is for unit testing. Even if your unit tests run in parallel, each test case will have its own mocked clock.

This is not even an experimental approach; the mocking of timers is literally one of the canonical examples of the advantages of dependency injection. I have written this exact code multiple times, on my own, or refactoring other people's untestable code to this pattern to be able to test it. A couple of times the reason for the refactoring was specifically to demonstrate the existence of race conditions: to have a test case that makes the race condition deterministic, so we could fix it and be sure the fix worked.


You can extract the timeout functionality into its own structure then and mock it out during testing.

There's usually a solution of some sort, it might just require a bit of rearranging and extraction of concerns.


That’s basically dependency injection


Yes, that's my point - they're giving an example for why "you can't do it", and I'm saying even for the example they just gave you can.

Create a Timeout class and subscribe to it for a given interval with a callback to what needs to be run. Pass it in as a dependency which can be mocked during testing.
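Something along these lines (a sketch; the names are made up):

    class Timeout:
        # The production implementation would schedule `callback`
        # on a real timer after `interval` seconds.
        def subscribe(self, interval: float, callback) -> None:
            raise NotImplementedError

    class FakeTimeout(Timeout):
        def __init__(self):
            self._subs = []

        def subscribe(self, interval: float, callback) -> None:
            self._subs.append((interval, callback))

        def fire(self, elapsed: float) -> None:
            # Test helper: pretend `elapsed` seconds have passed.
            for interval, callback in self._subs:
                if elapsed >= interval:
                    callback()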

Not sure why the downvotes...


>The best approach is separating the call to the external dependency (getting the time) into its own function

That is the FP maximalist approach and many of us reject it.

So no, it is not the best approach.

We use a time provider interface and mock it, and have zero problems with this approach.


The FP maximalist is writing your code in a monad that implements a timer. It looks quite similar to the OOP maximalist people are pushing here.


So, "yes, we're writing the same thing, but because monads I'm still pure and you're not"?


Kinda yes. And technically, it's still pure.

It does fit a small set of problems well and breaks in surreal, nightmarish ways for everything else; just like the OOP maximalist's DI.

It is worse in that you are injecting an entire interpreter instead of just a few functions, but this doesn't create as many problems in practice as it looks like it should. And it's better because idiomatically the injection tends to be explicit and well defined; if your DI doesn't have at least one of those properties, it will again break in surreal, nightmarish ways.

(And here I have to point out that in OOP-land, web frameworks are allergic to well-defined interfaces, which means that you can create one, but if you insist, the framework tends to choke and die. So if you are writing for the web in OOP, there's actually only one option.)

But well, it's not surprising that the maximalist options all break in similar ways.


I used to think this way, but now I believe that any method requiring code to be written differently from the most natural way, just for the sake of tests, is suboptimal.

People generally call time functions, and a good testing setup should keep your code the same (or mostly the same).

Anything else makes testing less approachable, and requires extra time from developers.

The provider-and-mocking (or a fake, if you have one) approach is probably the best you can get in most programming languages.


The unfortunate aspect is that this only works for code you control.

If you have code you need to run but have no control over (such as a database library), then you will have issues.

For example, you want to test that an entry in a caching library is indeed expiring entries like you wanted. How do you change the library to consume a time interface, if that library doesn't expose it to you?


You don't. Testing third-party code is beyond the purview of first-party tests.

If you have first-party code that is adjacent to a third-party library in a way that the library hinders testing of that first-party code, you would substitute a (minimal) implementation that you control in for the third-party library while under test.

If the library is open source or similarly code-accessible, you would fork it or coordinate control with its stakeholders to see it tested in that codebase (and refactor it as necessary to support that). This is where the library needs to ensure it does what the documentation claims. It is not the concern of your application.

Otherwise, if the code is truly beyond your reach, you just have to trust that the vendor's claims are true. If you cannot establish that trust, use a different library that you can trust. Testing concerns are the least of your problems when you have no trust for the library you are using.


> You don't. Testing third-party code is beyond the purview of first-party tests.

Testing third-party code is often well within the purview of first-party integration testing. Aside from validating your code, and your understanding of the documentation, these sometimes incidentally catch third-party bugs. Working around bugs until upstream fixes their code may also be within the purview of first-party code... and perhaps catching regressions even if they've supposedly been fixed.

> Otherwise, if the code is truly beyond your reach, you just have to trust that the vendor's claims are true. If you cannot establish that trust, use a different library that you can trust. Testing concerns are the least of your problems when you have no trust for the library you are using.

We unit test our code because we can't even trust ourselves - nevermind our coworkers, nevermind third party vendors. Granted, unit tests for third party libraries are often worth upstreaming, and you may trust upstream enough to delete your downstream copies of those unit tests once you do...


> Testing third-party code is often well within the purview of first-party integration testing.

Emphasis mine. Exactly: not unit testing, which is what the OP is about and which GP claimed (rightly, imo) doesn't cover this.

Unless we mean offline integration testing as a mid-tier before system testing, in which case I'd personally mock third-party services (and others of my own not under test).


Assuming 'integration test' is being used under a common definition, and not one made up on the spot, the integration points don't expose third-party code in a way that you could even begin to explicitly test it. It is fundamentally impossible. If it were possible, you would have no reason to be writing software; the third party would already have done the work for you.


I'm not making up anything on the spot; it's massively overloaded and used differently by different people.

At its 'lowest' level, it's the integration of multiple units: basically anything that covers multiple functions.

The line is also drawn (with the above still being a 'unit test') at the API: if you exercise your handler code like `views.Widget.post(...)`, it's a unit test; if you do `requests.post("/api/widget")`, it's an integration test.

And at its 'highest' level, it's used to mean system or end-to-end testing (depending on which of those you use to mean automated testing of a deployed system).


> At its 'lowest' level, it's the integration of multiple units: basically anything that covers multiple functions.

So, testing...?

   test_unit_1 {
      assert(foo())
   }

   test_unit_2 {
      assert(bar())
   }
That's not a use of 'integration' that anyone would ever find useful. If they do, they must not be software developers. 'Integration' adds nothing. But if this is the definition you have made up on the spot, then yes, you could use it to explicitly test a third-party library. It is no surprise that you can explicitly test a third-party library using testing.

   test_unit_1 {
      assert(third_party_library.foo())
   }
But even then, it is not clear why you would meld that into your application's project and not its own project. It has absolutely nothing to do with your first-party application. Logically, that kind of test is best kept in the third-party library's own project.

--

Perhaps you misspoke and meant a single unit that covers multiple functions?

   test_unit_1 {
      write()     
      assert(read())
   }
But that is also just testing... As soon as you have internal mutable state (every program that does something; not even the FP diehards are able to completely avoid internal mutable state), you are always going to have to call at least two functions – one to mutate the state, another to observe the state afterwards. Here too, 'integration' adds nothing. There is no practical situation where you would ever communicate this as being distinct from the above. If this is it, it is also clearly made up on the spot.

--

The remaining two I guess say something, albeit flimsily, but obviously do not allow explicit testing of third-party code. Take the second case, since it provides some concrete code. Show us how you would update that code to explicitly test some third-party library. Remember that you said it calls "your handler code". I will wait.


Regarding the first: yes, I thought it obvious that I meant per single unit test resp. integration test. And 'covers' means what is actually under test.

So:

    test_1 {
        result = waz()
        assert result == 42
    }
is an 'integration test' vs. a 'unit test', according to people using the first definition, exactly when `waz` calls other functions vs. is somewhat pure.

I do not consider that I 'misspoke', but I hope that clarifies it for you.

> The remaining two I guess say something, albeit flimsily, but obviously do not allow explicit testing of third-party code. Take the second case, since it provides some concrete code. Show us how you would update that code to explicitly test some third-party library. Remember that you said it calls "your handler code". I will wait.

Don't wait, because I've no interest in continuing this.


> is an 'integration test' vs. a 'unit test'

I don't get it. There isn't a contention between the two. The definition of unit test is a test that is independent of other tests. That would include any integration test that exists independent of other tests.

Frankly, as far as I can see, all tests written in any modern codebase are unit tests. It is not the 1970s anymore. We learned our lessons. Realistically, while the 'unit test' branding may have helped us learn those lessons, it doesn't add anything today. In the modern lexicon, 'test' already implies 'unit test'. They are one and the same.

> I hope that clarifies it for you.

Nope. I honestly have absolutely no idea what you are trying to say there. Sorry.

> Don't wait, because I've no interest in continuing this.

As expected. I wouldn't have interest in trying to show something that is impossible either.


> Testing third-party code is often well within the purview of first-party integration testing.

Not really. While you may end up testing third-party code incidentally in the process of you testing your first-party code, explicit tests directed at third-party code necessarily requires testing implementation details, and testing implementation details is how you get constantly breaking tests every time you try to change something and developers giving up on testing in frustration.

The value proposition of testing is that it documents the user interface in a way that is independent of implementation, allowing you to continue to iterate on the implementation while having assurances that, no matter what you do under the hood, the user experience does not deviate from what is documented.


> The value proposition of testing is that it documents [...]

No, the value proposition of testing is that it identifies incorrect system behavior before it manifests in live use. Documentation of anything is not the central value proposition of testing.


> the value proposition of testing is that it identifies incorrect system behavior before it manifests in live use.

Incorrect with respect to the documentation, yes. That's what I just said.

Tests are not intended to find undocumented incorrect behaviour. This is provable by taking it to the logical extreme of a codebase that is completely undocumented (no tests). When no tests run, no incorrect behaviour will be uncovered by them.

Only documented cases can reveal instances where the code does not conform to what is documented.


My experience has been that the documentation value of a good test suite is its primary value, by a mile.

It’s documentation a machine can validate, which is the best kind of documentation.

(Static types are documentation a machine can validate, that also happen to be very good communication tools)


I would urge anyone to read this book on unit testing, which is reviewed in this Gist.

https://gist.github.com/gniemann/adaf12895c22eb5c11c0591f8cb...


Why can't you test third-party code the same way you ideally test first-party code: by the interface and spec, as a black box? I've done that, and most often it then just happens implicitly anyway in integration testing, because your code uses the third party as specced. Confused.


Third-party code hidden in a black box cannot be within the purview of your first-party tests. There is no way for those tests to know that the third-party code exists. It's hidden away in a black box.

If a third-party library within the black box is executed during the execution of a test, you may incidentally discover cases where the library produces something not conformant with your documentation, but that is in no way explicit.


I have the feeling that this is a very theoretical, ivory-tower view.

E.g., I have a SWC that stores and reads something, and that similarly uses another SWC to do that.

If I do an integration test that writes and then reads something, I very explicitly test at the same time that my SWC does what it specs and what is expected, as well as the lower-layer SWC. (And at the same time this is also a good regression finder for when you replace the black box you maybe shouldn't know about.)

However, I can similarly write more explicit tests that exercise the lower SWC in isolation and cover even more cases than I can cover by just integration testing.

Arguing about purview in testing, and what should or shouldn't be done, has never helped me in my career to end up with tests that find real problems and regressions, or to deliver well-working, robust systems. On the contrary, discussions and nit-picking at that level are usually a huge red flag. Nice in theory, failed in practice.


I don't see this as theoretical at all; it's a pragmatic requirement not to explicitly test the third-party tools you depend on. I'd never personally write a test which, in its entirety, calls a function owned by a third party and checks its return value, and include that in my CI process.

What would I do?

- Write a unit test for the code that depends on this function, including the function call. I don't need to test the third-party behaviour specifically because I'm really interested in the behaviour of the entire code under test. If the third-party code is broken, my own code shouldn't work either.

- Wrap and inject fakes for the contracted behaviour. This is usually for things where the behaviour is non-deterministic, say because of a network call. I don’t want CI failing because a third party temporarily fails. I’ve worked under these conditions before and it's really unproductive. Yes, I write tests for failure cases (network interruptions, for example).

- I may encapsulate the database in my “unit,” depending on the application. The frameworks I like to use tend to integrate error handling for the DB, so I’m relaxed about network errors here.


You are then testing other people's software, which is probably not what you want.

In integration testing I see no issue in using a real instance of whatever dependency but I would never assert on the inner workings of that dependency - that is within the testing scope for the people that did that dependency, not you.


> You are then testing other peoples software, which is probably not what you want.

And if it is what you want, why would you clutter an unrelated application's codebase with those tests? They are tests that apply to many projects. Ideally they would go alongside the work of the third party, but failing that, surely they are better placed in a new project focused on those tests?


If you find yourself explicitly testing a third-party library, you are most definitely not writing an integration test by any common definition (maybe you randomly created your own?). Third-party libraries are not exposed at the integration points.

If they were, what are you writing software for? The work would already be done by that third party, leaving your efforts entirely pointless!

You may incidentally test a third-party library when you test your integrations, but that's something else entirely, as has already been discussed.


I'm confused by this line of reasoning.

The point of the black box is that you have clearly defined the expected input and output and don't care about the implementation details of the thing sitting inside the box.

You can test both what you pass in to the black box and what you get out of it. And it's incredibly useful and important to do so. If you aren't testing third party libraries (or other dependencies) in this way, you're opening yourself up to a lot of unnecessary risk.

Those assumptions about your expected output are always there, whether you define and test them or not. So if for some reason the library changes its output unexpectedly and you aren't explicitly testing your assumptions, it can easily happen that you won't find out until it manifests in a production bug.


> I'm confused by this line of reasoning.

That's because you haven't read the discussion. Why would you expect anything else? Your comment straight up reiterates what was already said. For example,

"The value proposition of testing is that it documents the user interface in a way that is independent of implementation, allowing you to continue to iterate on the implementation while having assurances that, no matter what you do under the hood, the user experience does not deviate from what is documented."

Explicitly testing third-party code introduces testing of implementation details. It is impossible to explicitly test another codebase without knowing of its existence. But it is not an area of concern for a first-party application to worry about. The third-party should already have those tests in the third-party codebase.


If a library changes its output unexpectedly, then either:

- It’s a local library so the tests covering your code should fail

- It’s changed in a way you didn’t anticipate so why would testing it directly fail when your prior test didn’t?

- Your end user monitoring should light up like a Christmas tree


What if we switch to the purview of first-party tests of third-party code? Feels like you’re pushing on definitions here instead of matters.

Obviously, you can black box test third party code. It’s absurd to say otherwise.


Sure you can. But why would you test software you didn't write or own?


> Obviously, you can black box test third party code. It’s absurd to say otherwise.

No. You cannot explicitly test code within a black box. How could you? You wouldn't even know it is there. Is it that you don't understand what 'black box' means?

If you remove the explicit qualifier, then okay, but if you do that you would stupidly be having a completely different conversation to the one that is taking place here. The explicit qualifier was explicitly qualified.


> Not really.

Yes really.

> While you may end up testing third-party code incidentally in the process of you testing your first-party code, explicit tests directed at third-party code necessarily requires testing implementation details, and testing implementation details is how you get constantly breaking tests every time you try to change something and developers giving up on testing in frustration.

Depends. I'm certainly not advocating testing implementation details - 3rd party or 1st party - that your codebase doesn't rely upon. That is brittle as you say. But I do advocate for being willing to test anything you rely on - be that documented, hinted at, undocumented, or even explicitly warned against assuming if for some horrible reason you have a terrible need to make such assumptions anyways.

1st party or 3rd party.

When so written, if the tests are brittle, the codebase is brittle, and the tests correctly identify that it needs to be fixed. Deleting or not writing the tests won't fix the problem; it'll merely shove the problem into production.

Amelioration might include carefully controlling and vetting updates, switching libraries, writing your own, rewriting your code to make fewer assumptions... or perhaps just yeeting the entire feature relying upon it out of your codebase, if you're desperate enough.

> The value proposition of testing is that it documents the user interface in a way that is independent of implementation

That is a value proposition, but not the only one. Others include simplifying debugging when you break things or when other people break things, and catching bugs before they go live instead of after (even if they'd be trivial to fix when someone belatedly notices). Fuzzing-generated regression tests are often unreadable garbage for the purposes of "documentation": a cargo cult of redundancies and red herrings and voodoo that, once, caused a crash.

Another value proposition - and I do find value in this in bugs caught in my code, their code, and their documentation - is "documenting" my understanding of upstream documentation, for which upstream tests alone will be useless. After all, that verified someone else's understanding, not mine. And it turns out this is important, because the documentation is outdated, the documentation lies, the documentation is insufficient, the documentation didn't consider that edge case, and the documentation foolishly presupposes common sense. Anyone telling you otherwise has a bridge to sell.

Even worse: the documentation may be technically correct... but misleading. Can't even blame the author - it made sense to them!

> allowing you to continue to iterate on the implementation while having assurances that, no matter what you do under the hood, the user experience does not deviate from what is documented.

Such iteration might include updating third party dependencies. I do this frequently. Poor test coverage means heisenbugs and fear around updates. Good test coverage might explicitly compare two different backends, or different versions of the same backend - 100% implementation details - and ensure the publicly visible behavior of my own APIs remains identical when switching between them. This means knowing which versions of which third party libraries to forbid or write workarounds for. Such tests should make no stupid assumptions, but should absolutely test for sane assumptions.

Such iteration might include upgrading compilers. I do this frequently. We've had unit tests catch codegen bugs. This is good.


> Depends. I'm certainly not advocating testing implementation details

Then it is otherwise impossible to explicitly test a third-party dependency. As soon as you state an explicit relationship, then the tests become tightly coupled to that dependency. If down the road you switch out that library for a different one, your tests will break as they explicitly reference that library. This is not where you want to be.

Well written tests will have no concern for what libraries you use. As before, they will only test the outside user interface. How that interface functions under the hood does not matter. As long as the user interface conforms to what you have documented, who cares what libraries you have used? The fact of the matter is that nobody cares about how it is implemented, they only care about the result.

> be that documented, hinted at, undocumented

Only the documented. Anything undocumented will never be considered. As stated in another comment, this is provable by taking it to the logical extreme: assume a program is completely undocumented (no tests). When no tests run, nothing will be tested.

Only the interfaces which you have documented will be validated when you run your test suite, and they will only be validated against what you have documented as being true.


> Testing third-party code is often well within the purview of first-party integration testing.

Yes, integration tests do tend to cover integration with third party libraries and even entire products (such as databases). But even in integration tests, the third party code is incidental.

For instance, when creating integration tests for databases, it's very common to use an embedded or in-memory DB such as SQLite or H2. You want to test integration with third-party modules when it is possible and "cheap", but the highest priorities are testing first-party module integration and having tests that can run fast without requiring a full-fledged replica of your production environment.

If I come back to the GP statement "this only works for code you control", they either meant that you can't use this technique in unit tests of third-party code, or that you can't use it in integration tests of third-party code. Either way, it doesn't make sense. You cannot and should not unit test code you don't own, and you're not supposed to use mocks (like a mocked clock) in integration tests.


>Yes, integration tests do tend to cover integration with third party libraries and even entire products (such as databases). But even in integration tests, the third party code is incidental.

It definitely shouldn't be. At least, not if you want your tests to tell you when upgrading a 3rd-party dependency will break something.

>For instance, when creating integration tests for databases, it's very common to use an embedded or in-memory DB

I worked on a project that did this once, and they very quickly got blocked writing a test because the in-memory database didn't support a feature Postgres had.

Result? "Oh well, I guess we dont write an automated test for that." Manual QA's problem now. That's slow.

Realism matters. Sacrificing realism for speed often means you will get neither.

>you're not supposed to use mocks (like a mocked clock)

I do do this and it usually works well. Why am I wrong?


>>I worked on a project that did this once, and they very quickly got blocked writing a test because the in-memory database didn't support a feature Postgres had.

>Realism matters

It is often just a small subset of tests that has to be very expensive to build, maintain, and run because of an external dependency that cannot be mocked. Mocks work for 95% of use cases and should be used even if that is not 100%.


I never found that to be true, and moreover, these scenarios with nonstandard features tended to be the scenarios I was most worried about breaking.

There is also the problem of the in memory database behaving differently when given the same SQL so your test might make it look like everything works while a bug crops up in production.

Realism matters.


> There is also the problem of the in memory database behaving differently when given the same SQL so your test might make it look like everything works while a bug crops up in production.

You've horribly messed up the architecture of your application if that is a problem. (Or you've misunderstood previous comments)


Postgres and SQLite/in-memory DBs just behave differently from each other sometimes. Knowing this fact doesn't mean you've messed up your architecture; it means that you have some understanding of how these databases work.

Realism matters.


I'd say that in-memory/not-in-memory isn't the big difference; it's whether your database is in-process or not. Even with just a database running on the same node, but in a different process, connected to via Unix socket, the context switches alone lead to very different performance characteristics. Actually going over the network obviously changes more. It's very easy to miss antipatterns like N+1 queries when you test on SQLite but run on a shared database in prod.


This is irrelevant for unit tests. Performance testing does not make any sense in a build environment; you need to do it in an environment close to production, and that's a completely different test automation scope.


You can see this stuff often even in test workloads. But even if you disregard that kind of issue, you still have things like needing to integrate networked database connections into e.g. event loops, which you don't really need to do for something like SQLite.


Unless you are working directly with something like reading the signals off the clock on the motherboard, you are bound to deal with third-party code when dealing with clocks.

What's the point of your home-made timer if it doesn't faithfully emulate the actual clock that's going to be used in your actual program? How will you have confidence that your tests are even doing anything useful?

There are plenty of things that might be out of your control if you try to emulate time. For example, when using a realistic timer, the system may decide to do some context switches, seeing how your code is blocking waiting on some wakeup event, while if you test against a timer that tries to speed things up, those context switches won't happen. And there could be plenty of similar examples.

If not, then how are you going to implement your home-made timer? Will it be an actual timer (i.e. it will have to read the time, thus relying on third-party code to do it), or will it pretend that time doesn't exist (and thus test something different from what your code is actually supposed to do)?

Your ideas about "substituting" the system clock with hand-written code are out of reach for most developers out there. Even those who could theoretically embark on such a task probably won't, because that's just too much effort.

What I'm getting at: the whole idea of mocking time for testing is a fool's errand. You'll be testing things that you don't actually need to, and won't be testing the things that actually need testing.


Huh? I take it that you pressed the wrong reply button. Perhaps you meant to reply to this [https://news.ycombinator.com/item?id=38704899] comment?


I recently came across libfaketime. It intercepts time-related system calls, so you can set a time of your choosing, starting from program startup.

    $ faketime 'last friday 5 pm' sh -c "/bin/date; sleep 5; /bin/date"
    Fri 15 Dec 2023 17:00:00 GMT
    Fri 15 Dec 2023 17:00:05 GMT


> For example, you want to test that an entry in a caching library is indeed expiring entries like you wanted. How do you change the library to consume a time interface, if that library doesn't expose it to you?

Intercepting/hooking system APIs might be an option in some cases. You might even be able to use something someone else wrote. Some prior art for date/time stuff specifically (which I haven't used and thus can't vouch for:)

https://manpages.ubuntu.com/manpages/trusty/man1/datefudge.1...

https://www.nirsoft.net/utils/run_as_date.html

More dynamic / finer grained hooking (e.g. perhaps targeting a single thread, within a function's scope, because you're not writing a command line tool that quickly exits) is likely to require a more manual touch.


In C++, you normally don't: there are low-level function-hooking calls or the ptrace API, but those are too complex to be used in unit tests.

The approaches I have seen used in practical projects are:

- Ignore unit testing of the time-related parts; rely on integration tests.

- Have unit tests be super slow so that the actual system clock can be used. Get occasional false-positive failures when your build system schedules unit tests in parallel with particularly complex compilation tasks.

- Wrap everything in your own wrapper, providing "production" and "testing" versions of the class, with "production" calling into the 3rd-party library while "testing" calls into your implementation that uses a debug time. This is non-trivial, as the "testing" implementation might be quite complex, but it may be worth it for some common functionality like timers.


I agree. I first started doing this (instead of the often recommended global monkey patch) when I started programming in Go.

In our main application we pass in a function that returns `time.Now()`, and in tests we pass in a function that returns whatever time object we want.


Then you get too many dependencies, and the code becomes hard to work with because you in turn have to add so many dependencies. This is a hard problem in large programs.


> Time is a dependency like any other

Oh no... not at all. Here are some examples of when your approach turns out to be worthless:

* Performance testing, eg. looking for memory leaks. It doesn't matter how your timer ticks, what matters is that over time the system performance degrades.

* Any kind of hardware testing that relies on sensory input that is time-gated.

* Even "simple" stuff like testing software that does eg. desktop video recording. Still, you cannot substitute the wall clock with whatever timer you will come up with. I mean, in all those cases you "sort of" can, but that will render your tests worthless.

* Of course your home-made timers will prove useless if the tested software has third-party components that don't care about your timers.

Of course some cases can be handled by what you described, but, in my experience, when code works with time, the tests need to concentrate on special cases, and those are usually hard to impossible to emulate through home-made timers. Or they would benefit tremendously from time being real rather than a "surrogate".


None of the scenarios you listed above are unit tests, which I believe is what the OP and the post were referring to.


Sometimes the simplest things are so hard for people to come up with. Unfortunately, people try to make their lives more difficult than they need to be; I think developers are especially prone to trying to figure out their own solution rather than spending a moment searching for an existing one.

All contact with outside world should happen through interfaces with replaceable implementation.

If you care about time related stuff, this should also include time, timers, schedulers, etc.

Make sure you don't circumvent the interfaces by accident. There are stupid ways to do so, like getting the current time in SQL or embedding the current time in new objects by default.


While I don't think tests are bad in general, this is a wonderful example of the fundamental problem with software development: a tool that was supposed to make development easier (tests) eventually changes the code to become more complicated so that it is easier to use the tool with, which makes development harder.


Injecting a Clock or Timer is great not just for automated tests but also for debugging weird behavior. There's really hardly any downside.


I'm not sure how that would work. Can you give an example?


Have a look at this method on Java's Instant class for an example.

https://docs.oracle.com/javase/8/docs/api/java/time/Instant....

Code under test creates an Instant; as long as it uses the Clock you've passed in, you control how it determines when now() is.

You've got several built-in testing clocks like this one[1], but can also implement your own.

[1]: https://docs.oracle.com/javase/8/docs/api/java/time/Clock.ht...


Just a class (or struct or module or whatever your language has for that) that has a “now()” method which returns the current time.

And then replace all your calls that do Date.now() with clock.now(), and you can make "now" be whatever you want.


> If you can live with the loss of precision from Approach 5, pass time stamps,

I was confused why a passed-in timestamp would have less granularity than the return value of gettime() (or whatever) directly. The issue isn’t a loss of /precision/, but a loss of /accuracy/ due to added latency between observing the time outside the function and making the decision within the function.


Interesting observation: my first language is Polish, and we'd use the same word for accuracy and precision ("dokładność").


Really? Precision translates to (unsurprisingly) "precyzja", and in the specific context of calculations and measurements, the distinction is well established, see eg.:

* https://centrumdruku3d.pl/jak-zrozumiec-roznice-pomiedzy-dok...

* https://automatykab2b.pl/technika/54301-dokladnosc-i-precyzj...

* https://learn.microsoft.com/pl-pl/office/troubleshoot/access...

* https://pl.wikipedia.org/wiki/Dok%C5%82adno%C5%9B%C4%87_i_pr...

etc.


That's very interesting! In English, we differentiate them because they are actually two different things, at least in our mind. I wonder if there's another Polish word that might differentiate?

Here's an image that explains the difference pretty succinctly: https://www.antarcticglaciers.org/wp-content/uploads/2013/11...


But we absolutely do differentiate them in Polish. Accuracy means dokładność whereas precision translates to precyzja.


same with efficiency and effectiveness; a lot of people just don't know the difference


plus efficacy :-)


TIL (I'm not native EN)

> efficacy /ĕf′ĭ-kə-sē/

> Power or capacity to produce a desired effect; effectiveness.


As a Ruby/Rails developer, I find time mocking awesome. It's built right into Rails, or you can use Timecop, and it's as simple as adding:

    around { |example| Timecop.freeze("2023-12-20T12:00:00Z".to_datetime) { example.run } }
or

    around { |example| travel_to("2023-12-20T12:00:00Z".to_datetime) { example.run } }
as setup hooks to your tests and it stubs all of the relevant datetime helpers for you for each test case and properly restores everything between tests.


I recently wrote some logic that's supposed to run every minute and look at the data from the last minute. Normally I would've ended up doing queries a la event_time BETWEEN now() - interval '1 minute' AND now().

However, one constraint was that I needed to also run this on existing data, which led me to instead pass the time in. A small thing, but suddenly this logic became so incredibly testable. I can call it 5 times with 5 different times and see that it behaves as expected, or simulate it over a whole day of test data. I can even tweak the code a bit, delete the data it generates in prod, and re-run it with the new ruleset. So flexible.

Such a small thing, but without the additional constraint I probably wouldn't have arrived at this design.


In modern Java it's pretty common to inject an application-level clock object into all components that depend on time. Writing a unit test becomes easy, because you have full control over what time is returned on each call.


The book "Test Driven: Practical TDD and Acceptance TDD for Java Developers" (Lasse Koskela, ed. Manning) includes a chapter titled "Test-driving the unpredictable", with several techniques for testing time-based functionality. Highly recommended.


I'm a bit surprised that this (very C++-biased) post gets such traction on Hacker News.

Isn't this a solved problem, viz. moving side effects like IO (getting the time) out of the logic parts and just unit/property testing those? Am I missing something here?

EDIT: added last sentence


> Isn't this a solved problem viz moving side effects like IO (getting the time) out of the logic parts and just unit/property testing those?

Yep, and it delivers even more than what was asked, because now you can test "code that depends on X", not just time.

Why hasn't it gained much traction? There is a camp of coders who will "want to make it simpler" by removing such indirection and making the code ask for the time directly.


* in C++.

In Python, just use freezegun to inject controllable timestamps in response to calls to time methods.

https://github.com/spulec/freezegun
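Usage looks roughly like this (a sketch; the dates are arbitrary):

    from datetime import datetime, timedelta
    from freezegun import freeze_time

    with freeze_time("2023-12-20 12:00:00") as frozen:
        assert datetime.now() == datetime(2023, 12, 20, 12, 0)
        frozen.tick(timedelta(minutes=5))  # advance the frozen clock
        assert datetime.now() == datetime(2023, 12, 20, 12, 5)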


The Ruby equivalent is Timecop

https://github.com/travisjeffery/timecop

Dynamic languages have the advantage of being able to rewrite the standard library classes at runtime.


I used to change an object's class itself at runtime in JS by changing its __proto__ property.

I don’t think that works anymore, but at the time I could create plain value objects and later make them class instantiations (on a small system way back when, this was a nice speed up due to a now uncommon pattern of having data first and waiting for the classes to load).


Other languages can use LD_PRELOAD (or DLL injection on Windows) as long as the binary isn't statically linked.

See other comments in the thread about libfaketime.


I read the documentation for libfaketime and saw that it doesn't work in combination with other libraries that also play with LD_PRELOAD. Timecop could fail if something else also wraps the date and time classes, but its block-scoped form should keep working. I didn't check. Furthermore, I don't know what's out there that does this. I should write my own code.


Freezegun is pretty much unmaintained. You want time-machine https://github.com/adamchainz/time-machine
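Its decorator form looks something like this (a sketch; tick=False keeps the clock from advancing on its own):

    import datetime as dt
    import time_machine

    @time_machine.travel(dt.datetime(2023, 12, 20, 12, 0, tzinfo=dt.timezone.utc), tick=False)
    def test_not_expired():
        # Stdlib time sources now report the travelled-to instant.
        assert dt.datetime.now(dt.timezone.utc).hour == 12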


Time is state. State is passed in at the boundaries. End of story.

Anything which relies on time coming in implicitly as internal state is doomed to have legions of undebuggable problems.


Ok but what I’m really interested in is integration tests that depend on time.

Waiting for a set amount of time (i.e. literally calling wait(5) in the test code) is slow and, even worse than that, flaky. If the test server is busy, it might take longer than 5 seconds to do what the test needs to happen.

The best approach I have found so far is "polling for the side effects" (i.e. if the test is awaiting the creation of a flobber object, keep polling the list of flobber objects until a new one pops up). This is not always possible (the interface for polling the flobber objects is not always available), and it is also inconsistent with multiple threads/processes (one thread might have the new flobber and tell your test that everything is ok, but in the next step of the test you happen to be served by a different thread which doesn't have the new flobber yet).
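A sketch of such a polling helper (hypothetical names; list_flobbers stands in for whatever interface is available):

    import time

    def poll_until(predicate, timeout: float = 10.0, interval: float = 0.1) -> bool:
        # Poll for the side effect instead of sleeping a fixed amount.
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if predicate():
                return True
            time.sleep(interval)
        return False

    # assert poll_until(lambda: new_id in {f.id for f in list_flobbers()})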


A multi-tenant timeline (i.e. a tenant-local clock) with a side channel for the testing harness to modify the time is what I found most practical when doing integration tests where the test itself depends on time. (Edit: when testing Java servers)


My approaches:

1. libfaketime

2. Make the test resilient to changing time

3. Dependency inject at an even higher level


It's a shame that the recommended solution relies on a C++ template trick that doesn't generalize well to other languages.


The issue doesn't generalize across languages because they often have other mechanisms that are simpler. In dynamic languages with modules, like JavaScript, Ruby, or Python, you can usually monkey-patch the system time methods to return what you want. In Java and other JVM languages you'd use dependency injection and then inject a mock in the test. Most other languages use some variant of Approach 1 or Approach 5.
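In Python, for instance, such a monkey-patch can be as small as this (a sketch; seconds_left is a hypothetical function under test):

    import time
    from unittest import mock

    def seconds_left(deadline: float) -> float:
        return deadline - time.time()

    with mock.patch("time.time", return_value=100.0):
        assert seconds_left(160.0) == 60.0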


When possible, pass the time as an argument and create pure functions or methods.

Reserve Time.now or its equivalent for jobs, controllers, and the like at the edge of your application.

Watch out when using NOW() in your db. Not that you shouldn't, but it might be surprising when you do.


>> Watch out when using NOW() in your db.

Would you be so kind as to elaborate on this?

It was always my thought that NOW() was the more sensible approach, as it captures time in execution order from a more centralized source (the DB server), whereas having the origin of the transaction set the time would give me the time when the query was initiated on the origin server.


Not the person to whom you're replying, but I suspect the problem isn't that the database is using NOW(), but that your code being tested will be using an entirely different time.


Bingo.

There is nothing wrong with using NOW() in and of itself, and it may be the best thing to do.

But it may lead to surprising results when, say, your application has mocked time out to a specific value, but you get DB results for the current time. It introduces another clock into your system.

I have found that most of the time we were using NOW(), we could pass the time into the query and create a more understandable chain of code.


You mostly want to be consistent. For example, if you build a job scheduler which uses NOW() to create jobs but a local clock to execute them, then any clock difference between machines will cause hard-to-debug problems.


If you can't test your code because you use `NOW()` or another form of automatic timestamping in your records, that's on your language or framework. You shouldn't forgo such niceties because of limitations imposed by your framework or language; demand better.


We use a "service" class with an internal timer that returns ".now()" and that implements an interface. The .ctor initialize the service's date with Datetime.now().

We use dependency injection wherever the service is required.

With this, we can manage 2 things

1. The service can go faster or slower than the real time by changing the timer's interval so we can simulate time depend behaviour in the app while testing.

2. We can mock the service when testing with a simple counter that's increased each time “.now()" is called.


Well, in the latest .NET 8, for any of the .NET languages, there is:

https://learn.microsoft.com/en-us/dotnet/core/whats-new/dotn...

Though most people were rolling their own version of it before this.


I do not think Approach 2 is sound. It proposes specialization of a template without recompiling the library source, implying that the specialization will be done only in a separate translation unit.

The standard §13.9.4 says:

> If a template [...] is explicitly specialized, a declaration of that specialization shall be reachable from every use of that specialization that would cause an implicit instantiation to take place

What's going to happen in practice is that the compiler will instantiate and inline the unspecialized template in the library, and the program will silently not use the mock time. Or not; it's entirely unconstrained.


The easiest way to do this (in my experience) is libfaketime. I'm surprised not to see it mentioned.

https://github.com/wolfcw/libfaketime


Why the surprise? There is too much to know for anyone to know everything.


I don't think it's that easy for typical unit tests though?

Most time-related tests need to advance time in a controlled fashion. So you'll need to create a shell wrapper that runs faketime with the correct FAKETIME_TIMESTAMP_FILE value, hook it up to your build system, create a C++ wrapper to write to this file... If you control the source code, adding a "time" argument might be easier to implement and easier to understand.


libfaketime's awkward interface (control via timestamp files, needing LD_PRELOAD) is why I wrote an experimental library[0] which takes over the clock at the process level.

My library modifies the code for the time-related vDSO functions, which is incredibly sketchy yet effective. There are also Python bindings at [1].

[0]: https://github.com/DavidVentura/tpom

[1]: https://github.com/DavidVentura/py-tpom


Wouldn't the FAKETIME environment variables be easy for typical unit testing?


In Go, I would normally have my own time function that wraps the std lib time function, which works flawlessly for testing my own code. The problem is foreign code (i.e. a 3rd-party library dependency) that uses time and obviously makes calls to the std library time directly, making it untestable. The only solution is to override time at the language level by some trickery (not all languages allow this) or to literally change the OS's clock and test manually, one test at a time. This is one of those unsolved programming issues.


This made me realise that, despite the many complaints I have about Jest and the Node ecosystem generally, we're actually very lucky that it has so many convenient affordances, like modern timers.


"What do we say to the God of Unit Testing? — Not today()!"

Stub out the calls responsible for reading the system clock, then set it deterministically, and advance it programmatically where a test depends on the passage of time. This works really well, reads really well in the source, and allows you to test general cases and edge cases equally easily.


In the .NET world, the TimeProvider class is in the .NET framework from version 8 onwards. (1)

A "wall clock" or "test clock" is injected using DI as needed (equivalent to option 1 + 4).

Write your own "test clock" or use the supplied fake. (2)

In earlier versions, we rolled our own "clock" interface.

1) https://learn.microsoft.com/en-us/dotnet/api/system.timeprov...

2) https://learn.microsoft.com/en-us/dotnet/api/microsoft.exten...


Jest has pretty robust time mocking, it's a breeze to do time related testing if you're writing JS/TS: https://jestjs.io/docs/jest-object#fake-timers


A big blind spot in Jest (and I assume in Vitest, although I have no experience with it) is timezones. It's very tricky to run the tests in a TZ other than the one Node was started with.


This blog post is specific to C++; in most dynamic languages like Python or JS, the default approach is to redefine the time functions from the standard library.

This corresponds to option 3 in the blog post, except that instead of a clock factory you redefine individual methods.


Don't do something silly like taking a parameter called current_time. Instead, name it for what it's actually used for or doing. This could be send_emails_on or the like. Parameters should tell the reader the what, not the why.


`send_emails_on` suggests strongly some scheduling capability. `current_time` says exactly what it is needed for.


>`send_emails_on` suggests strongly some scheduling capability

Yes.

>`current_time` says exactly what it is needed for.

Not always.


The code example in approach 2 looks ill-formed though. If you change line 12 to auto to let the compiler deduce the return type, you will see the error message "explicit specialization of 'clock_impl' after instantiation".

ref: https://stackoverflow.com/questions/36997351/using-a-templat...

So I'm not very sure how to elegantly inject the mock into the main code from the test section.


A bit disappointed by the article. It's more about how to test any external dependency using different injection techniques, not just time.

What I was hoping for was something like simulations or UIs/games. Complex user interactions like "Sarah clicked here, waited a few seconds, then clicked this, then waited for a network request to finish..."


That sounds like integration tests if the network is involved


Doesn't have to be. Could also mock that part.


While it would not be appropriate for unit testing, for more general time-related testing it annoys me greatly that while Linux has time namespaces, they only affect the monotonic and boottime clocks, and not the realtime clock. I understand that there are good reasons why that is the case, yet it is still something I've considered patching in myself.


Usually, when I come across an article like this, I tend to glance at it and skip it, since I don't know C++. However, kudos to OpenAI; ChatGPT shines in moments like this. The core knowledge of the article would have been bound to C++, but now you can just translate the article's code into whatever you are comfortable with.


>This is better. Now the only code that differs between a unit test build and a production build is what clock_factory::get_clock() returns.

So you are effectively testing code that will never see the light of day in production. That's called a mock, and some people hate mocks.

I have no problem with it, taking for granted that the actual lib has its own tests.


I think mocks are better in C++ than in dynamic languages. In C++ you need to expose a public interface for the customization point so it can be mocked. Mocks therefore test a public interface and not implementation details. Granted, if that "public interface" is not used outside of tests then it's not much different, but I think it adds the correct amount of friction for mocking.

In dynamic languages you can mock anything, even parts that are not designed to be customizable. This indeed has the effect of calcifying implementation details, as mocks don't test a public interface. I think it's harder in this environment to stay disciplined and use mocking in moderation, as the friction to only mock public interfaces is not there.

I definitely saw this in C++ vs. Python mocking; mocking tends to be way overused in Python testing.


> That's called a Mock and some people hate mocks.

Don't worry too much about them.

Mocks are perfectly OK.

It is like putting a mechanical device in a jig for a durability test or the exhaust extractors that mechanics use when they run engines inside the workshop.


Mocks are okay if they are used correctly. Often I see them over-constraining code such that you then cannot change it. If the code will not change, delete the tests, as they will never fail.


> If the code will not change, delete the tests, as they will never fail.

Or, unless they are expensive, just keep them. The code doesn't change so they won't break.

But for the next maintainer they will tell them a lot. One particularly nice thing about some code I maintained recently was that it allowed me to run through different scenarios in my debugger even during my first days in that code base.

Also I don't think I will ever again assume that some code that is in production will never change.


I generally assume that production code will change in the future. Thus, if the tests you write today so constrain your implementation that they have to be deleted instead of guiding those changes to not break other existing functionality, then the tests are not valuable.

I've seen code where every dependency was mocked and all the tests ended up asserting that the implementation calls some function with some arguments. Most of those functions could be called in many different ways to get the same answers, but the tests forced a specific one.


It is easy to do mocks wrong.

Here, however, is an example of how to do it right.

Say you have a system that accepts input, does something with it and stores it to some storage.

You don't want to touch actual storage in the build pipeline.

Then you mock in testing out the storage.

Everything can still be tested (see the sketch after this list):

- the validation of incoming stuff isn't affected at all

- the transformations aren't affected

- the only difference is your tests now read the output back from a memory location instead of from a permanent storage somewhere
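A minimal sketch of that shape (hypothetical names; the validation/transformation is a stand-in):

    class InMemoryStorage:
        # Test double for the real storage backend.
        def __init__(self):
            self.saved = []

        def save(self, record) -> None:
            self.saved.append(record)

    def ingest(payload: dict, storage) -> dict:
        # Stand-ins for the validation and transformation steps above.
        if "name" not in payload:
            raise ValueError("missing name")
        record = {"name": payload["name"].strip().lower()}
        storage.save(record)
        return record

    # In a test:
    #   storage = InMemoryStorage()
    #   ingest({"name": " Alice "}, storage)
    #   assert storage.saved == [{"name": "alice"}]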



Recently had this problem; went with option 5 indeed, passing timestamps.

This also made it trivial to replay logs to the application and get back to the exact same state, which is great for debugging, of course.


This whole "how to" is very simplistic. It shows that it was written by a programmer who never done any serious testing.

Testing code that depends on time is very complicated, and solutions aren't about whether you have factory methods or use C++ templates. That's by and large irrelevant. To the best of my knowledge there isn't any strategy that would handle all aspects of testing time-sensitive code. The goals and means are very different between different test cases.

I wouldn't even dare to try to catalogue all the possible problems with testing time-sensitive programs... this is just too big of a subject to try to cover in few paragraphs.


Time in software is fascinating. If you say "time", everyone thinks they understand it perfectly, as we use it every day. But it has so many pitfalls in the details.


I take an optional parameter, time, that when unset defaults to time.now.

Makes testing trivial and the production code trivial.
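For example (a sketch; make_receipt is hypothetical):

    from datetime import datetime, timezone
    from typing import Optional

    def make_receipt(amount: float, now: Optional[datetime] = None) -> dict:
        # Production callers omit `now`; tests pass a fixed value.
        if now is None:
            now = datetime.now(timezone.utc)
        return {"amount": amount, "issued_at": now.isoformat()}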


Some useful tips I can use, thanks



