Tests that don't test anything fall into at least two categories for me:
- tests that were useless, are still useless, and will always be useless
- tests that are currently useless but were used in the "wtf should I write" phase of coding (templating/TDD/whatever you want to call it).
I'm partial to the second kind, and I like when they're not removed, because you often understand how the API/algorithm was coded thanks to them (and it's often the unit tests). But ideally, both should be out of a codebase.
Used to be (perhaps still is) a nasty habit of Rails apps to have vast test suites covering every Active Record query they ever used (with fixed seeds to boot), rarely straying from giving the bog-standard and already very thoroughly tested and battle-scarred AR predicate builder a wholly unneeded workout; but none of their own front-end code because writing for Selenium was too hard.
But look! Thousands of tests and they all pass! Taste the quality!
> but none of their own front-end code because writing for Selenium was too hard.
I've also seen plenty of tests that check whether a template was rendered rather than whether the thing it actually outputs ended up in the output. That just calcifies the implementation, making it hard to change.
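Roughly the difference, as a minimal sketch (pytest-style; render_invoice and the template name are made up for illustration):

```python
# Hypothetical example: a tiny "view" plus two tests. The first pins an
# implementation detail (which template rendered); the second checks the
# output a user actually sees.

def render_invoice(total: float) -> dict:
    # Stand-in for a framework view that reports the template it used
    # along with the rendered body.
    return {
        "template": "invoices/detail.html",
        "body": f"<p>Total: ${total:.2f}</p>",
    }

def test_renders_detail_template():
    # Brittle: breaks on any template rename, even if the page looks identical.
    assert render_invoice(199)["template"] == "invoices/detail.html"

def test_shows_invoice_total():
    # Behavioral: survives refactors as long as the user-visible output holds.
    assert "Total: $199.00" in render_invoice(199)["body"]
```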
But it is a tradeoff, and a hard one at that, because if you do all the things all the time, combining every variation of the database with every variation of the views, you end up with a test suite that takes forever to run. Finding the right tradeoff there has not proven obvious, sadly.
One thing I do sometimes is to start part of an API in a TDD style. Everything starts very "basic", which adds a lot of relatively trivial test cases.
When done with that phase and my API looks relatively functional, I remove all relatively trivial tests and I write bigger ones, often randomized and property-based.
This works decently well, and you do not have an army of useless tests hanging around after the process is done.
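For the second phase, a hedged sketch of what I mean by property-based (using the Hypothesis library; run_length_encode/run_length_decode are just a stand-in API):

```python
from hypothesis import given, strategies as st

def run_length_encode(s: str) -> list[tuple[str, int]]:
    out: list[tuple[str, int]] = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out

def run_length_decode(pairs: list[tuple[str, int]]) -> str:
    return "".join(ch * n for ch, n in pairs)

@given(st.text())
def test_encode_decode_roundtrip(s: str):
    # One round-trip property replaces the pile of hand-written "basic"
    # cases from the early TDD phase.
    assert run_length_decode(run_length_encode(s)) == s
```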
Been there. Change one tiny thing, and 20 tests fail all over the place. But hey, at least we had ~95% test coverage! /s
The longer some piece of code has survived in production, the more "trusted" it becomes, approaching but never reaching 100% "trust" (I can't think of a more precise word at the moment).
For tests it's similar; the longer they have remained unchanged while also proving useful (e.g. catching stuff before a merge), the more trusted they become.
So when any code changes, its "trust level" resets to zero at that point, whether
it's runtime code or test code. The only exception might be if the test code reads from a list of inputs and expected outputs, and the only change is adding a new input/output to that list, without modifying the test code itself.
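A sketch of that exception, as a table-driven (pytest-parametrized) test; slugify and the cases are hypothetical:

```python
import re
import pytest

def slugify(title: str) -> str:
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

# New rows extend coverage without touching the test code below,
# so arguably the test's accumulated "trust" doesn't reset.
CASES = [
    ("Hello World", "hello-world"),
    ("  Already--slugged ", "already-slugged"),
]

@pytest.mark.parametrize("title,expected", CASES)
def test_slugify(title, expected):
    assert slugify(title) == expected
```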
Tests that change too frequently can't be trusted, and chances are those tests are at the wrong level of abstraction.
Just been bitten by a bug in a production system... it had been hiding silently in the code for more than 10 years!
It just means that for 10 years this code path was never taken (the conditions for this specific error case were not met for 10 years) :-(
Actually, it would be useful monitoring information to know which paths are "hot" (almost always taken since the beginning), "warm" (taken from time to time), or "cold" (never executed). It could help build targeted trust. I guess it might be possible for VM languages (e.g. JVM-based ones) because the VM could monitor this... but it might be harder for native machine code.
This could be interesting. Unfortunately it'd be a performance hog to do. Some kinds of things do work this way (see profile-guided optimisation in compilers).
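A crude sketch of the idea in application code (all names invented); a real setup would sample or flush counters asynchronously to keep the overhead down:

```python
from collections import Counter

path_hits: Counter = Counter()

def mark(path_id: str) -> None:
    # Naive counter; in production you'd batch or sample this.
    path_hits[path_id] += 1

def process_order(order: dict) -> str:
    if order.get("amount", 0) <= 0:
        mark("order.invalid_amount")   # hopefully "cold"
        return "rejected"
    if order.get("express"):
        mark("order.express")          # "warm"
        return "express"
    mark("order.standard")             # "hot"
    return "standard"

def classify(hits: int, total: int) -> str:
    if hits == 0:
        return "cold"
    return "hot" if hits / total > 0.5 else "warm"

if __name__ == "__main__":
    for o in [{"amount": 10}] * 80 + [{"amount": 5, "express": True}] * 20:
        process_order(o)
    total = sum(path_hits.values())
    for path in ("order.standard", "order.express", "order.invalid_amount"):
        print(path, classify(path_hits[path], total))
```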
If only :)
Far too often I find myself working with tests that patch one too many implementation details, putting me in a refactoring pickle.
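The shape of it, in a hedged sketch (names are hypothetical): the first test patches an internal helper and breaks on any refactor; the second only fakes the external boundary.

```python
from unittest import mock

def fetch_rate(currency: str) -> float:
    # Imagine this hits an external HTTP API in real code.
    raise RuntimeError("no network in tests")

def convert(amount: float, currency: str) -> float:
    return round(amount * fetch_rate(currency), 2)

def checkout(amount: float, currency: str) -> str:
    return f"charged {convert(amount, currency)} {currency}"

# Over-patched: pins `convert`, an internal detail; renaming or inlining it
# fails this test even though behavior is unchanged.
@mock.patch(f"{__name__}.convert", return_value=9.2)
def test_checkout_overpatched(_convert):
    assert checkout(10, "EUR") == "charged 9.2 EUR"

# Patched at the boundary: only the genuinely external dependency is faked,
# so the internals stay free to be refactored.
@mock.patch(f"{__name__}.fetch_rate", return_value=0.92)
def test_checkout_boundary(_rate):
    assert checkout(10, "EUR") == "charged 9.2 EUR"
```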