
Using fdr's example of a cat program, it consists of an open, a sequence of read followed by write and a close. I would like to point the language/environment at that and say "test it" with no further effort.
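A minimal sketch of such a cat, using Python's os-level wrappers (`os.open`/`os.read`/`os.write`/`os.close`) to stand in for the raw syscalls; the function name and the 4096-byte buffer size are just illustrative choices:

```python
import os

def cat(path, out_fd=1):
    """Copy a file to out_fd using only open/read/write/close."""
    fd = os.open(path, os.O_RDONLY)    # may raise OSError (e.g. ENOENT)
    try:
        while True:
            chunk = os.read(fd, 4096)  # may raise OSError (e.g. EIO)
            if not chunk:              # empty read means EOF
                break
            os.write(out_fd, chunk)    # may raise OSError (e.g. ENOSPC)
    finally:
        os.close(fd)                   # may raise OSError too
```

Every one of those four calls can fail, which is exactly what makes "test it" interesting.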

OK, so I suppose the next question becomes what “test it” means.

Perhaps you want your test system to make some kinds of automatic deductions about the intended behaviour of your code and then to verify that the actual behaviour matches? If that is the case, what kinds of deductions would you want to be handled automatically for you?

Or maybe you’re thinking of some sort of automatic simulation of possible failure cases when you get to the open/read/write/close operations, serving a similar role to things like mocks and stubs in unit testing?

Some of these are the kinds of problems I’m hoping a good effect system will help with. If you know that the operations of opening, reading, writing and closing files must occur only in certain orders, you can automatically detect violations of those rules. And if you know all possible types of cause and effect that can be relevant within each part of your code — that is, you can identify all stateful/impure/externally observable results — then you can identify any of them that aren’t handled according to some set of rules. As long as you can figure out what you want those rules to be, that is...
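As a concrete illustration of those ordering rules: one could imagine the effect system being approximated at runtime by a small state machine per file descriptor. This is a hypothetical sketch, not any existing tool's API; the class and method names are invented:

```python
class FileProtocolChecker:
    """Tiny runtime checker for the open -> (read|write)* -> close
    protocol; a hypothetical stand-in for what a static effect
    system would verify at compile time."""

    def __init__(self):
        self.open_fds = set()

    def on_open(self, fd):
        self.open_fds.add(fd)

    def on_read(self, fd):
        if fd not in self.open_fds:
            raise RuntimeError(f"read on fd {fd} before open or after close")

    def on_write(self, fd):
        if fd not in self.open_fds:
            raise RuntimeError(f"write on fd {fd} before open or after close")

    def on_close(self, fd):
        if fd not in self.open_fds:
            raise RuntimeError(f"double close of fd {fd}")
        self.open_fds.remove(fd)

    def on_exit(self):
        if self.open_fds:
            raise RuntimeError(f"leaked fds: {sorted(self.open_fds)}")
```

The same rules expressed statically would let violations be flagged without ever running the program.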



> OK, so I suppose the next question becomes what “test it” means.

It would be something similar to all possible code paths being executed with the program in all possible states. Heck, right now I'd settle for even a small subset of this.

> Or maybe you’re thinking of some sort of automatic simulation of possible failure cases when you get to the open/read/write/close operations, serving a similar role to things like mocks and stubs in unit testing?

That is closest to it. It is already known that open/read/write/close can fail. They are quite difficult to force to fail, so currently you have to write a lot more code to make that happen. And it gets really tedious once you look at all the combinations, e.g. open has to fail; open has to succeed and then read fails; open succeeds and the 3rd read fails; and on and on. This is for a single ~5 line function! And then if my cat() is made available as a library, how do its callers test for it failing?
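That enumeration of combinations can itself be mechanised. A sketch (assuming Python, with hypothetical helper names) that patches one os-level call at a time so its Nth invocation raises, then runs the code under test for every (call, N) pair:

```python
import os

def failing(real, fail_on):
    """Wrap a syscall so its fail_on-th invocation raises OSError."""
    state = {"calls": 0}
    def wrapper(*args):
        state["calls"] += 1
        if state["calls"] == fail_on:
            raise OSError("injected failure")
        return real(*args)
    return wrapper

def enumerate_failures(run, syscalls, max_calls=4):
    """For each named os call and each call count up to max_calls,
    inject one failure and record whether run() hit it."""
    results = []
    for name in syscalls:
        real = getattr(os, name)
        for nth in range(1, max_calls + 1):
            setattr(os, name, failing(real, nth))
            try:
                run()
                outcome = "no failure triggered"
            except OSError:
                outcome = "failed"
            finally:
                setattr(os, name, real)  # always restore the real call
            results.append((name, nth, outcome))
    return results
```

One loop replaces the hand-written "open fails", "first read fails", "third read fails", ... cases.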

I'm quite happy using supervisory programs to detect rule violations. For example valgrind does an excellent job for memory allocation and usage, and I've never used helgrind but assume it works well too.


> It is already known that open/read/write/close can fail. They are quite difficult to force to fail, so currently you have to write a lot more code to make that happen. And it gets really tedious once you look at all the combinations. eg open has to fail, open has to succeed and then read fails, open succeeds and the 3rd read fails and on and on.

OK, I’m following you so far.

Assuming that

(a) our language lets us specify the possible failure modes for each function, and

(b) we have a test tool that can systematically simulate various possible combinations of success and failure,

what would you want to do in each test case?

Put another way, when we run our test tool and our magic replacement I/O functions simulate, say, success on the open and the first two reads but then failure on the third read, what happens next? What’s the result we’re looking for to determine whether the test passes or fails?
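One plausible answer, as a sketch: the test passes if the injected error propagates to the caller and the descriptor is still released. The helper names here are hypothetical:

```python
import os

def cat_bytes(path, read=os.read):
    """Return the file's contents; pass a fake 'read' to inject
    failures. Contract under test: OSError must propagate, and the
    fd must be closed on every path."""
    data = b""
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = read(fd, 8)
            if not chunk:
                break
            data += chunk
    finally:
        os.close(fd)
    return data

def fail_on_nth_read(n):
    """Simulated read that raises OSError on its n-th call."""
    state = {"calls": 0}
    def read(fd, size):
        state["calls"] += 1
        if state["calls"] == n:
            raise OSError(5, "injected EIO")
        return os.read(fd, size)
    return read
```

The pass/fail criterion is then mechanical: the success run returns the right bytes, and every injected-failure run raises rather than silently returning truncated data.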


At the moment the error-handling code for cat would be looking at failures from all 4 functions. Unless they fail, that error code doesn't even get run.

My test code currently has to put the program into a state where cat can be called, then has to mock/augment each failure point, and then check the results. Just having the middle piece (mock/augment) done automagically would be a massive help. And then of course cat itself can fail, so each of its callers needs a way of testing that too. Some hand-wavy combination of convention over configuration, documentation and annotations would likely help.

Even the state issue should be automatable in many circumstances. For example, you could make a successful run of the program while the magic records state at the entrance to code blocks. It can then cause each failure circumstance, rewind state to the known-good point, cause the next failure circumstance, and so on.

I'd even be happy running my code (with no error handling) under some tool that induces the errors. When they aren't handled, it asks me what I want to do, which typically involves writing code to handle the issue; the tool then integrates that directly and keeps running until the next error.

Even test pass/fail can be somewhat automated. The tool records what happens, and in future runs alerts you when there is a difference in behaviour. The response is either that the issue needs to be fixed, or that the new behaviour is correct.
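That record-then-diff idea is essentially golden/snapshot testing. A minimal sketch, where the store layout and function name are assumptions:

```python
import json
import os

def check_behaviour(name, observed, store=".golden"):
    """Record observed behaviour on the first run; on later runs,
    report any difference so a human can decide fix-or-accept."""
    os.makedirs(store, exist_ok=True)
    path = os.path.join(store, name + ".json")
    if not os.path.exists(path):
        with open(path, "w") as f:
            json.dump(observed, f)     # first run: record as the golden copy
        return "recorded"
    with open(path) as f:
        expected = json.load(f)
    return "same" if expected == observed else "changed"
```

A "changed" result is the alert; accepting the new behaviour is just deleting or rewriting the stored file.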



