In my opinion formal methods are what's missing from software engineering to make it a serious engineering discipline. In all other engineering fields you have to produce calculations and "proofs" that your design ought to work. In software engineering everything is basically overcomplicated handwaving.
Software gets used for mission-critical things all the time, with the same level of assurance as any other engineering field. What you are complaining about is that people making apps to show you the right advertisement, or websites for looking at pictures of cats, sometimes ship buggy code without proper tests.
Meanwhile almost every medical device under the sun today runs software, nuclear power plants run on software, space missions use software, airplanes fly themselves and land through software, etc.
It may seem that showing the right advertisements or pictures of cats isn't mission-critical, or is even just a toy. However, when massive amounts of personal information are being collected and shared by such applications, I would say they're pretty important.
Notoriously, the 737 Max bug that killed a bunch of people was a software patch over a serious mechanical design flaw, and Boeing decided it was easier to make the software engineers fix it than redesign the entire plane.
My opinion is that too many people think software engineering is not a serious engineering discipline and romanticize other engineering too much.
For software that really matters you have standards like MISRA C. You also have professional bodies like the IEEE and ANSI, and ISO standards for software development and infrastructure; there are a lot of standards and regulations you can follow, just like in every other field. There is NIST for security, and I could go on and on. So it is not "overcomplicated hand-waving".
But no one is going to deal with "implementing a formal approach" for "yet another CRUD app"; they will hire a whizz kid who knows the latest JS framework and be done with it, paying him a tenth of the cost of someone who knows more than that.
In theory, your software itself provides the proof that it does what it claims. The proof would directly link to unit tests or integration tests at the meta level.
Arguably, software engineering does this much more exhaustively than any other engineering field... to very varying quality of output, from masterpieces like SQLite to your average B2B artifact that barely works despite having 99% coverage.
Tests are a basis for every engineering discipline. In software we have the luxury of being able to test each delivered project. Other engineered projects often only have their components tested, and then need to rely on calculations to determine the end result.
Tests prove that something can handle the exact scenario being tested. The forces on a software system are just different, and in some ways less predictable, than those on e.g. a building. But in software the consequences of failure are often less catastrophic[1], and it's a lot cheaper to run a battery of tests.
[1] And if they are not less catastrophic than a collapsing building, you'd better test it far more extensively than an average system. That's engineering too, making a trade-off between the costs of failure and the cost of preventing failure.
Proof by exhaustion, also known as proof by cases, proof by case analysis, complete induction or the brute force method, is a method of mathematical proof in which the statement to be proved is split into a finite number of cases or sets of equivalent cases, and where each type of case is checked to see if the proposition in question holds.
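For small input domains, proof by exhaustion is genuinely practical in software. A sketch in Python (my own hypothetical example, not from the thread): exhaustively verifying the classic overflow-free midpoint bit trick against a naive reference over every pair of 8-bit inputs.

```python
def midpoint(a: int, b: int) -> int:
    # Overflow-free average: (a & b) + ((a ^ b) >> 1) never computes a + b
    # directly, so it stays in range even where a + b would overflow a word.
    return (a & b) + ((a ^ b) >> 1)

def exhaustive_check() -> bool:
    # Brute-force "proof" over the entire 8-bit x 8-bit input space
    # (65,536 cases): compare against the obvious reference formula.
    return all(midpoint(a, b) == (a + b) // 2
               for a in range(256) for b in range(256))

print(exhaustive_check())  # True: the property holds for every possible input
```

Once the domain grows past what you can enumerate (64-bit pairs, say), exhaustion stops scaling and you are back to sampling or proving.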
It is only fitting if you think of providing the proof as a push button technology. That makes as much sense as thinking of providing the software itself as a push button technology.
It has nothing to do with the proof as a push button technology.
Adding annotations to the code for example is not a workaround for the halting problem.
(Working with a subset that's non-Turing-complete is, but that doesn't cover the general case the parent calls for, and it isn't practical with current software development languages and practices.)
Not sure if we are talking about the same thing. If you know why your software is correct, you can write that down as a proof (if you have a language which allows you to do that). If you don't know why your software is correct, well, maybe you should not ship it, because maybe it isn't correct.
So yes, it is all about push button technology here. If you are willing to put in the work to write down the proof yourself, instead of expecting a magical entity to generate it for you, then it is possible. All the halting problem says is that there is no such magical entity. You will need to know why your software is correct.
Exactly. In practice you always want each loop iteration to perform some action that brings you closer to some desired end state. That loop invariant + some notion of getting closer to the end is usually the proof you need to prove termination. A loop where you don't have these things is almost certainly a bug.
The vast majority of loops iterate over some finite collection of elements and are thus guaranteed to terminate. Tricky cases are stuff like "I will repeat this operation until the output stops changing".
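The invariant-plus-decreasing-measure idea can even be checked at runtime. A minimal sketch (names and code are mine, assumed for illustration): a binary search whose loop variant, the width of the search interval, is asserted to strictly shrink every iteration, which is exactly the termination argument.

```python
def binary_search(xs: list, target) -> int:
    """Return an index of target in sorted xs, or -1 if absent.

    Termination proof sketch: the variant hi - lo is non-negative and
    strictly decreases on every iteration, so the loop must halt.
    """
    lo, hi = 0, len(xs)
    while lo < hi:
        variant = hi - lo            # measure the interval before the step
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        elif xs[mid] < target:
            lo = mid + 1             # discard the lower half
        else:
            hi = mid                 # discard the upper half
        assert hi - lo < variant     # the measure strictly decreased
    return -1
```

Languages like Dafny make this pattern first-class (`decreases` clauses checked at compile time); the runtime `assert` here is the poor man's version of the same proof obligation.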
>Exactly. In practice you always want each loop iteration to perform some action that brings you closer to some desired end state. That loop invariant + some notion of getting closer to the end is usually the proof you need to prove termination.
If you're making Space Shuttle software from scratch yes. In practice for 99% of current mainstream development you depend on piles upon piles of dependencies, which might or might not check those loop invariants correctly, or might have a (trivial to add) endless blocking timeout, a deadlock, and many other wonderful things.
Usually they're not even set in stone, you'll update to the next version, and if they introduced such an issue, there's lots of fun to be had.
Nobody doubted you can have formal proofs for algorithms, or checked software in certain conditions and for certain controlled environments (e.g. using MISRA C and such) (in fact, I support this in a comment elsewhere in this thread).
The objection is to the idea that this is applicable (or trivially applicable, and we just aren't doing it out of laziness or something) to regular software development.
The halting problem is not an actual issue. It's a darling of computer science classes, but if you need to prove that your program halts, it's not that hard to write it in a language subset that is not Turing-complete.
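One everyday version of that restriction is the "fuel" pattern: give every loop an explicit iteration bound, so termination is structurally guaranteed. A hedged sketch (the helper and its name are my invention, not a standard API):

```python
def iterate_until_fixed(f, x, fuel: int = 1000):
    # Bounded iteration: a for-loop over a finite range trivially halts,
    # sidestepping the halting problem for this code fragment. The tricky
    # "repeat until the output stops changing" case becomes safe.
    for _ in range(fuel):
        nxt = f(x)
        if nxt == x:
            return x                 # reached a fixed point
        x = nxt
    raise RuntimeError("fuel exhausted before reaching a fixed point")

# Example: Newton's iteration for the integer square root of 100
# converges to the fixed point 10 well within the fuel budget.
print(iterate_until_fixed(lambda x: (x + 100 // x) // 2, 100))  # 10
```

The cost is that you must pick a bound and decide what "fuel exhausted" means for your application, which is itself an engineering judgment.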
The calculations you do to make sure that the bridge doesn’t collapse also require that the steel matched whatever properties your formula assumes. That doesn’t make the calculations useless.
Continuing the stream of inputs holds no meaning once you've tested every possible permutation (branch coverage), beyond checking for memory leaks. Every piece of software has a finite number of states it can hold; most developers just erroneously expect some states to be unlikely or impossible to reach.
> If you manage to solve the halting problem first.
There is a large class of algorithms for which it is possible to prove that they are not affected by the halting problem. I think having tooling for that class alone is already a big win.
I don't agree that that's exhaustive testing. Full branch test coverage etc., is normal and sensible and it's great, but that's very different from having a proof that given certain preconditions, you will have certain postconditions.
100% branch coverage means that you've fundamentally covered all potential states your software can enter.
What you're asking for is literally only possible in theory and entirely pointless in the context of reality, because you cannot guarantee that a device is available at runtime, invalidating any proof. The same applies to any engineering project, software or otherwise.
if (x > 0) {
/* code */
} else {
/* more code */
}
Branch coverage just means that all the lines above are executed in some test. That's fundamentally different from dealing with all the different values that `x` could take on at run time.
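To make that concrete, a minimal sketch (hypothetical code, not from the thread): two tests cover both branches of the function below, yet a crashing input still slips through, because branch coverage says nothing about the values within each branch.

```python
def describe(x: int) -> str:
    if x > 0:
        return f"positive, reciprocal {1 / x:.3f}"
    else:
        # Hidden bug: x == -5 makes the divisor zero.
        return f"non-positive, scaled {10 // (x + 5)}"

# These two tests achieve 100% branch coverage...
assert describe(2).startswith("positive")
assert describe(-1).startswith("non-positive")

# ...but they never visit x == -5, where describe raises ZeroDivisionError.
```

Tools like coverage.py will happily report 100% branch coverage for this test suite; only value-aware techniques (property-based testing, symbolic execution, or an actual proof) catch the -5 case.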
Given your example, no software can ever handle this comparison over all numbers, as all CPUs have a maximum value they can represent and compare.
Just like you cannot prove that a bridge will be able to handle a given load: engineering fundamentally can only make assurances within expectations. Your tests define those expectations, and by achieving 100% branch coverage you've exhaustively explored all valid states your software can have.
Do note I'm talking about actual 100% branch coverage here, not just 100% coverage of your own code while ignoring the libraries and runtimes you've imported. That's way harder than you seem to think, and it's the reason why SQLite (the linked page) has so much more testing code than application code.
I agree with the other commenter that this is not even remotely true... given something like
loop {
if (A) { ... }
if (B) { ... }
}
100% branch coverage means you've encountered both branches at some point, but the loop could have reached them in any number of ways: a given iteration could see A && B, A && !B, !A && B, or !A && !B, and whether the branches were taken in a particular order across iterations could matter too. Covering each branch once doesn't exhaust those combinations.
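A sketch of the gap (hypothetical example): two tests below hit every branch outcome (A taken and not taken, B taken and not taken), yet half of the per-iteration path combinations are never exercised, and the interaction between the branches is exactly where the behavior lives.

```python
def step(state: int, a: bool, b: bool) -> int:
    # One loop-body iteration with two independent branches.
    if a:
        state = state * 2      # branch A taken
    if b:
        state = state + 1      # branch B taken
    return state

# Two tests hit every branch outcome: (A, B) and (not A, not B)...
assert step(3, True, True) == 7
assert step(3, False, False) == 3

# ...but the paths (A, not B) and (not A, B) were never exercised,
# even though order of effects matters (doubling then incrementing
# is not incrementing then doubling).
assert step(3, True, False) == 6   # a path the two tests above never saw
```

With n independent branches per iteration there are 2^n paths per iteration and exponentially many interleavings across iterations; branch coverage grows only linearly in n.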
I guess we're desperate to answer the question of how dumb comment chains on this forum can get, and whether everyone involved is mentally (or actually) in the /r/im-12-and-this-is-smart phase.
> In software engineering everything is basically overcomplicated handwaving.
No, it is worse than that. There are people who claim the proof is the code itself. Just look at it. These are the same kind of people claiming good code needs no documentation.
That is like a bridge engineer saying they need no structural calculations because the stability of the bridge is obvious to a certain kind of bridge nerd just by looking at it. Civil engineers figured out, after a few high-profile bridge collapses, that "trust me bro" isn't good enough; software engineering is still in the "trust him, he is a genius" phase of the field.
> There are people who claim the proof is the code itself.
Yet this is what emerges when you write a sufficiently detailed proof.
The more detail you add to your TLA+ model, the more it looks like just another implementation (albeit untyped, so it's pretty easy to slip errors into).
In the end, if you want the map to represent the landscape 100% you end up with another landscape.
Granted. But engineering is about making sure we don't make mistakes and just like in math class a good way to assure that is to take two different paths and arrive at the same place.
We should have a TLA+^2 then.
Oh the complexity has to be expressed anyway? Uh
Oh and users mostly rather suffer from bugs than give up complex desires and features? Uh
We’re doomed haha
Isn't that why we have modularization? In the end you can formally verify low-level stuff on a per-module basis. Knowing that your PNG decoder is formally verified, and therefore that a user uploading a crafted PNG is far less likely to get remote code execution within that process context, isn't worth nothing. And a decoder that would have allowed attackers to get RCE has probably also produced a non-zero number of mysterious crashes on weird PNGs.
In the end, all security measures are a layered effort, and doing a bit of formal verification, a bit of unit testing, a bit of fuzzing, etc. is always better than mentally aiming for 100% and ending up doing nothing at all.
The question regarding formal verification is: What is the code that is running in the critical domain? Someone who writes safe code will try to minimize the amount of code/executables/complexity in that space. And that domain isn't necessarily a single process, it can be a part in the code, or a process until it dropped privileges etc.
If all your code is in the critical domain, it either means you failed to sanitize inputs, everything runs with yolo/root privileges, all of your customers share a single table in your database, your hardcoded admin password is 1234; or maybe you write code that controls nuclear power plants, airplanes, or similar things. But if it is the former, no amount of formal verification will save you anyway, because you wrote software like a monkey writes poems: by accident.
If the idea of a critical domain is new to anyone reading this comment, if you have never thought about user privileges when writing an application, then please read this as proof of the point: the field of software engineering still falls short in all sorts of places, and it is in all our interest, both as professionals and as users/potential victims, to improve the situation.
I was thinking about this, and to be honest, that's what I like about software. It needs to be a logical program, but there's also an artistic element to it. "Why did you name this variable like that? Because I was feeling like it"