Formal methods only solve half my problems

rurban · on June 4, 2022

I'm using better model checker tools, and they also only solve half of my problems. but all the tools together solve everything.

https://blog.nubix.de/index.php/2022/04/06/testing-baremetal...

deterministic · on June 4, 2022

Wow. Any solution that solves 50% of problems is pretty damn good!

SEJeff · on June 4, 2022

The architect of AWS and author of Dynamodb has very good thoughts on formal methods. They’re worth a read:

https://perspectives.mvdirona.com/2014/07/challenges-in-desi...

colmmacc · on June 4, 2022

Oh man, I really can't resist adding to the record a bit here. The author of the article - Marc Brooker - has worked (and led!) on formal methods at AWS and DynamoDB much more than James. The paper James linked to in his post was authored by ... Marc Brooker! so this comment just reads funny to me. Even though I know it wasn't intended that way.

James isn't the architect of AWS either, the scale we're at and the way we organize ourselves mean we don't really have any single architects of anything. Also EC2, S3, SQS, and more were in place by the time James joined. But I'm not trying to talk down James, because James has been much more influential and important to AWS success than an architect! His technical and business insights, his vision and drive, and his focus on some really very key levers have been monumental drivers of AWS. I know I've learned a lot from him, and still do. He's so good at it that he's now on the Amazon leadership team, reporting directly to our CEO!

SEJeff · on June 4, 2022

Ah well you would know much better than me! I heard someone mention James as the architect of aws and recall him mentioning designing dynamodb. Did he do that, or am I universally wrong sans his love for formal methods?

Thanks for the clarification though. This is truly why I enjoy HN so much. The ability to get corrected by the experts is an honor :)

csours · on June 4, 2022

The longer I spend in IT/Software Dev, the more I see a dichotomy between Correctness and Usefulness. You can certainly have both, but different people concentrate on either correctness or usefulness. I think the future of software dev is people who understand both.

justahuman74 · on June 4, 2022

The tools make it very hard to do both at the same time. To pick a topic, is there something I could write shell-style scripts in that leans towards correctness?

pharmakom · on June 4, 2022

Yes! F# scripts

Some features:

- open-source

- cross-platform

- quick editing with syntax highlighting, type annotations etc. in VS Code

- run in a single command (‘dotnet fsi script.fsx’)

- type-safe but with global type inference

- grab dependencies using directives in the script (no project manifests or global dependencies)

The added safety of a typed language is fantastic for scripting because it’s not worth writing tests for most scripts.

simonmic · on June 5, 2022

Haskell (using https://docs.haskellstack.org/en/stable/GUIDE/#script-interp...).

The downside: requires 1. installing stack and 2. waiting for it to install ghc and libs on first run of the script.

csours · on June 4, 2022

Correctness needs a specification. Correctness hates side effects. Shell type scripts are all about side effects.

westurner · on June 4, 2022

From "Discover and Prevent Linux Kernel Zero-Day Exploit Using Formal Verification" https://news.ycombinator.com/item?id=30194993 :

> Can there still be side channel attacks in formally verified systems? Can e.g. TLA+ help with that at all?

Side-channel attack https://en.wikipedia.org/wiki/Side-channel_attack

_vdpp · on June 4, 2022

Yes, there can still be side-channel and out-of-band attacks in FV systems. For example, pouring water into the server rack. That’s a cheeky example, but things like timing side-channels via speculative execution and cache hits wouldn’t be addressed by FV at the software level. I think there is some ongoing research for verifying software/hardware interactions but I’m not aware of a product in widespread use that checks this.

The use of constant-time algorithms for crypto is a way to mitigate things like timing side-channels, and there are other techniques like disabling hyper-threading that can help.

The depressing fact is that side channels exist any time there’s a shared computing resource. The question is how noisy the channel is, the data rate, etc.

westurner · on June 4, 2022

It might be argued that side channels exist whenever there is a shared channel; which is always, because plenoptic functions, wave function, air gap, ram bus mhz, nonlocal entanglement

Category:Side-channel attacks https://en.wikipedia.org/wiki/Category:Side-channel_attacks

xupybd · on June 4, 2022

My work is often little internal tools and a few b2b websites. I hate down time due to bugs. I still get nervous when I install to production. Always taking back up after back up and talking myself through the process of rebuilding from scratch. Just in case this time I have to.

Would formal methods help with code at this level?

I suspect the cost benefit isn't there. Though I wish it was. I've seen the difference between having tests and not having them. Development time decreases, and my stress levels decrease. Is this the same with formal methods?

subjectsigma · on June 4, 2022

I work in government contracting with very expensive and heavy pieces of metal that kill people if something goes wrong, and even I can't convince people they need to use formal methods to help write their code. So the easy answer is 'no'.

That being said, FM is an enormous spectrum; if you squint a little, writing really good user stories could be considered a 'formal specification'. There's a really good paper from Amazon about model checking (Newcombe 2014 IIRC) where they make the bold statement that "Doing the math didn't help us find that many mistakes, nor did using the model checker; the thing that was most helpful was sitting down and forcing ourselves to come up with really good definitions. Like what does X really mean?" So it doesn't have to be as complicated as you think.

So if you want to start as simple as possible, I would recommend something like this: https://www.hillelwayne.com/post/decision-table-patterns/

aeneasmackenzie · on June 4, 2022

Leslie Lamport had an essay about this. Without a spec, even if it's just "Serves the user's homepage", your program is not even wrong.

xupybd · on June 4, 2022

Thanks! I'll have a read. Your comment helps clarify the situation.

rramadass · on June 4, 2022

You may find https://markusvoelter.medium.com/the-evolution-of-decision-t... helpful. He also has a lot on DSLs which is worth reading.

There is also a more powerful version of Decision Tables called Tabular Expressions (from D.L.Parnas) which you can use.

A nice succinct reference to Formal Methods is Concise Guide to Formal Methods by Gerard O'Regan.

I think simplified versions of Formal Methods can be used profitably for your use-cases. Start with Assertions, Design-by-Contract and Decision Tables.

AkshatM · on June 4, 2022

Formal methods will likely not help you - they can only show a _specification_ is sound, but not the _implementation_.

That is, formal methods aren't testing the code you write, just the design concept behind it. For the design concepts you are likely to encounter in your situations, you absolutely do not need a formal proof that your stuff works - unless you are inventing (e.g.) a new cryptography suite from scratch or a large distributed system that must handle a massive amount of concurrency, there is zero advantage you would get over following best practices.

InefficientRed · on June 4, 2022

> Formal methods will likely not help you - they can only show a _specification_ is sound, but not the _implementation_.

I'm not sure where this idea came from. Are you in hardware? I guess in hardware that makes sense.

Anyways, in software, formal methods can absolutely show that an implementation meets a specification. In fact, that has always been one of their canonical use cases in software systems, at least since the 70s or maybe early 80s.

Formal methods can also be used with very ad hoc specifications, or with very general specifications, or increasingly without any explicitly specification at all (eg by inferring specifications).

> cryptography suite from scratch or a large distributed system...

Or aerospace software, or automotive software, or certain financial software, or industrial control systems, or any complex billing/access control, or...

> there is zero advantage you would get over following best practices.

I wrote a bunch of static analyses that checked for common configuration errors in Django apps which saved us many weekends. YMMV.

AkshatM · on June 4, 2022

I'm in software.

> Anyways, in software, formal methods can absolutely show that an implementation meets a specification. In fact, that has always been one of their canonical use cases in software systems, at least since the 70s or maybe early 80s.

To be clear, I'm talking about systems that go beyond simple property validation. Presumably, we're both talking about e.g. Ada SPARK, Alloy, TLA+, etc. I would hesitate to call a type system a formal method, for example, unless they use some notion of tactics like Idris, Coq and Lean do.

Now there certainly exist formal methods that can show that a program meets a given specification and conforms to certain desirable properties. However, they cannot show that the program is bug-free, because many properties of a formally-verified system may be relaxed in real environments. When I say "sound", I mean the property of being bug-free.

In OP's case, it would not help the OP reduce the frequency of bugs pushing to production - it would only give them confidence that they meet the spec, which is usually not what people writing internal tools are worried about. For industries where being safety-critical is not necessary, there is simply no meaningful advantage provided by formal methods over regression tests, clean refactoring and validators. While you can certainly use them, the benefits are slim.

> Or aerospace software, or automotive software, or certain financial software, or industrial control systems, or any complex billing/access control, or...

These are safety-critical industries, where things like temporal correctness, multiprocessor safety and other attributes matter - essentially, where you can model your system as process calculi and must demonstrate some invariants hold over the lifetime of the system given a specification. That is absolutely the domain of formal methods and where some formal verification is needed for certification. These industries are also the ones where you can ensure both certified hardware and software, so that formal verification actually does detect correctness bugs beyond what an integration test does.

But OP has not said they are in any of these industries, and I don't think it's incumbent on me to be exhaustive in listing them when it's not core to OP's question. If OP's asking this question online instead of their colleagues, we can safely infer they're likely not in an industry that demands formal certification already.

InefficientRed · on June 4, 2022

> Presumably, we're both talking about e.g. Ada SPARK, Alloy, TLA+, etc.

It's quite difficult to look at systems like Dafny or {J,C}BMC, in particular, and say "they can only show a _specification_ is sound, but not the _implementation_.", since those systems are definitely saying something about the implementation.

Regarding "soundness", I think you're confused about terminology. I have no idea what it would mean for a spec to be bug-free; that seems equivalent to saying that a program is bug-free. Specs can have bugs for the same reason that implementations can have bugs.

Anyways, "sound" has a well-understood meaning that is quite different from the one you're imposing. Soundness is a property of the entire system, not of individual formulas (specifications). A formal system/tool is sound if the system only admits proofs of a formula F whenever F is valid in the semantics.

> But OP has not said they are in any of these industries, and I don't think it's incumbent on me to be exhaustive in listing them when it's not core to OP's question.

You're missing the point, which are the ellipses.

OP's question is "are formal methods worth it for me?"

The appropriate answer has NOTHING to do with the type of code that OP is writing or their application domain. AWS components are not safety-critical in the sense of aircraft or automotive software, and they run on commodity hardware. Ditto for financial systems. But formal methods are applied successfully in both contexts.

Again, I've used hand-rolled formal methods for Django apps as part of our deploy process and it's definitely saved us headaches.

The implementation effort took very little time. A weekend and then a bit of ad hoc follow-up work. The rise of amazing parsers, really good implementation languages for this sort of code (scala, F#, ocaml), and fantastic SMT solvers (particularly string constraint solving) means that what used to require a "huge investment" can now be done in a weekend hackathon. Hardware is also making formal methods a cheaper prospect. Lately I've been playing with a small cluster of 64 core processors during off-hours and the amount of brute force search you can do in a weekend is pretty incredible.

Formal Methods are often cheap and easy to implement, but they do have a high up-front human capital cost/learning curve. You need to know how to coalesce a class of bugs into a mathematical description. And then how to check that mathematical description against code by hacking together parsers, AST rewrites, and SMT solvers. The idea that this takes months or years of even weeks of time is pretty dated. A learned hand can get a lot done in a weekend.

sangel · on June 4, 2022

You can formally verify all the way to C, C#, Haskell, or even assembly if you use tools like Dafny, Coq, or Vale (for verified assembly). Several projects do this. It’s a lot of work for sure though.

AkshatM · on June 4, 2022

Dafny is a really cool find, thank you!

tluyben2 · on June 4, 2022

Idris 2 writes pretty quickly and you can prove the parts you want. It’s not blazingly fast but fast enough for many purposes and I definitely have far far less bugs and generally better code. Only for hobby projects so far but it is quite a brilliant.

It is verified implementation.

AkshatM · on June 4, 2022

If you are afraid of deploys, you need to lean into CI/CD and the whole DevOps movement more. That is the solution to being able to deploy without stressing about it.

yakshaving_jgt · on June 4, 2022

I don’t think FM methods make economic sense in your case. They might if you were writing a distributed system, but writing some B2B websites probably means you’re best served with a monolithic system with both tests and types.

ncmncm · on June 4, 2022

Half would be damn good. Better would be solving the hard half.

dym_sh · on June 4, 2022

shrodinger's hard half – its always the one left unsolved