Verdi – Formally Verifying Distributed Systems (2016) (uwplse.org)
166 points by mindcrime on July 9, 2017 | 32 comments


Currently stuck between a rock and a hard place with things like this.

Many of the other formal verification tools make it very easy to have your implementation drift or be entirely unmoored from your specification, but they let you keep working at a level of abstraction from your problem that's still very familiar. Though things like SMACK and DiVinE are helping to decrease the gap between spec and code.

Using things like Coq + program extraction brings the overlap between spec and implementation into much, much closer alignment, but brings with it additional problems. Writing a complex program in a very abstract language further away from an engineer's typical problem domain, being fairly limited in types of languages supported for extraction, and/or still having to have an awful lot of faith in the extractor (which itself is unverified as near as I can tell) are all things that are currently keeping me out of Coq for immediate use cases.

The good news, though, is that there's a lot of fairly high-profile work being done (like Verdi) to bring formal methods to increasingly complex software problems in ways that make using formal methods more approachable and usable, and that's truly wonderful.


Try TLA+ (learntla.com) for concurrency, Alloy Analyzer for structure, and SPARK or Frama-C for code-level stuff. See if you can feed TLA+ or Alloy properties into tools like SPARK; E-SPARK did that with Event-B. These are the low-hanging fruit for verification vs a full build-out in Coq or Isabelle. Then you write checkers for it and apply testing: KLEE-style symbolic execution, property-based testing, or fuzzing.

These will get you close to full verification with less work. Each of the three languages has been used in a lot of CompSci and industrial work. Note that SPIN is similar to TLA+; you can Google its industrial use for an idea of TLA+'s power in concurrency analysis.
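To make the model-checking idea behind TLA+/TLC and SPIN concrete: the checker enumerates every reachable state of a finite model and tests an invariant in each one. Here's a toy version of that loop in Python (the racy two-process "test then set" lock and all names are invented for the example; real checkers add temporal properties, fairness, and state-space reduction):

```python
from collections import deque

# Toy two-process "test then set" lock -- deliberately racy. A state is
# (pcs, flags): program counters and flags for processes 0 and 1.
# pc 0 = idle, 1 = saw the other flag down, 2 = in the critical section.
def steps(state):
    """Yield every successor state (all interleavings of the two processes)."""
    pcs, flags = state
    for i in (0, 1):
        p, f = list(pcs), list(flags)
        if pcs[i] == 0 and flags[1 - i] == 0:  # test the other's flag...
            p[i] = 1
        elif pcs[i] == 1:                      # ...then raise ours (the race)
            f[i], p[i] = 1, 2
        elif pcs[i] == 2:                      # leave the critical section
            f[i], p[i] = 0, 0
        else:
            continue                           # process i is blocked
        yield (tuple(p), tuple(f))

def check(init, invariant):
    """Breadth-first exploration of every reachable state, TLC/SPIN-style."""
    seen, todo = {init}, deque([init])
    while todo:
        s = todo.popleft()
        if not invariant(s):
            return s                           # counterexample state
        for t in steps(s):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return None                                # invariant holds everywhere

# Mutual exclusion: both processes must never be at pc 2 simultaneously.
mutex = lambda s: not (s[0][0] == 2 and s[0][1] == 2)
print(check(((0, 0), (0, 0)), mutex))   # finds the state where both enter
```

The checker finds the test-then-set race automatically, which is the core of what makes these tools cheap to apply compared to hand proofs.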


Java also has a code-level specification language, JML, with a rich set of verification tools[1] (in fact, Frama-C's specification language, ACSL, was heavily inspired by JML).

Also, if you know TLA+, there's little reason to learn Alloy as well, as TLA+ is a superset of its features (as well as of Event-B's, and you can compile TLA+ to Event-B to use its tools[2]).

SPIN is not so similar to TLA+, as its language, Promela, is relatively low-level, while TLA+ allows an arbitrary level of detail, and so is well suited to specifying very large, very complex applications rather than small, well-contained ones. Also, if you absolutely must write proofs and have the considerable resources required, I believe that Promela requires proving in Isabelle (which takes a long time to learn), while TLA+ has its own user-friendly proof language that hides Isabelle under the covers.

[1]: E.g. http://www.openjml.org/ and there are others.

[2]: https://www3.hhu.de/stups/prob/index.php/The_ProB_Animator_a...


Re Alloy

Most people who use these tools told me TLA+ and Alloy do different things with their automated tooling. They usually say TLA+ looks at the ordering of things, with Alloy looking at structure. I've only found a few people comparing them directly, though. Do you have a resource showing TLA+ does the same thing as Alloy, or as easily?

Re SPIN

I'm saying they were used similarly. You wrote in one article as if set theory, model checking for temporal errors, and so on are new. That was the default in high-assurance decades ago, using things like Z specs with SPIN for concurrency. Then they tried to go for verifying high-level code, then rough code, code then assembly, and so on with refinement proofs. However, it started with algorithmic verification, with specs in set theory, first-order logic, and ASMs. Writeups like yours tell me Lamport's concept was to similarly use those easier formalisms, leaving off everything but the algorithm, then add more automation to get more bang for the buck for more developers. A good idea, since the subset of that which Z and SPIN offered had already prevented serious, sometimes deep, errors in prior systems.

Now, as far as SPIN, it was used to check the properties of concurrency models, hardware specs, caching, protocols, standards, and distributed systems. Industry found bugs in all these things using it with relatively little training, vs other methods that required rare specialists. Sound familiar? TLA+ now shows up as the tool of choice for quite a few of those areas, with SPIN still used a lot due to maturity and being well-known. That's why I said they're similar: people are specifying and model checking similar things with them, similarly, with much less effort than provers.

The easier proving in TLA+ and its high-level nature are real benefits over SPIN. It's why I recommend TLA+. ;)


> Do you have a resource showing TLA+ does the same thing as Alloy, or as easily?

See Why Amazon Chose TLA+[1]. They found Alloy pleasant enough but a bit weirder than TLA+, and TLA+ is richer in terms of expressivity and features. I learned Alloy and played with it after knowing TLA+ well, and while I think it is an extremely nice and useful tool (although its syntax is indeed weird), it is absolutely trivial to express anything you can in Alloy in TLA+, and that's without even TLA (i.e., with just the +). Their tooling is a bit different, so I guess there may be cases where Alloy would be more convenient even if you already know TLA+.

> I'm saying they were used similarly. ...

Yeah. Deep formal methods are, in general, much more similar to one another than they are different.

[1]: https://link.springer.com/chapter/10.1007/978-3-662-43652-3_...


The only in-depth study I know about is "Alloy meets TLA+".[1] Their conclusion was that Alloy is much more flexible and easier to use _unless_ your system is dynamic, because you have to manually model time in Alloy.

[1] https://arxiv.org/pdf/1603.03599.pdf


Design verification using TLA+ and Alloy is a completely different game than verification at the level of code as in Verdi. While there are plenty of studies saying design verification is good in general, I'm not aware of any studies saying that (or quantifying to what degree) it gets you close to implementation verification. On the flip side, the classic paper on the implementation of Paxos hints at just how wide the gap is between design and running systems: https://research.google.com/archive/paxos_made_live.html

While Frama-C style verification works well out-of-the-box for proving single-threaded, isolated programs correct, one has to make a major effort to capture assumption-heavy distributed systems behavior using ghost code and other tricks. The only effort I'm aware of in this direction is the following, which uses the VCC tool: http://madhu.cs.illinois.edu/evcons.pdf


> Writing a complex program in a very abstract language further away from an engineer's typical problem domain, being fairly limited in types of languages supported for extraction, and/or still having to have an awful lot of faith in the extractor (which itself is unverified as near as I can tell) are all things that are currently keeping me out of Coq for immediate use cases.

Indeed, which is why I think something like Idris will probably be the future. We need to combine programming and theorem proving, and allow us to move the line incrementally closer to or further away from full verification as the problem evolves, becomes better understood, or new techniques emerge.

Full verification for most programs is still too difficult, but we should be able to verify what's feasible and use standard testing techniques for the rest. It would still be a huge leap towards real software engineering.


> Indeed, which is why I think something like Idris will probably be the future.

I very much doubt it. I think other code-level specification languages (like JML for Java or ACSL for C) are much better suited to the configurable level of verification that you're looking for. The reason their tooling[1] is better than Idris's is not just that they're more mature; there are deeper reasons, too.

While working in different formalisms (dependent types or direct logic) doesn't make much of a difference, when you specify in types, the lowest level of confidence you allow is formal proof, which, as you note, is beyond reach for anything but small and simple code. It doesn't make much of a difference whether you need to prove a JML specification or an Idris type, but a JML specification allows you to optionally use weaker but much cheaper forms of verification like concolic testing, regular test generation etc.

OTOH, Idris already has test-generation libraries that require yet another form of specification, just because types won't work with anything less than proof. I've heard you could achieve the same with types, but that requires enclosing the weakly-verified types in monads, so type `T` and type `Weak T` are not the same.

Idris is one of many (I've lost count) experiments in code-level formal verification. So far I don't see any indication that it is closer to being "the future" than any of the multitude of other experiments. I don't think it's necessarily worse, either (while suffering from the problem I mentioned above, I assume it has advantages, too), but right now if you hear a lot about Idris it mostly means you hang around Haskellers, who have taken a liking to the language because of its similarity to Haskell in both spirit and style.

[1]: See here for an example of some automated JML tools http://www.openjml.org/, and here for semi-automated tools (using proof assistants) http://krakatoa.lri.fr/


> It doesn't make much of a difference whether you need to prove a JML specification or an Idris type, but a JML specification allows you to optionally use weaker but much cheaper forms of verification like concolic testing, regular test generation etc.

I think you're overshooting the target I was aiming at. You can use simple algebraic types in Idris and check your invariants dynamically as you would in any less expressive language, then incrementally move some of these to statically checked invariants where feasible.

Specification-tools require a lot more work in separate specification languages, which are often unwieldy.

> if you hear a lot about Idris it mostly means you hang around Haskellers, who have taken a liking to the language because of its similarity to Haskell in both spirit and style.

To some extent that's true. So take Haskell, remove the default lazy evaluation which trips so many people up, remove all the restrictions around types since it supports full dependent typing (no more whacky type trickery to check some invariants statically!), support for syntax extensions to support EDSLs, and easy integration with C. Welcome to Idris.

Frankly, this is pretty much the full wishlist that advanced programmers have been wanting for years, and I've seen a lot of interest in Idris since its recent 1.0 release.


> You can use simple algebraic types in Idris and check your invariants dynamically as you would in any less expressive language, then incrementally move some of these to statically checked invariants where feasible.

That simply means writing fewer specifications, or writing specifications in two different ways. Spec languages allow you to decouple the specification from the verification effort. You write some specs (or not), and then you decide whether you want to prove them, generate tests, add assertions, etc.
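Sketched in Python (the `spec` dict and helper names are invented for illustration; real spec languages like JML and ACSL are far richer), the decoupling looks like this: one specification, two interchangeable verification strategies.

```python
import random

# One specification, written once: a pre/postcondition pair for a sort
# routine. (This shape is illustrative, not any real tool's API.)
spec = {
    "pre":  lambda xs: isinstance(xs, list),
    "post": lambda xs, ys: sorted(xs) == ys,   # output is the sorted input
}

def my_sort(xs):
    return sorted(xs)          # stand-in for the implementation under test

# Verification strategy 1: runtime assertion checking (cheap, per-call).
def checked(fn, spec):
    def wrapper(xs):
        assert spec["pre"](xs), "precondition violated"
        ys = fn(xs)
        assert spec["post"](xs, ys), "postcondition violated"
        return ys
    return wrapper

# Verification strategy 2: random test generation from the same spec.
def random_tests(fn, spec, runs=200):
    rng = random.Random(0)     # seeded for reproducibility
    for _ in range(runs):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 10))]
        if spec["pre"](xs) and not spec["post"](xs, fn(xs)):
            return xs          # counterexample found
    return None                # no counterexample in `runs` trials

print(checked(my_sort, spec)([3, 1, 2]))   # [1, 2, 3]
print(random_tests(my_sort, spec))         # None
```

A prover would just be a third consumer of the same `spec`; nothing about the spec itself commits you to one verification strategy.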

> Specification-tools require a lot more work in separate specification languages, which are often unwieldy.

They require exactly the same amount of work (whether you write your spec using types or logic doesn't make a difference; the specs are the same), and the language is mostly the object language (i.e. the programming language) with the addition of some quantifiers. The result is usually not much different than a language with a rich type system.

> Frankly, this is pretty much the full wishlist that advanced programmers have been wanting for years, and I've seen a lot of interest in Idris since its recent 1.0 release.

I think this is the wishlist of programmers who like Haskell... Formal methods people don't agree at all that this is the future.


> Spec languages allow you to decouple the specification from the verification effort.

I think the recorded history of comments drifting from source code proves that this isn't likely to work. The properties to verify should be expressed within the same code so they must evolve together.

If specification and verification are a separate step, those steps may not happen at all because the program will run without them. Verification must be part of the normal workflow of expressing a program and getting it running, or it just won't happen.

> I think this is the wishlist of programmers who like Haskell... Formal methods people don't agree at all that this is the future.

If you see specification and verification as separate problems from programming, then it seems only natural to see a divergence down the road and not a convergence.

And of course, you're probably still thinking that verification efforts would only be applied when absolutely necessary, because of the inherent cost, whereas I'm foreseeing a future where some verification is part of most programmers' workflow, not just in high-assurance systems, and the cost of some common set of verification efforts is low and standardized.

We're already part way there with the pervasive use of static type systems, and the work on refinement typing, as in Liquid Types, brings us even closer.


> The properties to verify should be expressed within the same code so they must evolve together.

But they are. Remember, those specifications are continually verified against the code, whether as proofs, automated proofs, tests, etc.

> If specification and verification are a separate step, those steps may not happen at all because the program will run without them. Verification must be part of the normal workflow of expressing a program and getting it running, or it just won't happen.

Specification and verification are always separate steps. You write tests, then you need to run them. You write types/specs, then you need to prove them. If you're concerned that the verification isn't done by the compiler, then that's the whole point. You want the verification to be pluggable, otherwise, you're limited to unaffordable formal proofs. As to "it just wouldn't happen", this is exactly how people using formal specification today do it; in fact, everybody does it. Your tests aren't run by the compiler, yet people plug them into their build process.

> If you see specification and verification as separate problems from programming, then it seems only natural to see a divergence down the road and not a convergence.

Specification and verification must be separate, or you'll need to write a new spec depending on how you want it to be verified. Indeed, this is how it's done in Idris. People specify those properties they want to verify with proofs using types, and they specify them in a different way when they want to verify them with automated test generation. That doesn't seem like a convergence to me.

> We're already part way there with the pervasive use of static type systems, and the work on refinement typing, as in Liquid Types, brings us even closer.

If you think that the use of dependent types even remotely resembles the use of inferred types, then I don't think you have much experience with formal methods, and dependent types do take you from the realm of programming into the realm of formal methods. It's like saying that we're part way to a particle accelerator in every home thanks to the pervasive use of microwave ovens. Just to give you a sense of the difference: people have been writing ever-growing, complex, working programs in typed languages for at least 50 years now. On the other hand, no one -- even teams of PhD students -- has ever been able to write a program larger than 10KLOC that's verified at the same confidence level provided by dependent types, and those few programs were not only short, but heavily simplified. If you watch talks by Xavier Leroy, one of the world's premier experts on this particular form of verification (semi-manual, deductive end-to-end verification), you see that even he says that the approach doesn't scale in general beyond a few hundred to about a thousand lines of code.

Idris -- like the scores of other formal methods projects out there -- is an experiment in one particular approach. It's not a step towards some future (other than in the important research sense), because no one in the formal methods community has any idea what form the future of formal methods will take. I really wish some of the people who think Idris is the future would actually try using it to prove some properties of a tiny, 5-line quicksort algorithm -- say, that it actually sorts and that it runs in worst-case quadratic complexity. Here is a solution to the first part[1], and this is for a pretty trivial 5-line algorithm. How many developers can do that, do you think? All I know is that no one can do that for programs that aren't small and simple. Thankfully, this form of verification is not the only tool in the formal methods toolbox, but I really find it hard to be convinced that this way is the future.

[1]: https://github.com/bmsherman/blog/wiki/Quicksort-in-Idris


P.S.

If Idris (or a language like it) turns out to be significantly valuable in practice, I don't believe it will be thanks to its ability to formally prove deep correctness properties, for the reasons I gave above (as code-level specification languages do a better job of it), but because it may turn out that proving some weaker correctness properties (that don't quite reach the bar of formal verification but that exceed what's possible with weak type systems like Haskell's) empirically turns out to yield a significant improvement in correctness for a reasonable amount of effort. I certainly cannot dismiss such a possibility, but I don't think this is something anyone can foresee. It either empirically turns out to be the case or not.


> But they are. Remember, those specifications are continually verified against the code, either as proofs, tests, automated proofs, etc..

They are not, though; they require separate steps using separate tools. These steps are not guaranteed to be taken; they require discipline, time, and process, and history has shown that steps with those requirements are only taken when absolutely necessary, not as a matter of course.

You've argued that people already do this, but I submit that only those people already doing difficult verification already do this, which are not the people I'm talking about. I'm talking about ordinary programmers doing verification, where they can, as part of normal programming.

> Specification and verification are always separate steps. [...] You write types/specs, then you need to prove them.

Firstly, you don't always, given type inference.

Secondly, programmers writing a program don't consider themselves to be writing proofs; that's my point. The proof just follows naturally from reasoning about the semantics of the code as they're writing it, not later, when they have to go back and read it who-knows-how-much-later and figure out what conditions they should try to verify.

Regarding pluggable verification, I'm not sold. You now have a mashup of formal properties with uncertain failure modes. Look at the work on RustBelt [1] to see how easily naive composition can break important safety assumptions.

[1] http://lambda-the-ultimate.org/node/5448

> Your tests aren't run by the compiler, yet people plug them into their build process.

Tests aren't a meaningful form of verification; they can't prove the absence of bugs, only their presence. Rarely is a property verifiable by exhaustive case tests.

> People specify those properties they want to verify with proofs using types, and they specify them in a different way when they want to verify them with automated test generation. That doesn't seem like a convergence to me.

They're writing everything in the same language, not different languages using different compilers and different tools. They can't run their program until their verification succeeds, because the program has no meaning without verification. That's pretty convergent compared to alternatives.

> On the other hand, no one -- even teams of PhD students -- has ever been able to write a program larger than 10KLOC that's verified at the same confidence level provided by dependent types, and those few programs were not only short, but heavily simplified.

Right, because dependent types typically leak too much of their implementation due to the properties that must be satisfied.

I think the interest in EDSLs shows a way forward though, because verifying small programs is sometimes difficult but often feasible, and domain-specific languages can often be reduced to some simple kernels with some readily verifiable properties. Idris' syntactic support can then make this easy for library clients to use correctly.

Of course, you can suggest this is all speculative, but it's not pie-in-the-sky as you've been claiming. People have been doing this for years already, with both dependent and ordinary type systems (see Oleg's lightweight static capabilities).
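The "lightweight static capabilities" pattern referenced above can be approximated even in Python (only approximated: Python can't make the constructor module-private, so the type discipline is by convention here; the class and method names are invented for the example). The idea is to validate an invariant once at a trusted boundary, then let possession of the wrapper value stand for the invariant:

```python
class SortedList:
    """Capability token: holding one means sortedness was established once.

    In a typed language the constructor would be module-private, so the only
    way to obtain a SortedList is via `make`, and every consumer can rely on
    the invariant without re-checking it.
    """
    def __init__(self, items):
        self._items = items

    @classmethod
    def make(cls, items):
        return cls(sorted(items))      # establish the invariant once

    def min(self):
        return self._items[0]          # safe to assume sortedness here

    def merge(self, other):
        # Invariant-preserving operation: sorted + sorted -> sorted, so no
        # revalidation is needed. (A linear merge would be the idiomatic
        # implementation; re-sorting keeps the sketch short.)
        return SortedList(sorted(self._items + other._items))

a = SortedList.make([3, 1, 2])
b = SortedList.make([5, 4])
print(a.merge(b).min())   # 1
```

Whether this counts as "formal methods" is exactly what's disputed in the reply below the parent comment; the check itself is a runtime one, and only the discipline around it is static.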

> (as code-level specification languages do a better job of [proving deep correctness properties])

I'm frankly still confused by your position on this, because what are types if not a form of code-level specification? Code-level specifications that are not types tend to be fairly non-compositional, have few means of abstraction and reuse, and as I mentioned earlier, require extra discipline.

Finally, I never actually claimed that Idris is or will be some "final solution" to software verification. As I initially said, and later reiterated, we need to be able to incrementally move the line towards or away from full verification as feasibility dictates, during ordinary programming, which something like Idris permits. Certainly Idris may be too expressive, and something like dependent Haskell might strike a better balance in the end, but Idris strikes a decent balance and exists right now.

> I really wish some of the people who think Idris is the future would actually try using it to prove some properties of a tiny, 5-line quicksort algorithm -- say, that it actually sorts and that it runs in worst-case quadratic complexity.

These aren't the typical properties you want to verify in most software, so I'm not sure what you think this proves. More typical properties are the correctness of protocols, for which we might use session types, that a general ledger is always balanced, that resources are always cleaned up and so don't take down your service, and so on.

These are relatively simple properties for which we now have good, reusable abstractions, not sophisticated control-flow dependent properties you need to specify and verify all in one place as with quicksort.


> These steps are not guaranteed to be taken, they require discipline, time and process, and history has shown that steps with those requirements are only taken when absolutely necessary, not as a matter of course.

They require no more discipline than remembering to run your unit tests in your build. History has shown that people -- even ordinary programmers -- have come to incorporate their unit tests into their build so that they run automatically. A much, much bigger challenge would be to get programmers to write specifications in the first place -- whether in types or in logic.

> Firstly, you don't always, given type inference.

Those types that type inference can infer correspond to specs that automated tools prove. The algorithms are even the same.

> The proof just follows naturally from reasoning about the semantics of the code as they're writing it, not later when they have to go back and read it who-knows-how-much-later, and then figure out what conditions they should try to verify.

Whether you write your proofs using dependent types or using logic, the process is identical, and in either case it is so laborious that no one has been able to do it for large programs anyway, not before, not during and not after.

> You now have a mashup of formal properties with uncertain failure modes.

1. Those are not the same failure modes. 2. Welcome to the world of formal methods. Again, full proof of large programs has never been accomplished. We who use formal methods in real-world projects must use a mashup of techniques of various confidence levels.

> Tests aren't a meaningful form of verification; they can't prove the absence of bugs, only their presence. Rarely is a property verifiable by exhaustive case tests.

You sound like someone quoting something rather than a practitioner of formal methods. We use a plethora of tools, each contributing something to the confidence. We have no choice given that formal proof is beyond our reach for the properties that we must verify.

> They're writing everything in the same language, not different languages using different compilers and different tools.

No, those are different tools (compiler and test tool) and different languages (type level and object level; yes, they use the same syntax, but so does code-level spec).

> Right, because dependent types typically leak too much of their implementation due to the properties that must be satisfied.

Dependent types are just a formalism for writing specifications. The problem is what kind of verification is done, and types require proof.

> Idris' syntactic support can then make this easy for library clients to use correctly.

No one has ever been able to make formal proofs anything less than very hard, let alone easy. Maybe Idris will perform this miracle, but my bet is on other formal verification techniques. I am not aware of any formal methods researcher who believes that formal proofs will become affordable for mainstream software using any of the formal proof techniques we currently have.

> Of course, you can suggest this is all speculative, but it's not pie-in-the-sky as you've been claiming. People have been doing this for years already, with both dependent and ordinary type systems (see Oleg's lightweight static capabilities).

As someone who actually uses formal methods, I know what's been done and what's only been claimed in research. Lightweight static capabilities are not really formal methods; they're tokens for runtime assertions. They're cool (and have been used in OOP languages since long before that paper), but they're something else altogether.

> because what are types if not a form of code-level specification?

That's right, but they're specs that settle for no weaker form of verification than proof.

> Code-level specifications that are not types tend to be fairly non-compositional...

Huh? It's just a different formalism. They behave the same, and require the same discipline. The only difference is that they don't necessarily require proof.

> More typical properties are the correctness of protocols, for which we might use session types, that a general ledger is always balanced, that resources are always cleaned up and so don't take down your service, and so on.

Session types are really weak, and can't actually verify interesting protocols. That resources are cleaned up is a local property that doesn't fall in the category we're discussing (and can be achieved without formal methods at all). That a general ledger is always balanced -- well, that's a correctness property that I'd like to see someone prove in Idris in a real, large distributed system. It's never been successfully done.

> These are relatively simple properties for which we now have good, reusable abstractions

What we don't know is the relationship between having such weak guarantees and actual software correctness.


Regarding extraction in Coq, there's certicoq: http://www.cs.princeton.edu/~appel/certicoq/

Granted, there's a lot of engineering work left to do, but it's moving in the right direction.


If you're looking for programs verified down to the machine code level, using C and the Verified Software Toolchain (VST) is one option. C programs are proven in the Hoare style (pre/postconditions) inside Coq, and then compiled using the verified CompCert compiler. Properties proven in Coq are guaranteed to hold about the compiled code.

There is nothing fundamental to prevent using VST to prove Verdi handler code correct, but the current runtime is not set up for that workflow.

http://vst.cs.princeton.edu


I don't think having to trust the extractor matters much in practice, since you need to trust the compiler for the extracted program anyway. Compared to the OCaml compiler, the Coq-to-OCaml extractor is simple and small.


Remember, two years ago James Wilcox and Doug Woos were formally proving Raft's linearizable semantics using Verdi: https://news.ycombinator.com/item?id=10017549


Yes; because it's mentioned in the article:

> Both the Verdi framework and our verified Raft implementation are open-source. Please feel free to submit comments, issues, and pull requests on either repository.

[1] https://github.com/uwplse/verdi-raft



Some new developments of the framework in the last two years:

- packaging via OPAM to separate framework code and system code

- support for specifying and proving liveness properties

- support for proving Abadi/Lamport style refinement mappings between systems

The following workshop paper gives an overview of current and upcoming work: http://conf.researchr.org/event/CoqPL-2017/main-verification...


It's awesome that these tools exist, but the verification codebase for Raft is tens of thousands of lines of specification and proof files. Just installing Verdi on Linux nearly bested me (some interaction between OCaml package management, Coq, and native code). I tried running the verification suite and gave up after 10 or 20 minutes of full CPU.

TLA+ has warts too -- the only way I could interact with it is via a custom Java IDE.

It would be great to see lighter-weight, more accessible ways to embed proofs in real-world programs and run them in test suites.


> It would be great to see lighter-weight, more accessible ways to embed proofs in real-world programs and run them in test suites.

Not sure what this means exactly. Do you mean extracting the part of the proof that can't be verified statically, for whatever reason, as some kind of test suite?


I think I'm asking for something in between a mathematically verifiable proof and fuzzing. A lightweight way to translate runnable code into a symbolic execution environment that can be exercised (not necessarily verified) exhaustively or stochastically.

There are relatively few functions that I want to do this for, and I'm okay with doing some manual work to define the possible inputs and how the functions talk to each other (i.e. mock out network / filesystem / database).
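One lightweight shape of that, sketched in Python (the handler, mock, and property are all invented for the example): enumerate the whole input space within a small bound, with the environment replaced by a scripted mock. It's exhaustive only within the bound, so it's a bounded check rather than a proof:

```python
import itertools

# Hypothetical handler under test: retries a flaky send up to `retries`
# times and reports whether the message eventually got through.
def send_with_retry(network, msg, retries):
    for _ in range(retries):
        if network.send(msg):
            return True
    return False

class ScriptedNetwork:
    """Mocked-out network: `script` fixes the outcome of each send call."""
    def __init__(self, script):
        self.script, self.calls = list(script), 0
    def send(self, msg):
        ok = self.script[self.calls] if self.calls < len(self.script) else False
        self.calls += 1
        return ok

def check_all():
    """Exhaustively exercise every failure pattern of length <= 3 and every
    retry budget up to 3, asserting the expected outcome for each."""
    for n in range(4):
        for script in itertools.product([False, True], repeat=n):
            for retries in range(4):
                net = ScriptedNetwork(script)
                got = send_with_retry(net, "m", retries)
                assert got == any(script[:retries]), (script, retries)
    return True

print(check_all())   # True: the property holds on the whole bounded space
```

Symbolic execution tools (KLEE-style) generalize this by covering whole classes of inputs per path instead of enumerating them one by one, but the workflow -- a few chosen functions, mocked environment, bounded-but-systematic coverage -- is the same.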


On the lightweight end there's https://github.com/catseye/Maxixe


> TLA+ has warts too -- the only way I could interact with it is via a custom Java IDE.

Why? What problems did you run into with TLA+?


couldn't figure out how to run https://github.com/ongardie/raft.tla


What do you mean by "run"? Check it in the TLC model checker? You need to first write a configuration that you want to check. Lamport's Specifying Systems has full documentation for using the model checker from the command line. But if you're a beginner, you might be better off using the IDE, at least at first, as it helps you write a configuration very easily. Also, I would recommend starting with a simpler specification (there are lots of examples available). Otherwise, the checker will just work for a while and spit out "OK", which won't mean much to you.
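For reference, the configuration TLC needs is just a small plain-text file next to the spec. A minimal sketch (these are the conventional operator and constant names, not the ones raft.tla actually defines, so treat it as a shape rather than a recipe):

```
\* MC.cfg -- tells TLC what to check
INIT Init
NEXT Next
INVARIANT TypeOK
CONSTANT Server = {s1, s2, s3}
```

You'd then run something like `java -cp tla2tools.jar tlc2.TLC -config MC.cfg MC.tla` from the command line; the IDE generates and manages this file for you, which is part of why it's easier for beginners.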


I wrote a step-by-step tutorial to using the IDE. Hope it helps! https://learntla.com/pluscal/toolbox/


I liked the layered approach in the IronFleet paper, "IronFleet: Proving Practical Distributed Systems Correct":

https://www.microsoft.com/en-us/research/publication/ironfle...

It uses TLA-style refinement for the distributed-protocol part, and Hoare-style verification for sequential code, which has excellent toolchain support.



