Try TLA+ (learntla.com) for concurrency, Alloy Analyzer for structure, and SPARK or Frama-C for code-level work. See if you can feed TLA+ or Alloy properties into tools like SPARK; E-SPARK did that with Event-B. These are the low-hanging fruit of verification compared to a full build-out in Coq or Isabelle. Then you write checkers for your properties and apply testing like KLEE, property-based testing, or fuzzing.
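To make that last step concrete, here's a minimal sketch of a hand-rolled property-based checker in Python (stdlib only; the bounded-queue model and all names are hypothetical, not from any of the tools above). The idea: take an invariant straight from a design-level spec and hammer the model with random operation sequences. Real tools like Hypothesis or a fuzzer do this far more cleverly, with shrinking and coverage guidance.

```python
import random

CAPACITY = 8  # hypothetical bound taken from the imagined spec

def step(queue, op):
    """Apply one operation to the queue model (a plain list)."""
    if op == "enqueue" and len(queue) < CAPACITY:
        queue.append(1)  # payload is irrelevant to the invariant
    elif op == "dequeue" and queue:
        queue.pop(0)

def invariant(queue):
    """The design-level property: never overflow, never underflow."""
    return 0 <= len(queue) <= CAPACITY

def check(runs=200, steps=50, seed=0):
    """Randomized property check over random operation sequences."""
    rng = random.Random(seed)  # seeded for reproducible failures
    for _ in range(runs):
        queue = []
        for _ in range(steps):
            step(queue, rng.choice(["enqueue", "dequeue"]))
            assert invariant(queue), f"invariant violated: {queue}"
    return True
```

If you mutate `step` to drop the capacity guard, the assertion trips almost immediately, which is the whole point: cheap checking of spec-derived properties against an executable model.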
These will get you close to full verification with less work. Each of the three languages has been used in lots of CompSci and industrial work. Note that SPIN is similar to TLA+; you can Google its industrial use for an idea of TLA+'s power in concurrency analysis.
Java also has a code-level specification language, JML, with a rich set of verification tools[1] (in fact, Frama-C's specification language, ACSL, was heavily inspired by JML).
Also, if you know TLA+, there's little reason to learn Alloy as well, as TLA+ is a superset of its features (as well as of Event-B's, and you can compile TLA+ to Event-B to use its tools[2]).
SPIN is not so similar to TLA+: its language, Promela, is relatively low-level, while TLA+ allows an arbitrary level of detail and so is well suited to specifying very large, very complex applications rather than small, well-contained ones. Also, if you absolutely must write proofs and have the considerable resources required, I believe Promela requires proving in Isabelle (which takes a long time to learn), while TLA+ has its own user-friendly proof language that hides Isabelle under the covers.
Most people who use these tools have told me TLA+ and Alloy do different things with their automated tooling: TLA+ looks at the ordering of things, while Alloy looks at structure. I've only found a few people comparing them directly, though. Do you have a resource showing TLA+ does the same thing as Alloy, or as easily?
Re SPIN
I'm saying they were used similarly. You wrote in one article as if set theory, model checking for temporal errors, and so on are new. That was the default in high-assurance decades ago, using things like Z specs with SPIN for concurrency. Then they tried to verify high-level code, then rough code, then code down to assembly, and so on with refinement proofs. But it started with algorithmic verification, with specs in set theory, first-order logic, and ASMs. Writeups like yours tell me Lamport's concept was to similarly use those easier formalisms, leaving off everything but the algorithm, then add more automation to get more bang for the buck for more developers. A good idea, since the subset of that which Z and SPIN offered had already prevented serious, sometimes deep, errors in prior systems.
Now, as far as SPIN, it was used to check the properties of concurrency models, hardware specs, caching, protocols, standards, and distributed systems. Industry found bugs in all of these using it with relatively little training, versus other methods that required rare specialists. Sound familiar? TLA+ now shows up as the tool of choice for quite a few of those areas, with SPIN still used a lot due to its maturity and being well known. That's why I said they're similar: people are specifying and model checking similar things with them, similarly, with much less effort than provers.
The easier proving in TLA+ and its high-level nature are real benefits over SPIN. That's why I recommend TLA+. ;)
> Do you have a resource showing TLA+ does same thing as Alloy or as easily?
See "Why Amazon Chose TLA+".[1] They found Alloy pleasant enough but a bit weirder than TLA+, and TLA+ is richer in expressivity and features. I learned Alloy and played with it after knowing TLA+ well, and while I think it is an extremely nice and useful tool (although its syntax is indeed weird), it is absolutely trivial to express anything you can in Alloy in TLA+, and that's without even TLA (i.e., just the +). Their tooling is a bit different, so I guess there may be cases where Alloy would be more convenient even if you already know TLA+.
> I'm saying they were used similarly. ...
Yeah. Deep formal methods are, in general, much more similar to one another than they are different.
The only in-depth study I know of is "Alloy meets TLA+".[1] Their conclusion was that Alloy is much more flexible and easier to use _unless_ your system is dynamic, because you have to manually model time in Alloy.
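To make the "modeling time" point concrete, here's a minimal explicit-state model-checking sketch in Python (a hypothetical toy model, standing in for what TLC or SPIN do natively). Time is simply the transition relation, explored exhaustively by BFS; in Alloy you'd have to encode that step relation and an ordering over states by hand. The model: two processes competing for one lock via an atomic test-and-set, state = (pc0, pc1, lock).

```python
from collections import deque

def advance(pcs, i, new):
    """Return a copy of the program counters with process i moved."""
    out = list(pcs)
    out[i] = new
    return out

def successors(state):
    """The next-state relation: every step either process can take."""
    pcs, lock = state[:2], state[2]
    for i in (0, 1):
        if pcs[i] == "idle":
            yield tuple(advance(pcs, i, "want")) + (lock,)
        elif pcs[i] == "want" and not lock:
            yield tuple(advance(pcs, i, "crit")) + (True,)   # acquire atomically
        elif pcs[i] == "crit":
            yield tuple(advance(pcs, i, "idle")) + (False,)  # release

def mutual_exclusion(state):
    """Safety property: both processes are never critical at once."""
    return not (state[0] == "crit" and state[1] == "crit")

def check(init=("idle", "idle", False)):
    """BFS over the full state space; return a violating state or None."""
    seen, frontier = {init}, deque([init])
    while frontier:
        s = frontier.popleft()
        if not mutual_exclusion(s):
            return s
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return None
```

The payoff of a built-in notion of time is that `successors` is all you write; the exploration, deduplication, and counterexample reporting come for free, and temporal orderings fall out of the search rather than out of hand-built sequence machinery.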
Design verification using TLA+ and Alloy is a completely different game from verification at the level of code, as in Verdi. While there are plenty of studies saying design verification is good in general, I'm not aware of any studies saying that (or quantifying to what degree) it gets you close to implementation verification. On the flip side, the classic paper on the implementation of Paxos hints at just how wide the gap is between design and running systems: https://research.google.com/archive/paxos_made_live.html
While Frama-C style verification works well out-of-the-box for proving single-threaded, isolated programs correct, one has to make a major effort to capture assumption-heavy distributed systems behavior using ghost code and other tricks. The only effort I'm aware of in this direction is the following, which uses the VCC tool: http://madhu.cs.illinois.edu/evcons.pdf