Rules like avoiding recursion and loop upper bounds are partly justified by help...

wizeman · on Nov 4, 2016

Take a look at dafny [1] and F* [2]. Both are open-source and it appears that both of them are able to prove code to be correct with considerably less proof code than traditional interactive theorem provers (such as Coq or Isabelle), mostly thanks to the use of an SMT solver (Z3, in both cases) to aid in proving.

Both languages are being used in Project Everest [3], which is building a formally verified HTTPS stack, including a formally verified TLS implementation [4].

Unfortunately, there isn't a lot of documentation / beginner guides about these languages out there, but with the little there is, I was still able to start proving non-trivial (but still relatively easy-to-prove) code with them, even though I have no formal verification background whatsoever.

Just as a tip, I found the Dafny tutorial to be a lot easier to follow than F-Star's, but from what I could gather, F* appears to be slightly more general/sophisticated in what it can prove.

[1] https://github.com/Microsoft/dafny [2] https://www.fstar-lang.org [3] https://project-everest.github.io [4] https://mitls.org

Animats · on Nov 4, 2016

"F star is an ML-like functional programming language aimed at program verification. Its type system includes polymorphism, dependent types, monadic effects, refinement types, and a weakest precondition calculus. Together, these features allow expressing precise and compact specifications for programs, including functional correctness and security properties. The F star type-checker aims to prove that programs meet their specifications using a combination of SMT solving and manual proofs. Programs written in F star can be translated to OCaml or F# for execution."

And you wonder why you're not getting any traction.

Dafny, though, seems to have the right idea.

posterboy · on Nov 6, 2016

The field is hardly accessible due to the sheer variety of solvers. Getting an overview first seems more important then jumping right in and getting a feel for a particular solver.

lugus35 · on Nov 4, 2016

Open Source tools are way behind commercial software. Just look at PolySpace Code Prover for example:

https://mathworks.com/products/polyspace-code-prover/

ou PolySpace bug finder :

https://mathworks.com/products/polyspace-bug-finder/

michaelt · on Nov 4, 2016

Have you personally used those tools? On what sort of projects?

buttproblem · on Nov 4, 2016

Overall, one way to look at formal methods is that the tools range from fully automated (e.g., abstract interpretation, model checking), to manual (e.g., interactive theorem proving). Along this same range, the complexity of properties which can be proved also moves from simple (e.g., absence of buffer overflows) to complex (termination of parameterized systems).

As a concrete example, interactive theorem provers can prove the termination of Paxos for an arbitrary number of nodes (i.e., a system of `n` nodes will reach consensus, where `n` is a positive integer). But, automatically generating such a proof for something as complicated and parameterized as Paxos is an open problem ([1] describes an automatic tool working toward that goal).

Another thing to keep in mind is the goal of the tool: verification versus bug hunting. Verification aims to prove that a property will never be violated, while bug hunting aims to find a property violation. Here's a list of some types of tools off the top of my head.

Bounded model checkers such as CBMC [2] essentially search the entire state-space of the program; you can conceptually think of this as searching through a graph which has some start node, and you'd like to find a path from the start to an error. This is a bug hunting technique since the software/hardware may have an infinite state space (e.g., `while (true) ++i`), hence the search space must be bounded (e.g., unroll loops). For finite-state systems this can generate proofs of correctness. For infinite state systems, it can generate proofs up to a certain bound (e.g., correct for some number of clock cycles, correct for some number of loop iterations). The benefit of bounded model checkers is that when they find an error they generate a counter-example, which is essentially inputs to the program causing it to fail. So, you could use the counter-example as a concrete test case to fix the issue. Techniques such as counter-example guided abstraction refinement, and more recently IC3 and property-directed reachability make use of these counter-examples to generate proofs for infinite state systems.

A complementary technique is abstract interpretation, and numerical abstract interpretation in particular. One successful tool is Astree, though closed source, has been used to verify properties on Airbus planes. Abstract interpretation is complementary to something like bounded-model checking since it can be used to verify properties on infinite state systems. Numerical abstract interpretation can be used to generate numerical proofs such as, "on line 10, variable `x` is in the range `[-10, 10]`, or constraints between variables such as `x - y >= 10`. A difficult part with abstract interpretation, however, is that when it reports an error it may be the case that the error is a false-alarm and it doesn't actually exist. You can think of numerical abstract interpretation as typical "compiler algorithms" such as constant propagation but keeping track of much more complicated information.

Symbolic/Concolic execution, such as KLEE [4], Pex and Moles [5], and their successor IntelliTest [6], are sort of a middle ground between model checking, and traditional unit testing (i.e., hand writing inputs to the program to test it). The main goal is to automatically generate such tests in order to test behavior in the program (e.g., generate a set of test inputs to have 100% code coverage). The modern versions of these techniques make use of dynamic information (i.e., they actually execute the program under test), and use this information to make decisions about future executions. More concretely, you can imagine the program executing past some branch `if (c)` which was not taken (i.e., `c == false`) and then, the analysis asks the question: "what do the program inputs need to be such that `c == true`. An answer to this question generates new inputs and allows another test to be generated and explored.

Stateless model checking is another automated dynamic technique [7,8,9] used to explore non-determinism in concurrent programs. Essentially, it automatically generates "test cases" (i.e., new thread schedules) to efficiently explore the state space caused by a non-deterministic scheduler.

And there's a bunch of other techniques out there too (e.g., symbolic model checking). There's also a huge amount of work on practical test input generation for hardware. On top of all these techniques, there are there applications to solving specific problems, and heuristics to make them more practical and scalable. Overall, most of the techniques are niche and not widely used, but have been applied in various areas.

[1] https://www7.in.tum.de/~gleissen/papers/sharpie.pdf

[2] http://www.cprover.org/cbmc/

[3] http://www.astree.ens.fr/

[4] https://klee.github.io/

[5] https://www.microsoft.com/en-us/research/project/pex-and-mol...

[6] https://www.visualstudio.com/vs/release-notes/#Testing

[7] https://github.com/markus-kusano/SysTest

[8] http://www.srl.inf.ethz.ch/papers/oopsla15-modelchecking.pdf

[9] http://plrg.eecs.uci.edu/software_page/42-2/

iceManChild · on Nov 4, 2016

For an open source analyzer library you might look at IKOS (https://ti.arc.nasa.gov/opensource/ikos/).

Too · on Nov 5, 2016

Microsoft code contracts for c# is pretty cool.