   Model checkers check deep 
Which deep properties have you got in mind?

   DPLL is based
DPLL is based on a form of resolution, in real implementations you mostly simply enumerate models, and backtrack (maybe with some learning) if you decided to abandon a specific model.
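
To make that concrete, here is a minimal sketch (mine, nothing like a production CDCL solver: no clause learning, watched literals or restarts) of DPLL as model search: extend a partial assignment, propagate unit clauses, and backtrack on conflict.

    # Minimal DPLL-as-model-search sketch. Clauses are frozensets of non-zero
    # ints; a negative int stands for a negated variable.
    def dpll(clauses, assignment=frozenset()):
        simplified = []
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue                      # clause already satisfied
            rest = frozenset(lit for lit in clause if -lit not in assignment)
            if not rest:
                return None                   # conflict: clause falsified, backtrack
            simplified.append(rest)
        if not simplified:
            return assignment                 # every clause satisfied: a model found
        for clause in simplified:             # unit propagation
            if len(clause) == 1:
                (lit,) = clause
                return dpll(simplified, assignment | {lit})
        lit = next(iter(simplified[0]))       # branch on some literal ...
        return (dpll(simplified, assignment | {lit})
                or dpll(simplified, assignment | {-lit}))  # ... and try its negation on failure

    # (x1 or x2) and (not x1 or x2) and (not x2 or x3)
    print(dpll([frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2, 3})]))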



> Which deep properties have you got in mind?

TLC is a model checker that can check most properties expressible in TLA+, which are more expressive than those of any base lambda-calculus theory (e.g. they include temporal refinement properties). See my overview of TLA+'s theory: https://pron.github.io/tlaplus

> in real implementations you mostly simply enumerate models, and backtrack (maybe with some learning) if you decided to abandon a specific model

So not deductive. You know that because DPLL does not directly yield a proof of UNSAT.


   can check most properties 
   expressible in TLA+
Lamport's TLA+ contains ZF set theory. That makes TLA+ super expressive. Unless a major breakthrough has happened in logic that I have not been informed about, model checkers cannot verify complex TLA+ properties for large code bases fully automatically. Let's be quantitative in our two dimensions of scalability:

- Assume we 'measure' the complexity of a property by the number of quantifier alternations in the logical formula (see the example just after this list).

- Assume we 'measure' the complexity of a program by lines of code.
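
To make the first measure concrete (my illustration, not part of the parent's argument), counting alternations left to right:

    \forall x\, \exists y\, \forall z.\ \varphi(x,y,z) \quad (\text{two alternations: } \forall\to\exists,\ \exists\to\forall)
    \forall x\, \forall y.\ \psi(x,y) \quad (\text{zero alternations})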

(Yes, both measures are simplistic.) What is the most complex property that has been established in TLA fully automatically, with no hand tweaking etc., for more than 50,000 LoC? And how does that complexity compare to, for example, the complexity of the formula that has been used in the verification of seL4?

   So not deductive. 
So first-order logic is not deductive, because it doesn't yield a direct proof of FALSE?

There is an interesting question here: what is the precise meaning of "deductive"? Informally it means: building proof trees by instantiating axiom and rule schemes. But that is vague. What does that mean exactly? Are modern SAT/SMT solvers doing this? The field of proof complexity thinks of (propositional) proof systems simply as poly-time functions onto propositional tautologies.
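
(That last sentence is essentially the Cook-Reckhow definition; spelled out, with TAUT the set of propositional tautologies:)

    f : \{0,1\}^{*} \to \mathrm{TAUT} \quad \text{surjective and computable in polynomial time;}
    \quad \text{any } w \text{ with } f(w) = \varphi \text{ counts as a proof of } \varphi.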


> What is the most complex property that has been established in TLA fully automatically, with no hand tweaking etc., for more than 50,000 LoC? And how does that complexity compare to, for example, the complexity of the formula that has been used in the verification of seL4?

TLC, a model checker for TLA+, is commonly used to verify arbitrary properties -- at least as "deep" as those used in seL4, and often deeper -- of TLA+ specifications of a few thousand lines. Those properties can informally correspond to codebases with hundreds of thousands of lines, but there is currently no method for formally verifying arbitrary properties of 50KLOC, be it automatic or manual. Automated methods, however, are used either for deep properties (many quantifier alternations) of a few thousand lines of spec/code, or for shallower (though still non-local and non-compositional) properties of 50-100KLOC. Manual deductive methods are used for virtually nothing outside of research projects/teams, except for local, compositional properties of the kind expressible by simple type systems.

I clarified further here: https://news.ycombinator.com/item?id=22146186

> So first-order logic is not deductive, because it doesn't yield a direct proof of FALSE?

In any logic you can work deductively, in the proof theory, or semantically, in the model theory. DPLL SAT solvers work in the model theory, not the proof theory (note, also, that SAT solvers work on propositional logic, not first-order logic). Because of completeness, we know that every model-theoretic proof corresponds to a proof-theoretic proof, but that's not how most SAT algorithms work (except for resolution solvers).
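
As a trivial illustration of a purely model-theoretic argument (my sketch): establishing a propositional tautology by checking every valuation, with no derivation ever constructed.

    # A semantic "proof" by exhaustive evaluation over all models (valuations).
    # Completeness guarantees a syntactic derivation also exists, but this
    # procedure never builds one.
    from itertools import product

    def is_tautology(formula, num_vars):
        return all(formula(*vals) for vals in product([False, True], repeat=num_vars))

    implies = lambda a, b: (not a) or b
    # Peirce's law: ((p -> q) -> p) -> p
    print(is_tautology(lambda p, q: implies(implies(implies(p, q), p), p), 2))  # True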

> There is an interesting question here: what is the precise meaning of "deductive"?

It means using the syntactic inference rules of a proof theory, rather than the semantic rules of a model theory (possibly relying on completeness meta-theorems, as in the case of propositional logic, although that's not relevant for most interesting first-order theories). SMT solvers employ SAT, and can work on some first-order theories. I am not familiar with all the techniques SMT solvers use, but I believe they employ both proof-theoretic and model-theoretic techniques. In any event, as I mentioned before, SMT solvers are rarely used alone, because they're very limited. They are used as automated helpers for specific steps in proof assistants (TLAPS, the TLA+ proof assistant, makes heavy use of SMT solvers for proof steps), for local properties of code (e.g. in OpenJML for Java, in Dafny and in SPARK), or as components in CEGAR model checkers and in concolic test systems.
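
For a concrete sense of "SAT plus a first-order theory", here is a tiny query over linear integer arithmetic using Z3's Python bindings (my example, not something from the thread; any comparable solver would do).

    # Requires the z3-solver package. A small SMT query in the theory of
    # linear integer arithmetic.
    from z3 import Ints, Solver, sat

    x, y = Ints('x y')
    s = Solver()
    s.add(x + 2 * y == 7, x > 0, y > 0)

    if s.check() == sat:
        print(s.model())        # some satisfying assignment, e.g. x = 5, y = 1
    else:
        print("unsat")          # for unsat instances Z3 can also emit cores/proofs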


   TLC ... is commonly used to ... 
   at least as "deep" as those 
   used in seL4, and often deeper
What you are in essence implying here is that the seL4 verification can be handled fully automatically by TLC. I do not believe that without a demonstration ... and please don't forget to collect your Turing Award!

One problem with model-checking is that you handle loops by replacing them with approximations (unrolling the loop a few times), and in effect only verifying properties that are not affected by such approximations. In other words, extremely shallow properties. (You may use different words, like "timeout" or "unsound technique", but from a logical POV, it's all the same ...)
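
A caricature of the kind of depth-bounded unrolling being described (my sketch, only to make the shape of the approximation concrete): behaviours that need more than K iterations are simply never examined.

    K = 3   # unrolling bound

    def check_bounded(n, step, prop):
        x = 0
        for _ in range(K):                 # "while x < n" unrolled K times
            if not (x < n):
                return prop(x)             # loop exits within the bound: check the property
            x += step
        return "unknown (bound exceeded)"  # deeper executions are ignored

    print(check_bounded(n=2, step=1, prop=lambda x: x == 2))    # True
    print(check_bounded(n=10, step=1, prop=lambda x: x == 10))  # unknown: says nothing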

    the model theory ...
    rather than the semantic 
    rules of a model theory
All mathematics is deductive. ZFC is a deductive theory, HOL is a deductive theory, HoTT is a deductive theory, MLTT is a deductive theory, Quine's NF is a deductive theory.

With that, what mathematicians call a model is but another deductive theory; e.g. the model theory of Peano Arithmetic happens in ZFC, another deductive theory. The deductive/non-deductive distinction is really a discussion of different kinds of algorithms. Deduction somehow involves building up proof trees from axioms and rules, using unification. It could be fair to say that concrete DPLL implementations (as opposed to textbook presentations) that are based on model enumeration, non-chronological backtracking, random restarts, clause learning, watched literals, etc. don't quite fit this algorithmic pattern. I am not sure exactly how to delineate deductive from non-deductive algorithms, which is why I think it's an interesting question.
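
For contrast with the model-enumeration loop above, here is a single propositional resolution step, i.e. the kind of purely syntactic clause manipulation the textbook "deductive" picture has in mind (again just a sketch of mine):

    # One resolution step: from (A or x) and (B or not x) derive (A or B).
    # No valuations are consulted; this is symbol manipulation on clauses.
    def resolve(c1, c2, x):
        assert x in c1 and -x in c2
        return (c1 - {x}) | (c2 - {-x})

    # (p or q), (not p or r)  ==>  (q or r)
    print(resolve(frozenset({1, 2}), frozenset({-1, 3}), 1))   # frozenset({2, 3})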

   SMT solvers are rarely used alone,
I agree, but model checkers, type checkers for dependent types, modern testing techniques, and (interactive) provers all tend to off-load at least parts of their work to SAT/SMT solvers, which makes the opposition between deductive and non-deductive methods unclear.

         * * * 
BTW I am not arguing against fuzzing, concolic testing, model checking, etc. All I'm saying is that they too have scalability limits; it's just that the scale involved here is not lines of code.


> What you are in essence implying here is that the seL4 verification can be handled fully automatically by TLC.

No, I am not. Read what I wrote again :)

> and please don't forget to collect your Turing Award!

Some people have already beat me to it. The 2007 Turing Award was awarded for the invention of model checkers that precisely and soundly check deep properties.

> One problem with model-checking is that you handle loops by replacing loops with approximations

No, that's what sound static analysis with abstract interpretation does. Model checkers are both sound and precise, i.e. they have neither false positives nor false negatives. To the extent they use sound abstraction (i.e. over-approximation), as in CEGAR algorithms, they do so to increase performance and to sometimes be able to handle infinite state spaces. A model checker, however, is always sound and precise, or else it's not a model checker.
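
To illustrate what "sound and precise" means for an explicit-state checker in the TLC style (a toy sketch of mine, not TLC's actual algorithm): every reachable state is visited, so a reported violation is a real counterexample and a pass covers all behaviours of the finite model.

    # Toy explicit-state invariant checking: exhaustively explore reachable states.
    from collections import deque

    def check_invariant(init_states, next_states, invariant):
        seen, queue = set(init_states), deque(init_states)
        while queue:
            s = queue.popleft()
            if not invariant(s):
                return ("violated in state", s)       # precise: a genuine counterexample
            for t in next_states(s):
                if t not in seen:
                    seen.add(t)
                    queue.append(t)
        return ("holds in all", len(seen), "states")  # sound: no reachable state skipped

    # Toy system: a counter that steps by 1 or by 2, wrapping at 5.
    step = lambda n: {(n + 1) % 5, (n + 2) % 5}
    print(check_invariant([0], step, invariant=lambda n: n < 5))   # holds
    print(check_invariant([0], step, invariant=lambda n: n != 3))  # violated at 3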

> In other words extremely shallow properties.

Model checkers commonly check arbitrarily deep properties. Engineers in industry use them for this purpose every day.

> All mathematics is deductive.

Yes, and all (or much, depending on your favorite philosophy of mathematics) of it has models. If you have two apples in one hand and one apple in the other and you count how many you have, you have not done so using deduction in any logical theory. The fact that, had you used such a process, you would also have reached the same result is due to soundness and completeness meta-theorems. That mathematics has both proof and model theories, and that they're compatible, is the most important result in formal logic.

> All I'm saying is that they too have scalability limits, just that the scale involved here is not lines of code.

All verification methods are limited by both the size of the code and the depth of the properties. For non-trivial properties, the least scalable method, on average, is deductive proof, which is not to say it is not useful when automated methods have failed.


Sorry, I got confused about loops in model checking; I was wrong about that. I don't know what happened, since I have even co-authored a model-checking paper where we do check loops.

Be that as it may, seL4 cannot currently be automatically verified, and it's not due to code size (8700 lines of C and 600 lines of assembler). It's hard to see what the obstacle to mechanisation could be but logical complexity.

Model theory is not some magical deduction-free paradise. The reason we care about model theory in logic is that we want to have evidence that the deductive system at hand is consistent. Building models of Peano arithmetic in ZFC, for example, but also relating constructive and classical logic through double negation, or embedding type theory in set theory, are all about relative consistency. It's easy to make a FOM (= foundation of mathematics) unsound, and logical luminaries including Frege, Church and Martin-Löf managed to do so. Those soundness proofs in essence transform a proof of falsity in one system into a proof of falsity in the other, and relative consistency is the best we can get in order to increase our confidence in a formal system. It is true that traditional model theory, as for example done in [1, 2, 3], doesn't really foreground the deductive nature of ZFC; it's taken for granted, and reasoning proceeds at a more informal level. If you want to mechanise those proofs, you will have to go back to deduction.

[1] W. Hodges, Model Theory.

[2] D. Marker, Model Theory: An Introduction.

[3] C. C. Chang and H. J. Keisler, Model Theory.


> It's hard to see what the obstacle to mechanisation could be but logical complexity.

There is no impenetrable obstacle if humans could do it (for some definition of "could"), but the difficulty is handling infinite state spaces. Model checkers can do it in some situations, but they could do it in more. Neither model checkers nor humans can handle infinite state spaces -- or even very large ones -- in all situations (halting problem and its generalizations).

> Model theory is not some magical deduction-free paradise.

I just pointed out that some of the most successful automated methods, like model checkers (hence their name, BTW) and SAT solvers (which are really special kinds of model checkers), work in the model theory, not the proof theory.

> The reason we care about model theory in logic is because we want to have evidence that the deductive system at hand is consistent.

The reason we care about models is that we want to know that the "symbols game" of formal logic corresponds to mathematical objects; i.e. that it is sound -- a stronger property than consistency. The model existed first. Proof theory in deductive systems -- and formal logic in general as we know it -- is a 20th-century invention (well, late 19th but only accepted by mathematicians in the 20th).



