Hacker News
Proofs Should Repair Themselves (galois.com)
115 points by harperlee on Feb 25, 2021 | hide | past | favorite | 27 comments



This is my thesis area! If you're interested in proof repair, check out my latest work: https://arxiv.org/abs/2010.00774

More on my website: https://dependenttyp.es/

I hope more people work on this problem. It is a big and promising space. And I'll need students soon :)


I think this area is super interesting!

I am a regular software engineer with a background in math and CS. The most relevant experience I have is some tinkering with Haskell.

Do you have any suggestions for how a full time employee can engage in this world and possibly start doing research?


I'm not the parent commenter, but as a way to dip your toes in, I've heard Dafny[1] is probably the easiest language / proof system to get started with.

[1] https://rise4fun.com/dafny


Author here. If anyone isn't familiar with Galois, we're a consultancy that does a lot of formal methods and proof, which is why it would be great if proofs were easier to manage :)

If you're curious, you can get a sense of what we do here: https://galois.com/blog/2021/02/2020-year-in-review/

The tool we use most often for proofs is called SAW: https://saw.galois.com/


I was hoping that this would have some technical insights on how automated proof repair could be improved. The ideas presented just seem a bit pie-in-the-sky.

That said, I think the final point is the most important. Making proof repair more magical will just intensify the black-box nature of proofs to the programmer - and the more divorced specification and proof are from the people who actually write the code, the more likely that errors will slip in, not from bad code, but from bad specs instead.

IMO, the most viable path forward for "proofs which repair themselves" is proofs that the average programmer can understand - at which point, their repair will be as easy as writing the code, and you wouldn't need to worry about complex self-repair techniques at all.


Author here. This post is definitely more of a problem statement than a solution. I don't think this area gets enough attention and I'd like more people to think about it! However, I think there are some good reasons to think solving this is possible. E.g. Galois have had proofs in CI at AWS for several years that rerun on code changes - you can read about our approach here: https://link.springer.com/chapter/10.1007/978-3-319-96142-2_.... And elsewhere in the comments Talia Ringer posted about her research, which is very promising as well.

I think there's a tight connection between 'proofs the programmer can understand' and self-repair / automation. The aim should really be to clear away a lot of the scaffolding that's currently needed and hand it off to solvers, and leave programmers to do what they are best at, i.e. understand the meaning of the code. Type checking is a great example of a lightweight formal method that disguises most of the sophisticated automation hiding behind the scenes.


I feel obligated to point out that there are tools that effectively repair broken proofs; they just aren't labeled as such. For example, Frama-C is a program that takes in a simplified written proof (as annotations in C comments) and then validates it (i.e. fills in the tedious middle steps) with Z3, CVC4, etc. Unfortunately, Frama-C can't handle dynamic memory yet, and it doesn't handle array sizes correctly, forcing for loops over memcpys in some cases if you want to prove anything. I wouldn't recommend it for actual usability, but it does seem to "repair" proofs, in the sense that the written proofs were never whole in the first place.


These problems are more closely related than you imagine. In particular, when you change a program, there is often necessarily more information you must give the computer in order to make the corresponding change in the proof. That additional burden is creativity, and is in many cases unavoidable.

With good proof repair techniques you can fix the proof almost fully automatically in a way that mimics the change in the program, but then just ask the programmer a few questions to give the needed creativity to fix the proofs. As far as easy to understand goes, then, that "just" comes down to good UX for asking those questions. (Still not solved!)

Here is the state of the art right now (my most recent paper): https://arxiv.org/abs/2010.00774

Lots left to do but the dream is quite realistic.


> proofs that the average programmer can understand

Do you believe that every computer-generated human-illegible proof has an analogous human-legible proof?

It's my impression that much of the point of computational provers is finding solutions to questions that humans would never otherwise answer in a million years, because the only proof of the conjecture (of a practical length/complexity†) is one structured in ways ill-suited to human mental processes.

† There might be a human-legible proof, but at 100x or 1000x the length of the human-illegible proof. It might be beyond human capacity to ever deduce such a proof; and it would also be beyond the capacity of the algorithms used in current provers to generate it. It would be "out of reach."


It's my belief that a large part of computer-generated proving, and the overwhelming majority of formal verification of programs, revolves around proofs that humans would easily understand.

While there are examples of mechanised proofs that reach the "limits" of human proof, they do so in domains where computers are good and humans are less good, like exhaustiveness checks for e.g. the four-colour theorem.

It should be pointed out that automatic provers are surprisingly stupid. Exhaustive searches do not scale well for large and complicated proofs, and computers obviously lack the "intuition" that mathematicians lean on to prove. Lots of the work that goes into computer-assisted proofs is just bookwork for properties that humans would consider trivial.

Even the OP's article is not concerned with computers generating proofs automatically (though lots of work has gone into that, especially w.r.t. code), but instead with computers repairing human-written proofs to persist them across small changes.

The thing that most certainly argues that the proofs involved in the formal verification of software are within human reach is simply that humans make such ad-hoc proofs all the time - because they are very similar to the non-formal specifications that humans use to write code in the first place. Any programmer who can argue why their code runs correctly is most of the way towards a proof of its correctness - the real obstacle is not the complexity of the resulting proof, but the tediousness of its expression.
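A small (hypothetical) illustration of that last point: the informal argument for why a binary search works - "any occurrence of the key always lies in [lo, hi)" - can be written down directly as runtime assertions, and turning that into a machine-checked proof is mostly tedium, not new insight.

```c
#include <assert.h>

/* Binary search over a sorted array. The asserts record the informal
   correctness argument: the search interval stays in bounds, and the
   key (if present) is never excluded from [lo, hi). */
int bsearch_int(const int *a, int n, int key) {
    int lo = 0, hi = n;
    while (lo < hi) {
        assert(0 <= lo && lo < hi && hi <= n); /* bounds stay sane */
        int mid = lo + (hi - lo) / 2;          /* avoids overflow  */
        if (a[mid] < key)
            lo = mid + 1;   /* key can't be at or before mid */
        else if (a[mid] > key)
            hi = mid;       /* key can't be at or after mid  */
        else
            return mid;
    }
    return -1;              /* interval empty: key absent */
}
```

For example, `bsearch_int` over `{1, 3, 5, 7, 9}` finds 7 at index 3 and reports 4 as absent; each branch of the `if` is one sentence of the argument a programmer would give anyway.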


Do we have any examples of proofs that are "unsuited to human mental processes," except by being long and tedious?


I don't know if there are proofs unsuited to human minds, but for machine proofs, there's no essential difference between what a proof assistant like Coq can do, and what a human brain can do. Ultimately Coq boils down to a very simple core logic that could (in principle) be checked by a human. But the sheer scale of such proofs means they can't be meticulously checked. Proof tools are just the same as e.g. compilers - they can do things humans can't do because to do so would be so incredibly time consuming and tedious.
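To make the "simple core logic" point concrete, here is a tiny example (in Lean rather than Coq, but the same kernel story applies): whatever elaborate automation produces a proof, what actually gets checked is a small core term.

```lean
-- A trivial mechanically checked statement. Whatever tactics or
-- automation build the proof, the kernel only ever checks the small
-- core term they elaborate to.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```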


An analogy might be chess with its derided "computer moves." They aren't necessarily more complex than existing human heuristics, especially at the grandmaster level, but they lack at least one of several traditional "aesthetic" attributes.

I'm not all that familiar with generated proofs, but from those I've seen, they will point out pedantic linkages other proofs might avoid. I suspect that can easily be overcome, though, by building aesthetic conventions into the proof presentation layer, much as we do today with ASTs and compiler optimizations. Wild speculation: automated proofs use assembly constructs and gotos when they could be using higher-order constructs - but this isn't always clear because there isn't a separation of abstraction layers (or at least a clearly articulated one).

I could imagine a project that regenerated many well-known proofs (these exist in some form in the solvers already, but would be intentionally redone) without some of the heuristic abridgments that exist in the solvers. Then pair them with the human-curated proofs of oft-noted elegance and train a system to convert from one to the other.


“An analogy might be chess with its derided "computer moves." They aren't necessarily more complex than existing human heuristics, especially at the grandmaster level, but they lack at least one of several traditional "aesthetic" attributes.”

I think there’s a possibility that’s cultural. Generations of chess players grew up learning certain heuristics.

Now, computers show those heuristics aren’t as good as we thought they were. It’s possible that new generations, trained more by computers than by older generations of chess players, will develop new heuristics.

You see something similar in math. Sometimes, when a human discovers a new way to look at a problem, whole avenues of proofs open up, and what was incomprehensible becomes much easier not only for that human, but also for those learning about his work.

On the other hand, I think computers sometimes make moves humans won’t make because they don’t have any notion of fear.

When humans get into a minefield position where almost every move is clearly losing, but there is one winning path through it, they may not choose that path even if they see it, because they fear they're overlooking something.


Here <https://cacm.acm.org/magazines/2017/8/219606-the-science-of-...> is an article about a computer-generated proof of a combinatorial problem in number theory. Although you could characterise the proof as "just long", I think that the article may offer a different perspective on why proofs of this nature are interesting and useful.


I don't think we'd currently have any way of producing such a thing, because the two processes we have are 1. producing proofs suitable to human mental processes and 2. producing long and tedious proofs by a series of steps programmed by humans in the first place.

Although it also depends to some extent on your definition of "suitable to human mental processes"... for instance, the famous proof that produced Graham's number is not "unsuitable to human mental processes" in the meta sense, but if you try to directly apprehend Graham's number with human mental processes, you fail. You can comprehend it and manipulate it mathematically, but you can not apprehend it.


Long and tedious is the definition of an unsuitable human mental process.


Every design property of a human-created program likely has a human-legible proof. The program doesn't have that property by accident, a human wanted it to have that property and understood on some level how to make it have that property.


I have a fantasy about the linguists returning to place formal logic back where it belongs - towering over the collapsed heap of probabilistic-blackbox-cat-picture-classifiers. A constrained natural language system would then be loosed on the Federal Register, steadily chewing through a 100 year mess, identifying clauses that violate the growing model's internal coherence, and suggesting potential constraints where global satisfiability is possible. It is a nice dream.


This reminds me a lot of [EWD539], which was originally written as a parody about how ridiculous it would be if mathematical proofs were developed with the same kind of processes that were used for software development. At the time, the message was that programs should be developed more like mathematics, but now it seems like the actual vision Dijkstra described is more or less coming to pass.

[EWD539]: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD05xx/E...


What would the recommended approach be for someone who's building a piece of software today in a common language (Go, Swift, Kotlin, ...) and wants to start building proofs around their codebase?


This is a great question, and unfortunately the existing proof tools are very fragmented when it comes to target languages. E.g. there are a lot of C / LLVM and Java tools, a few C++ tools, a few Rust tools. I don't know of any proof tool that is production-ready that targets Go, Swift, and Kotlin.

One thing I will say is that proof is usually an expensive technique, and not just for the reasons in my post. Even setting up a proof can be demanding. So it's worth asking where in the project it's worth applying this kind of assurance. For example, Galois often applies proofs to cryptographic code, which is security critical and self-contained in small pieces. Proof and formal methods aren't one tool, but rather a toolbox that can solve different assurance problems. There's a big difference between a scalable static analysis like Infer and a fine-grained proof tool like Coq or Galois' SAW tool.

One easy place that formal methods can be useful is in modeling features of a project design for consistency. E.g. this is useful for protocols, state machines and similar. This means you can 'debug your designs' before building them into software. If that sounds like it might be useful, I would suggest taking a look at Hillel Wayne's website: https://www.hillelwayne.com


I think there's already an industrial-grade middleground that is used by at least 2 major automated and interactive proof environments: Why3. Spark and Frama-C use it and it seems part of their success is from it and the ability to use all the SMT engines that Why3 talks to, and also there are bridges to Coq and Isabelle. I've seen people use 'em!

Sadly most of us don't get to rewrite or greenfield stuff, and although Spark is designed to be very progressive and modular, the proof of code that wasn't "written with proof in mind" is still quite the effort.


There are few tools that are both geared to mainstream programming languages and reasonable to work with. But probably the best place to start is Infer [1]. It's marketed as a static analyzer, in other words a way to prove some properties, but the separation logic at its heart is a powerful and general program proof technique.

In time, I hope and expect that the RustBelt project[2] will become a practical tool to prove Rust programs correct. It's already found some bugs in the standard library, and the main focus is currently to firm up the semantics of the language. Its separation logic engine (Iris) builds on work from Infer.

Lastly, there are less mainstream languages that are better adapted to formal proving. To pick one, I'd probably go with Idris, as it is probably best at being a general purpose programming language in addition to a mathematical tool. But there are others. Lean is exciting because it is being used as the basis of an ambitious project to formalize mathematics, and its latest version (Lean 4) is trying to be a better programming language. Coq is mature and has been used for a lot of formal proving work (it underlies Iris, mentioned above), but is not a great executable programming language. ATS 2 generates extremely efficient code (it often wins benchmarks) but is inscrutable - probably only a handful of people in the world can write nontrivial programs in it.

[1]: https://fbinfer.com/

[2]: https://plv.mpi-sws.org/rustbelt/


I figure it can't hurt to also mention the Coq / Software Foundations-style separation logic book [1]. In it you build a sequential separation logic, and prove some properties such as soundness (rather than a concurrent separation logic like e.g. Iris above).

The related work section of the summary paper also has a rather extensive list of projects using separation logic, though I'll note I didn't see Infer listed.

[1]: https://www.chargueraud.org/teach/verif/


I'd say if you are starting from a common language, you are already on the wrong path.

Doing proofs is so expensive, it is better to optimise your language for your proofs, than the other way around.




