Can the language of proof assistants be used for general purpose programming?

pyjarrett · on Oct 28, 2023

Yes. SPARK is a verifiable subset of Ada and can be mixed with it in the same project. There are SPARK crates you can use via Alire [1]. There's a SPARK implementation of the TweetNaCl crypto library[2]

You can use it for general purpose programming, but it's very difficult and various degrees of verification you can achieve with it [3]. I made a verified version of Rob Pike's simple C regex matcher in SPARK -- proving an absence of runtime errors (e.g. never raise an exception or hit an out of bounds condition).

[1]: https://alire.ada.dev/crates.html

[2]: https://github.com/rod-chapman/SPARKNaCl

[3]: https://blog.adacore.com/from-ada-to-platinum-spark-a-case-s...

pjmlp · on Oct 28, 2023

SPARK features have been merged into Ada 2012, so it is also available in regular code.

touisteur · on Oct 28, 2023

Ada2012 adopted the concept of design by contract, clamored by some of the biggest Ada users.

At the same time of this update of the language, a large effort was in progress to modernise the 'old' spark that used contacts and proof annotations in Ada code through some kind of comment-based language.

The 'merge' here was to make contracts and proof annotations actually executable Ada code (when it made sense).

SPARK is first a toolsuite that examines Ada code to prove properties on this code (variable initialisation, dataflow properties, absence of runtime retors, partial or complete functional specification). The proverbe tech being only so capable, they had to restrict the supporterd language features (calling it the 'SPARK subset') for a while (most of the restrictions have been lifted in the recent year such as dynamic allocation, exceptions, aliasing) and some very hard things improved too (e.g. floating point handling).

Also, to be complete, since executable contracts are not always enough, they introduced in the Ada language (for SPARK) what they call ghost code (seen and used only by the proof tools).

So, yes, in a way SPARK features were merged in Ada, but you won't be able to use them for proof without the proof tools, you'll have a very expressive form of runtime assertion. Some features actually cascaded back to the compiler and the actual Ada language implementations.

Also, to be easier to use for proof, the Ada2012 and 2022 language specs added a lot of sugar: quantifiers, if-expressions, case-expressions, raise-expressions, but also delta-aggregates... all greatly improving the expressivity of Ada.

bluGill · on Oct 28, 2023

I keep hear intriguing things about SPARK, but i'm not clear how I could try it in my projects. Part of this is I work with qt which already is different.

hiAndrewQuinn · on Oct 28, 2023

Is Ada actually used outside of the US military? I'd love to hear more, I've always thought that's where it was.

pjmlp · on Oct 28, 2023

Yes, as mentioned on sibling comment in Europe.

That is also VHDL, which is based on Ada, is also mostly used in Europe in detriment to Verilog.

FOSDEM tends to have a regular Ada room.

Besides military, there is avionics, transport, critical factory infrastructure, lots of stuff that falls under high integrity computing.

touisteur · on Oct 28, 2023

We also see some people from banks passing by sometimes (critical realtime trading room apps). And medical, drones or satellites payload startups.

Alas FOSDEM has changed its room attribution scheme for 'smaller' communities and IIUC there was no Ada room this year. Hopefully there'll be a space next year.

trealira · on Oct 28, 2023

In addition to what pjmlp said, it's also used by Nvidia, apparently.

https://blog.adacore.com/nvidia-security-team-what-if-we-jus...

SubjectToChange · on Oct 28, 2023

It’s surprisingly used a bit more in Europe than the US. You can look at AdaCore’s users/case studies to get a brief overview of industries using it.

hirple · on Oct 28, 2023

I've been interested in this question myself - and recently wrote a basic web 'microframework' for Lean 4 [1].

I've loved being able to use Lean's macro system to write JSX-like HTML:

  def header :=  <header class_="header">
    <h1>todo</h1>
  ...

Ironically I've come up a bit short on how to use dependent types to do anything more advanced than matching handlers' to route types, which Servant [2] can do fantastically already.

I would love to be able to write types that describe, e.g., assumptions made about external resources, so that I can prove that "assuming my DB has such-and-such a latency, my page will always render in such-and-such a time" or something similar - but it's a bit beyond me at the moment.

[1] https://github.com/alex-wellbelove/LeanServer [2] https://hackage.haskell.org/package/servant

ykonstant · on Oct 28, 2023

Before taking the plunge on using dependent types in production, please heed the warnings discussed here: https://lean-lang.org/functional_programming_in_lean/depende...

rabbits77 · on Oct 27, 2023

I can’t see why this should not be the case. Logic Programming languages like Prolog can be used for general purposes. The main catch is that you need to be really good at Prolog to use it to full generality. But if you’re willing to put in the time it should be able the same.

mostlylurks · on Oct 28, 2023

Understanding the techniques that make general purpose programming in prolog easy is one of the easiest parts of learning prolog, such that even beginners shouldn't have trouble with it. Learning to use prolog the way it was meant to be used is actually much harder, and you'll certainly have a good grasp of general-purpose programming in prolog long before you become "really good" at prolog.

Jaxan · on Oct 28, 2023

My feeling is that proof assistants are much closer to general purpose programming than logic programming such as prolog.

sargstuff · on Oct 28, 2023

Just add a GPT natural language to language proof assistant translator as a front end (with appropriate audio device permissions)

Change the need to be really good at proglog to just need to distinctly describe what want/need.

schoen · on Oct 28, 2023

Adding a large language model is going to move you in the exact opposite direction from ease of formal verification.

epgui · on Oct 28, 2023

I mean... Not necessarily? It depends how you use it or what part of the process you delegate to it, no?

Whether AI produces Lean code or a human produces Lean code... The Lean code should speak for itself. Both humans and AI are the weak link in either case, and it's unclear to me which of the two is the weaker.

schoen · on Oct 28, 2023

Oh, I think I misunderstood the proposal there. I thought the language model was meant to be producing (only) code, not proofs.

If it manages to produce successful proofs of pre-existing property specifications, then that's great!

ratmice · on Oct 28, 2023

Yeah, I think the catch is the "pre-existing property specifications" part. I think you need to carefully understand both AI & Human generated propositions. While proofs once checked are just gravy.

jfoutz · on Oct 28, 2023

I've been thrashing around with agda (as a hobby) for a few months. Maybe it's time to take lean for a spin.

My thinking is still pretty fuzzy. But I feel like there's something important and valuable with the idea of "this is the specification of the software" and the "this software meets that specification".

Now, yes, maybe the specification is wrong. Proofs are hard to write. All software has a bunch of implicit constraints. I forget them for the code I wrote, and it's even tougher to figure them out when I'm handed a legacy project where the author is long gone.

There's something really important about that dual view of, this is what it's supposed to do, and this is what it does. And tools can help check that they agree!

There's like the classic Haskell example of verified email - at some point the user was sent an email and they clicked a link, so that string is special, pretty sure foo@example.com is a real email.

The ability to slap a type on something, then have tools tell you "this is everything that needs to be updated" is really appealing. Like, The account must be in good standing before shipping product. So I keep puttering around with it.

It's often difficult to add features and be super sure you're not breaking stuff. Well, maybe not you, but me, for sure.

cortesoft · on Oct 28, 2023

> There's something really important about that dual view of, this is what it's supposed to do, and this is what it does. And tools can help check that they agree!

Isn’t this the whole concept behind writing tests?

jfoutz · on Oct 28, 2023

It is! And I'm a big fan of test systems that optionally invert the logic and require failure - to weed out bullshit tests.

But there are some things that are challenging to test for. Tests sample the space, proofs define the space. you can't build it wrong. (you can sure have a bad spec, and build the wrong thing though)

Tests definitely help. It sorta feels like there's value left on the table. The spec, the types let you define broader constraints, and you catch them all.

Yoric · on Oct 28, 2023

I'm one of the people who believe that TDD and safety-by-design (which is based on strong, static typing) are actually two sides of the same coin. In a sufficiently powerful type system, they could possibly even be two implementations of the same thing.

- Start with spec. - Write the specs as something that the compiler/test harness will check for you. - Write the code.

The main differences being, of course, that you'll typically get your compiler error messages much faster than your test error messages (typically while you're writing the code), which makes them easier to fix - on the other hand, in pretty much all languages, the type system isn't expressive enough to represent all the specs that can be represented as tests.

In other words, in the current state of the world, we need both.

hitchdev · on Oct 28, 2023

They're attacking the problem from two directions - bottom up and top down.

I view the type system as a way to whittle down the state space to the bare minimum while the tests validate that that the behavior of that whittled down execution space matches what is required.

I think investment in a type system to replace tests very quickly reaches a point of diminishing returns and vice versa. A really good type system won't ever beat ok tests + an ok type system and vice versa.

Yoric · on Oct 28, 2023

We're in agreement, then!

fiddlerwoaroof · on Oct 28, 2023

I’d like to see one of these systems designed around generating test suites for the specification. So, you’d prove your algorithm correct using some inefficient representation (nats for integers or whatever) and then the verifier would detect edge cases and generate a junit test suite for the real implementation.

pawelmurias · on Oct 28, 2023

If you can build a simpler implementation just using random data can help a lot.

aeonik · on Oct 28, 2023

I'm interested in learning Idris2, I thought it was intended to blend the strengths of proof assistants and general purpose programming:

https://github.com/idris-lang/Idris2

Yoric · on Oct 28, 2023

Well, I'd claim that Idris is pretty much that. A general purpose (if highly experimental) programming language with the power of a proof assistant.

Also, I have seen people write proofs in Coq and extract them to compilable OCaml source code.

Buttons840 · on Oct 28, 2023

Dependant languages let you express almost anything in the type system. "The array I receive will contain only valid Sudoku solutions", you can express that and the compiler can confirm it's true at compile time. This is very powerful but requires writing proofs to help the compiler.

I wonder if LLMs could help here? It's a burden to write the proofs, but if LLMs become good enough and flexible enough, the dependant type system might serve as a formal layer on top of a LLM, while the LLM fills in the proofs for us.

A dependant type system could serve as the "world" an AI lives and grows in. Currently LLMs mimic our words without a reality to push back. LLMs can't be wrong, they try to mimic, but they have no hard limitations, whatever word they spit out has only soft rewards or punishments. A type system could be the reality an AI lives and learns in, with a hard ground it can fall on, hard walls it cannot pass, and other hard limitations it can explore in.

ipsum2 · on Oct 27, 2023

I suspect that a Python DSL on top of Lean4 would be hugely popular, similar to PyTorch replacing existing Torch, which was written in Lua.

rtpg · on Oct 28, 2023

I think the biggest jump between interactive theorem provers like Lean "normal" languages is the "interactive" part: Lean or Coq or the like really work well in a specialized IDE mode where you're editing the file and getting instant feedback about the state of your proof. You might be able to get a Python DSL to work, but there would be a lot of tooling work involved to pipe everything back.

You can write proofs in a non-interactive way, but you're signing yourself up for a lot of tedium.

EDIT: A video showing someone working through some simple stuff. You can obviously do a lot of things "in your head" but having the feedback loop is helpful. Though thinking about it more I don't think this precludes a Python DSL entirely. It would just require some smart design to support this form of interactivity. https://www.youtube.com/watch?v=VYFTph2izIs

wslh · on Oct 27, 2023

... also like using Z3 with Python [1].

[1] https://ericpony.github.io/z3py-tutorial/guide-examples.htm

ipsum2 · on Oct 27, 2023

This is quite nice! I wasn't aware of this.

Yoric · on Oct 28, 2023

That sounds... contradictory, isn't it? Python is immensely hostile to static analysis (note that I write this as someone who writes Python for a living, using pyright/mypy). If you remove the ability to perform static analysis from Lean, what's left?

Or do you mean a Python-like syntax on top of Lean4?

Note: Pretty sure that libtorch is written in C++ and CUDA.

staunton · on Oct 28, 2023

Why do you suspect that? (Or was it ironic?)

epgui · on Oct 27, 2023

I have a very negative visceral reaction at the thought. (I am someone who uses python for work, and not for fun…)

b9r5 · on Oct 28, 2023

I thought this post was fun: https://blog.codeminer42.com/overcoming-challenges-and-craft...

armchairhacker · on Oct 28, 2023

tl;dr; Yes. But trying to prove anything beyond very tiny functions (especially things like GUIs and world state) is near-impossible and massive overkill. Also, you'll need to work around termination checking (even if your functions always terminate, proving that is another matter).

Theorem provering languages are cool. Being able to prove the absence of bugs, enforce complex invariants, etc. is cool and useful in certain cases (like cryptography and safety-critical software). But it's also very hard, at least state-of-the-art, so often I feel writing good tests is better.

AnimalMuppet · on Oct 28, 2023

Yeah, to me that's the real question: Can the language of proof assistance be efficiently used for general purpose programming? As you say, right now it's too hard for general use.

deterministic · on Nov 1, 2023

Nope. CompCert is an example of a small team of people writing a proven correct C compiler. seL4 is another. The problem is training. Universities are currently not training CS students to prove code correct. This will change. But slowly.

MichaelZuo · on Oct 28, 2023

Didn't some Australian group manage to prove a simplified kernel?

throwaway17_17 · on Oct 28, 2023

seL4 has a formal correctness prof that was developed by NICTA. The project started in 2006 and the formal proof and artifact was submitted in 2009. The elements of the implementation and resulting qualities present during run time are well defined but are limited to exactly those properties the project chose, like lack of dead and livelocks, arithmetic based exceptions, and buffer overruns.

L4 is less a simplified kernel and more of a microkernel, but it is fully general purpose.

One of the neat things they did was transport memory/resource management to userland and maintain a strict capabilities scheme for ensuring correctness. The other was the cost estimate they claim, which was something like a 60% cost saving for doing the formally verified programming compared to standard ‘high assurance’ programming.

It was a cool project but is not often used to support further pushes into more general systems level applicability of ‘theorem proving’ based programming languages.

RetroTechie · on Oct 28, 2023

https://sel4.systems

Working on a number of platforms, verified on some. Multicore support is an ongoing effort afaict.

An OS built on this kernel is still subject to some assumptions (like, hardware working correctly, bootloader doing its job, etc). But mostly those assumptions are less of a problem / easier to prove than the properties of a complex software system.

As I understand it, guarantees that seL4 does provide, go well beyond anything else currently out there.

MichaelZuo · on Oct 28, 2023

Thanks for the link, that's pretty advanced stuff then.

armchairhacker · on Oct 28, 2023

Also a C compiler (https://compcert.org/). I did exaggerate bit in saying that anything non-trivial is "nearly impossible".

However, both CompCert and sel4 took a few years to develop, whereas it would only take months if not weeks to make versions of both which aren't formally verified but heavily tested.

deterministic · on Nov 1, 2023

It takes years to develop a quality C compiler. I am a professional compiler writer and the difference between a toy compiler and a production compiler is vast.

MichaelZuo · on Oct 29, 2023

It demonstrates that proving a full featured operating system, as complex as Windows 7 for example, is within the realm of possibility. It would just likely take a Manhattan project equivalent, with a truly unlimited budget.

agentultra · on Oct 28, 2023

I think so.

https://agentultra.github.io/lean-4-hackers/

LightMachine · on Oct 28, 2023

We have built an entire real time game in Kind lang...