I wouldn't say it's a TLA+ alternative because it cannot do the most powerful and useful things TLA+ does (esp. refinement), but it is an alternative for programmers who just want to specify at a level closer to code and model-check their specifications.
Every time I see a new TLA+ replacement my first thought is "Oooh this will be good for the 99% of normal stuff people do with TLA+."
Then I look through some of the specs I've written with clients and find the one absolutely insane thing I did in TLA+ that would be impossible in that replacement.
I believe this is really the tragedy of formal verification tools. Everybody wants a tool as robust as a compiler. At the same time, nobody wants to invest in the development of such tools. Microsoft Research 20 years ago was probably an exception. Other companies immediately hide these tools and their benchmarks behind IP and closed source. As a result, we have early-stage MVPs developed by 1-3 people.
When you say peer reviews, do you mean academic publications or testimonials? I imagine it would be difficult to publish a paper at an academic conference proposing an alternative syntax for anything, even if it were better.
- It helps you understand complex abstractions & systems clearly.
- It's extremely effective at communicating to others the components that make up a system.
Let me give you a real, practical example.
In AI models there is a component called a "Transformer". It underpins ChatGPT (the "T" in ChatGPT).
If you read the 2017 Transformer paper "Attention Is All You Need", you'll see the authors use human language, diagrams, and mathematics to describe their idea.
However, if you try to build your own Transformer using that paper as your only resource, you're going to struggle to turn what they are saying into working code.
Even if you get the code working, how sure are you that what you have created is EXACTLY what the authors are talking about?
English is too verbose, diagrams are open to interpretation, mathematics is too ambiguous/abstract, and already-written code is too dense.
TLA+ is a notation that tends to be used to "specify systems".
In TLA+ everything is defined in terms of a state machine: hardware, software algorithms, consensus algorithms (Paxos, Raft, etc.).
So why TLA+?
If something is "specified" in TLA+:
- You know exactly what it is — just by interpreting the TLA+ spec
- If you have an idea to communicate, TLA+-literate people can understand exactly what you're talking about.
- You can find bugs in algorithms, hardware, and processes just by modeling them in TLA+.
So before building hardware or software you can check its validity and fix flaws in its design, rather than committing expensive resources only to find issues in production.
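The core idea behind that kind of checking can be sketched in ordinary code. The following Python is a toy, hand-rolled version of what an explicit-state model checker does (not TLA+ itself or its TLC checker): model a system as states and transitions, exhaustively explore every reachable state, and check an invariant in each one. The example models a naive two-process lock with a check-then-act race, which the search catches before any "hardware" is built.

```python
from collections import deque

# A naive two-process lock: each process first observes the lock free,
# then takes it in a *separate* step -- the classic check-then-act race.
# A state is (pc_of_p1, pc_of_p2, lock_taken).
INIT = ("idle", "idle", False)

def successors(state):
    """All states reachable in one step, from either process."""
    pcs, lock = list(state[:2]), state[2]
    out = []
    for i in (0, 1):
        nxt = pcs.copy()
        if pcs[i] == "idle" and not lock:      # observe lock free
            nxt[i] = "entering"
            out.append((nxt[0], nxt[1], lock))
        elif pcs[i] == "entering":             # take the lock (too late!)
            nxt[i] = "critical"
            out.append((nxt[0], nxt[1], True))
        elif pcs[i] == "critical":             # release
            nxt[i] = "idle"
            out.append((nxt[0], nxt[1], False))
    return out

def mutual_exclusion(state):
    """Invariant: the two processes are never both in the critical section."""
    return not (state[0] == "critical" and state[1] == "critical")

def check(init, successors, invariant):
    """Breadth-first search of the whole reachable state space."""
    seen, queue = {init}, deque([init])
    while queue:
        s = queue.popleft()
        if not invariant(s):
            return s                           # counterexample state
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return None                                # invariant holds everywhere

bad = check(INIT, successors, mutual_exclusion)
print(bad)  # -> ('critical', 'critical', True): the design flaw, found by search
```

A real checker like TLC adds counterexample traces, fairness, and liveness checking, but the design-before-build payoff is the same: the flaw surfaces from a few dozen abstract states instead of from production.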
Is that a practical example? Has anyone specified a transformer using TLA+? More generally, is TLA+ practical for code that uses a lot of matrix multiplication?
It’s really not. TLA+ works best for modeling concurrent systems and state machines with few discrete states. It can find interesting interleavings of events that would lead to a violation of your system's properties.
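To make "interesting interleavings" concrete, here is a minimal Python sketch (my own illustration, not TLA+ output): two threads each perform a non-atomic read-then-write increment, and brute-force enumeration of every valid interleaving exposes the lost-update schedule alongside the correct one.

```python
from itertools import permutations

# Each thread performs two non-atomic steps: read x, then write back read+1.
def run(schedule):
    x = 0
    local = {"A": None, "B": None}
    for tid, step in schedule:
        if step == "read":
            local[tid] = x          # take a private copy of x
        else:
            x = local[tid] + 1      # write back stale copy + 1
    return x

steps = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]

finals = set()
for order in permutations(steps):
    # Keep only schedules where each thread reads before it writes.
    if order.index(("A", "read")) < order.index(("A", "write")) and \
       order.index(("B", "read")) < order.index(("B", "write")):
        finals.add(run(order))

print(sorted(finals))  # -> [1, 2]: some interleavings lose an update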
> What Formal Specification Is Not Good For: We are concerned with two major classes of problems with large distributed systems: 1) bugs and operator errors that cause a departure from the logical intent of the system, and 2) surprising ‘sustained emergent performance degradation’ of complex systems that inevitably contain feedback loops. We know how to use formal specification to find the first class of problems.

> However, problems in the second category can cripple a system even though no logic bug is involved. A common example is when a momentary slowdown in a server (perhaps due to Java garbage collection) causes timeouts to be breached on clients, which causes the clients to retry requests, which adds more load to the server, which causes further slowdown. In such scenarios the system will eventually make progress; it is not stuck in a logical deadlock, livelock, or other cycle. But from the customer's perspective it is effectively unavailable due to sustained unacceptable response times.

> TLA+ could be used to specify an upper bound on response time, as a real-time safety property. However, our systems are built on infrastructure (disks, operating systems, network) that do not support hard real-time scheduling or guarantees, so real-time safety properties would not be realistic. We build soft real-time systems in which very short periods of slow responses are not considered errors. However, prolonged severe slowdowns are considered errors. We don’t yet know of a feasible way to model a real system that would enable tools to predict such emergent behavior. We use other techniques to mitigate those risks.
Formal methods, including TLA+, also can't prevent (or can only work around) side channels in hardware and firmware that are not themselves verified. But that's a different layer.
> This raised a challenge; how to convey the purpose and benefits of formal methods to an audience of software engineers? Engineers think in terms of debugging rather than ‘verification’, so we called the presentation “Debugging Designs” [8]. Continuing that metaphor, we have found that software engineers more readily grasp the concept and practical value of TLA+ if we dub it: Exhaustively testable pseudo-code

> We initially avoid the words ‘formal’, ‘verification’, and ‘proof’, due to the widespread view that formal methods are impractical. We also initially avoid mentioning what the acronym ‘TLA’ stands for, as doing so would give an incorrect impression of complexity.
Isn't there a hello world with vector clocks tutorial? A simple, formally-verified hello world kernel module with each of the potential methods would be demonstrative, but then don't you need to model the kernel with abstract distributed concurrency primitives too?
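For anyone wanting the "hello world with vector clocks" level of the idea without any kernel modeling, the mechanism itself fits in a few lines. This is a minimal Python sketch of vector clocks (my illustration, not from any tutorial): increment your own entry on a local event, take the element-wise max plus a tick on message receipt, and compare element-wise to recover the happened-before partial order.

```python
# Minimal vector clocks: each process keeps a counter per process.
def tick(clock, pid):
    """Local event at `pid`: bump that process's own counter."""
    c = dict(clock)
    c[pid] = c.get(pid, 0) + 1
    return c

def merge(local, received, pid):
    """Message receipt: element-wise max with the sender's clock, then tick."""
    c = {p: max(local.get(p, 0), received.get(p, 0))
         for p in set(local) | set(received)}
    return tick(c, pid)

def happened_before(a, b):
    """a -> b iff a <= b element-wise and a != b; neither way means concurrent."""
    keys = set(a) | set(b)
    return all(a.get(k, 0) <= b.get(k, 0) for k in keys) and a != b

# p1 sends its clock to p2; the send causally precedes the receive.
p1 = tick({}, "p1")            # {'p1': 1}
p2 = merge({}, p1, "p2")       # {'p1': 1, 'p2': 1}
print(happened_before(p1, p2))                            # -> True

# An event on an isolated p3 is concurrent with p1's event.
p3 = tick({}, "p3")
print(happened_before(p1, p3), happened_before(p3, p1))   # -> False False
```

The kernel-module question still stands, though: to verify anything against the kernel you do need some abstract model of the scheduler and memory model, which is exactly the modeling cost being asked about.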
Formal specification's benefits are clear and, I think, well understood at this point. If you want to ensure that your specification is coherent and doesn't have unexpected behaviour, a formal specification is a must. It's even a legal requirement nowadays for some systems in safety-critical applications.
The issue with TLA+ is that it doesn't come from the right side of the field. Most formal specification tools were born out of necessity in the engineering fields that required them. TLA+ is a computer science tool, and it sometimes shows in the vocabulary used and in the way it is structured.
https://quint-lang.org/