Hi Mr Presler, I don't know if you realize that you've preached the exact same a...

pron · 2024-05-16T15:47:07 1715874427

Hi. I think that's the normal dynamics when an expert in a particular subject discusses something with laymen. I'm not a researcher, but I have applied a range of formal methods for software verification for about a decade on systems ranging from safety-critical real-time embedded software to distributed databases. I constantly learn from people with a similar or higher level of experience. Even in my comments today you can find me mentioning a technique I only learnt about relatively recently from the people at Antithesis (I'd known about it since FoundationDB, but I didn't know as much about the details of their state exploration heuristics).

It is one of the greatest disappointments of my carreer that that group is still small; even most researchers don't have much experience in the mainstream application of formal methods, plus they each tend to focus on a narrow range of methods. The vast majority of practitioners care about correctness, but very few explore approaches other than simple testing or use of simple type systems; even something as basic and as old as non-trivial assertions -- which is a form of formal specification -- is underutilised.

BTW, a complicating factor is that verification techniques often don't extrapolate well from one application domain to another; a method that is effective for cryptography may not be effective for distributed systems and vice-versa.

littlestymaar · 2024-05-16T16:57:59 1715878679

> I think that's the normal dynamics when an expert discusses something with laymen in a particular subject.

But this is hubris, or at least arrogance, on your part: this is a very broad subject with many impact (reliability, security[1], language UX, teach-ability, productivity impact in heterogeneous teams, and so on), and most of the people who've debated with you are in fact often more experts than you in at least one aspect of the topic. Your obstinacy to see them as mere “laymen” is what makes you unable to improve in your understanding of the big picture, which come by aggregating the wisdom that comes from experts of complementary topics to yours.

Your inability to do so is likely the reason for your intellectual stagnation.

And this kind of antagonistic posturing is actually a problem for the whole world, because as you are proudly arguing about the “failures of sound methods” and the success of the alternative, the truth is that even if the real world deployment of TLA+ far exceeds the one of Coq, it's still pretty negligible in the grand scheme of things. And arguing that “but distributed systems do use TLA+ all the time” would be as pointless as saying “but Airbus is using Coq to make planes fly”.

In fact, none of the state of the art methods have gained widespread use, and this industry as a whole keeps pumping out tons of broken software every day without any sign of changing course.

The only successful (in terms of adoption) means of reducing the numbers of bugs are the following:

- static typing [2]

- unit testing

- fuzzing

All of them are pretty bad at their job in average, but at least together they avoids at a billion dollar worth of bugs every years, which is still much better than any of the technically superior, but practically unheard of but most software developers, alternatives. The jury is still out on what can actually work at scale.

Preaching for years on internet forums that your option of choice is the good option, despite its lasting failure to get mainstream adoption is as delusional as the posture you are criticizing, and at least those people are trying something instead of losing their time being patronizing over the internet.

[1] security is very different from reliability in that it has an adversarial nature: it doesn't matter if a bug is “less likely than hardware failure” on random input, if an attacker can trigger on purpose it with high probability. This is for instance something you have been consistently overlooking over these years.

pron · 2024-05-16T18:52:14 1715885534

TLA+ and Coq (at least as far as their use in software goes) are both examples of formal specification languages offering a deductive proof systems (and TLA+ also offers a model checker -- yet another sound method). And I don't think you'd find me selling either one of them in this conversation, in which I mostly focus on the limitations of such sound tools.

Unit tests (depending how you interpret the properties of interest) and fuzzing are indeed probably the most famous examples of unsound methods, and so it sounds like you actually agree with me: it is unsound methods that have been more successful in practice except for the simple properties proven by simple type systems. But no matter what argument you find more or less convincing, you should try to listen and learn -- because it's clear that you don't understand the conversation -- rather than shout from the peanut gallery.

If you find the terminology unfamiliar, you can think of sound methods as "static" ones [1], i.e. they don't execute the program, while unsound methods are usually "dynamic", i.e. require executing the program [2]. Indeed, my point is that "static" and "dynamic" methods are most effective when used in combination, and it is the static methods that are more cost effective for simple properties -- their effectiveness drops as the properties become more "interesting" -- and it is the dynamic methods that are more cost-effective for more complex properties. It sounds like you agree.

Now perhaps you can understand my original point: a certain attribute of a programming language that makes certain sound (static) methods easier when verifying certain classes of properties doesn't help all that much if those methods are so limited to begin with for that class either in the range of properties for which they're effective or their scalability. I.e. making a method that, in the relevant situations, is 100x too expensive for your budget only 90x more expensive for your budget doesn't have much of an impact on its overall practicality.

[1]: This is more of a large overlap than an equivalence. A human or LLM reviewing code is also a static form of correctness assurance, but it isn't a sound one.

[2]: I won't confuse you with abstract interpretation and model checking, both of which are sound and are can be viewed as either static or dynamic depending on your perspective. I consider them static as they don't rely on full concrete executions of the program. Some interesting unsound methods like concolic testing have some "static" parts, but I consider them ultimately dynamic.

littlestymaar · 2024-05-16T19:44:58 1715888698

I only mentioned Coq vs TLA+ because I've read your blog and I know you have strong taste preferences between those two.

That's because, unlike you, I actually try and understand people I don't agree with and see where they come from and under what perspective their views are indeed interesting, and that's how you learn how to understand complex (not difficult) topics.

But that, my friend, requires that you don't immediately consider yourself superior to your interlocutor and first consider them like peers who have different perspective on things because they have a different background. That's the inevitable conditions for improvement.

Thinking about other as mere peasants is probably more intellectually comfortable, but it's also far less constructive. Needless to say, it's also not a great way to earn respect.

DEADMINCE · 2024-05-17T12:17:46 1715948266

Your lecturing here seems to be making a lot of insulting assumptions unnecessarily. It's probably best to try and avoid giving 'advice' like this.