There are many very simple heuristics for solving 3-SAT, some even with analytical convergence estimates. A simple random walk (Algorithm P in Knuth's Satisfiability fascicle), which iterates through the clauses and flips a variable at random if it encounters a violated clause, will do the trick too.
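A minimal sketch of that kind of random walk in Python (the DIMACS-style literal encoding, the flip limit, and the choice to pick a random violated clause rather than scanning clauses in order are my own, not Knuth's exact presentation):

```python
import random

def random_walk_sat(clauses, num_vars, max_flips=100_000):
    """Random-walk SAT heuristic: start from a random assignment and,
    while some clause is violated, flip one of its variables at random.
    A literal k means variable |k| is true if k > 0, false if k < 0."""
    assignment = {v: random.choice([True, False]) for v in range(1, num_vars + 1)}
    for _ in range(max_flips):
        violated = [c for c in clauses
                    if not any((lit > 0) == assignment[abs(lit)] for lit in c)]
        if not violated:
            return assignment            # all clauses satisfied
        var = abs(random.choice(random.choice(violated)))
        assignment[var] = not assignment[var]
    return None                          # gave up; no conclusion either way

# (x1 or x2 or not x3) and (not x1 or x3): a satisfiable toy instance
print(random_walk_sat([(1, 2, -3), (-1, 3)], 3))
```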
Conversion of an NFA graph to DFA (in regex processing) seems to be a kind of supercompilation. The NFA is treated as a process, and is simulated for all possible input symbols at every step. The supercompiler makes note of what states it goes into and turns sets of them into DFA states.
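For what it's worth, here is a toy subset construction that makes the analogy concrete (the dict-of-dicts NFA encoding is my own hypothetical choice, and epsilon transitions are ignored for brevity):

```python
def nfa_to_dfa(nfa, start, alphabet):
    """Subset construction: simulate the NFA for every input symbol and
    turn each reachable set of NFA states into a single DFA state.
    `nfa` maps state -> symbol -> set of successor states."""
    start_set = frozenset([start])
    dfa = {}                  # frozenset of NFA states -> symbol -> frozenset
    worklist = [start_set]
    while worklist:
        state_set = worklist.pop()
        if state_set in dfa:
            continue
        dfa[state_set] = {}
        for sym in alphabet:
            successors = frozenset(s2 for s in state_set
                                   for s2 in nfa.get(s, {}).get(sym, ()))
            dfa[state_set][sym] = successors
            if successors not in dfa:
                worklist.append(successors)
    return dfa

# NFA for strings over {a, b} whose second-to-last symbol is 'a'
nfa = {0: {'a': {0, 1}, 'b': {0}}, 1: {'a': {2}, 'b': {2}}}
print(len(nfa_to_dfa(nfa, 0, 'ab')))  # 4 DFA states for this NFA
```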
NP-complete means that a problem is both NP-hard (it's as hard as anything else in NP) and is in NP (with all that membership in NP implies: there is a polynomial-time witness and verifier for a "yes" answer, it reduces in polynomial time to any NP-hard problem, etc.).
To be slightly nitpicky, it is problems that belong to complexity classes and not algorithms. Binary search is an algorithm that solves searching in a sorted list.
Funnily, although we usually think of it as having complexity O(log n), that does not hold true for the Turing machine model, with which the complexity classes P and NP are defined.
"P" is the class of problems that can be solved by a deterministic Turing machine in polynomial time. "NP" stands for "nondeterministic polynomial time," and is the class of problems for which a solution can be _verified_ in polynomial time by a deterministic Turing machine, given the correct "hint" or "certificate." Every problem in P is also in NP, since if a problem can be solved in polynomial time, its solution can certainly be verified in polynomial time (just solve the problem again).
Certainly P is a subset of NP, but is this statement always true?
Consider a Galton board where a marble is dropped into an array of pegs. At each peg, it can go left or right at random (assume it never bounces further). After some number of choices, it falls into a slot.
Finding a solution (a slot that a marble can fall into) is easy: drop one marble. But it could take many marbles to verify that a marble can fall into a given slot.
As described, that is easy: all left choices, all right choices, everything in between. But imagine that the choices aren't fair, directly observable, or linear (LR != RL). Then it gets tricky.
I've never thought about non-deterministic problems of that sort in the context of computational theory, but it isn't uncommon in nature.
To make meaningful statements about P and NP problems, you probably need to be able to model your problem as a computational one.
If, for example, you model it as a graph, with each peg represented by a node and each bounce direction to another peg by a directed edge, and assign probabilities to all the edges, then you can simply flood-search outwards to the end of the graph, accumulating probabilities by multiplication. Any node on the boundary with probability higher than zero can be reached, given enough balls. This job is clearly in P.
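A rough sketch of that flood search, assuming the peg graph is a DAG (as a Galton board is) and that every peg with incoming edges is fed from the top; the encoding and names are mine:

```python
from collections import defaultdict, deque

def slot_probabilities(edges, start):
    """Flood outward from the top peg, pushing probability mass along
    weighted edges. `edges` maps node -> list of (successor, probability).
    Any boundary node that ends up with probability > 0 is reachable."""
    indegree = defaultdict(int)
    for outs in edges.values():
        for succ, _ in outs:
            indegree[succ] += 1
    prob = defaultdict(float)
    prob[start] = 1.0
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for succ, p in edges.get(node, []):
            prob[succ] += prob[node] * p       # accumulate by multiplication
            indegree[succ] -= 1
            if indegree[succ] == 0:            # all of succ's parents processed
                queue.append(succ)
    return dict(prob)

# A tiny two-level board with an unfair left/right split at each peg
edges = {'top': [('L', 0.7), ('R', 0.3)],
         'L': [('slot0', 0.7), ('slot1', 0.3)],
         'R': [('slot1', 0.7), ('slot2', 0.3)]}
print(slot_probabilities(edges, 'top'))
```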
I'm not sure it makes sense to talk about "P" for physical problems without a computational model, but I'm not a complexity theorist.
Strictly speaking, the fundamental complexity classes (in order: L, NL, P, NP, PSPACE, EXPTIME, NEXPTIME, EXPSPACE) apply only to decision problems, which pose yes/no questions. For example, it's commonly said that the Traveling Salesman Problem is NP-complete, but this only applies to the decision version of the problem: "Given graph G with edges weighted with positive integers and a maximum weight n, does there exist a Hamiltonian cycle through the vertices such that the sum of the weights of the edges used is at most n?" In addition, for a problem to be in NP, the polynomial-time-verifiable certificate only has to exist for yes instances; there is no need for it to also be verifiable that no such cycle exists (a problem verifiable in polynomial time for the no cases is in co-NP).
Decision problems are much easier to reason about, which is why they are used to construct and define the fundamental complexity classes, though it is certainly possible to define analogous classes for different types of problems. See for example, Krentel (1988) "The complexity of optimization problems", which considers TSP-OPT: https://doi.org/10.1016/0022-0000(88)90039-6
I believe you're misreading the comment? It says "binary search is in NP (because it is in P)"... i.e. the fact that binary search is in P implies that it is in NP. Which is true because P is a subset of NP.
I'm almost completely clueless with both 3-SAT and supercompilation, but I'm willing to bet that if you found a neat trick that solves 3-SAT problems easily, then either or both of the following hold: (a) there are classes and/or sizes of 3-SAT problems for which your solution breaks down badly (goes exponential), or (b) what you're doing reduces to a 3-SAT strategy that the people who enter 3-SAT contests[1] already knew about a long time ago.
The solution has exponential time and space complexity, which I noted in the blog post. It is only interesting for its simplicity and as a stress test for supercompilers.
Ah, I missed that. Makes perfect sense! By the way, if you're interested in the "this problem can blow up badly because it can be converted to 3-SAT", you might like https://research.swtch.com/version-sat
Touché! I guess I was thinking that there are often sub-classes of NP-complete problems that are amenable to much more efficient algorithms, and it might be easy to accidentally not try any hard ones. Turns out the author fully acknowledged exponential explosion, which I would have known had I read the Final Words section instead of skimming everything after the intro!
3-SAT is at least as hard as SAT, or SAT ≤p 3-SAT. We know this because we can translate arbitrary SAT instances into 3-SAT in polynomial time. 3-SAT is in NP because we can check a candidate solution in polynomial time by plugging in the values and evaluating.
More accurate than “an NP-complete subset of SAT” would be to say that SAT is polynomial-time reducible to 3-SAT or that we can solve SAT in terms of 3-SAT.
Combining both of these, each is reducible to the other. In some sense, they are the same problem. 3-SAT retains just enough structure to remain NP-complete; 2-SAT is in P.
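The standard clause-splitting trick behind that reduction, sketched in Python (the fresh-variable numbering is mine, and clauses that already have at most three literals are left alone; padding them to exactly three is routine):

```python
def sat_to_3sat(clauses, num_vars):
    """Split every clause with more than 3 literals into a chain of
    3-literal clauses linked by fresh variables:
    (a or b or c or d)  becomes  (a or b or y) and (not y or c or d).
    Literals are nonzero ints, DIMACS-style; returns (clauses, num_vars)."""
    out = []
    next_var = num_vars
    for clause in clauses:
        clause = list(clause)
        while len(clause) > 3:
            next_var += 1                         # fresh linking variable y
            out.append(clause[:2] + [next_var])   # (l1 or l2 or y)
            clause = [-next_var] + clause[2:]     # (not y or l3 or ...)
        out.append(clause)
    return out, next_var

# (x1 or x2 or x3 or x4 or x5) -> three 3-literal clauses, two new variables
print(sat_to_3sat([[1, 2, 3, 4, 5]], 5))
```

The output is only polynomially (in fact linearly) larger than the input, which is what makes the "at least as hard" claim go through.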
Is this just compiler optimized memoization? I mean this is great for parallelization as well but I thought this was standard in the functional programming paradigm. Maybe I am wrong.
As I recall, memoization usually involves associating the results of calling a pure function with the particular input values, so the computation of the function is reduced to a lookup operation.
The calculation of the pure function is still used for any other combination of input values.
Supercompilation looks at the unevaluated symbolic recursive code and in some cases replaces it with simpler, specialized code tuned to the actual input values, so it memoizes not only the result but also the code used to calculate the function in the first place.
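For contrast, memoization in the usual sense is just this kind of per-argument caching (a standard-library example, nothing specific to supercompilation):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Classic memoized Fibonacci: results are cached per argument,
    but the same generic code still runs for every new argument."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(40))  # fast because repeated subcalls hit the cache
```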
There are currently no production-grade compilers that make use of supercompilation in the functional world. Memoization is about caching results of pure functions; supercompilation is about deep program transformation.
Any idea why there aren't? Not even to a small degree? To me, it looks like fairly straightforward rewriting. Perhaps it's hard to tell how much is useful?
> deep program transformation
It doesn't have to be deep, does it? You can't transform the entire program, so there's an arbitrary cut-off anyway.
I think there are a few big problems. Performance is a big one; code bloat is another. You also need a very good termination heuristic or there will be edge cases where compilation time spins out of control.
Supercompilation acts as a global program optimization. As far as I'm aware, there have been no attempts to supercompile separate libraries.
In fact, supercompilation subsumes many common optimizations, such as function inlining, constant propagation, list fusion, etc. So we can say that it's actually used "to a small degree", although I wouldn't really call it "supercompilation" per se.
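A hand-written illustration of the kind of rewriting meant here (not the output of any actual supercompiler): inlining and fusion collapse a small pipeline into one specialized loop.

```python
# Before: a generic pipeline built from small functions.
def square(x):
    return x * x

def sum_of_squares(xs):
    return sum(square(x) for x in xs)

def result(n):
    return sum_of_squares(range(n))

# After: roughly what inlining + fusion would leave behind -- one
# specialized loop with the intermediate generator and calls gone.
def result_specialized(n):
    total = 0
    for x in range(n):
        total += x * x
    return total

assert result(1000) == result_specialized(1000)
```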
The reason that full-blown supercompilation is not used is that it's too unpredictable -- the direct consequence of being too smart. I've elaborated on it in the final section of the blog post.