
eBPF is deliberately not Turing-complete, because you cannot prove arbitrary properties of code written in a Turing-complete language.



The bytecode format of eBPF is Turing-complete; it's the verification pass that restricts it, for instance by making sure the control-flow graph is a DAG.

cBPF was different and only allowed jumping forward (which guarantees termination), but eBPF is a much more general VM.
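The forward-jump argument is easy to see in miniature. Here's a toy interpreter (an invented instruction set, not real cBPF bytecode) where the program counter can only move forward, so execution is bounded by the program's length:

```python
# Hypothetical mini-interpreter (not real cBPF) illustrating why
# forward-only jumps guarantee termination: the program counter
# strictly increases, so at most len(program) instructions ever run.

def run_forward_only(program, acc=0):
    """Instructions: ('add', n), ('jmp_fwd', offset) with offset >= 1,
    or ('ret',). Because pc strictly increases, the loop is bounded."""
    pc = 0
    steps = 0
    while pc < len(program):
        op = program[pc]
        steps += 1
        if op[0] == "add":
            acc += op[1]
            pc += 1
        elif op[0] == "jmp_fwd":
            assert op[1] >= 1, "only forward jumps allowed"
            pc += op[1]
        elif op[0] == "ret":
            break
    assert steps <= len(program)  # termination bound holds by construction
    return acc

# The jump skips the ('add', 100) instruction entirely.
prog = [("add", 2), ("jmp_fwd", 2), ("add", 100), ("add", 3), ("ret",)]
```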


eBPF is not Turing-complete, but static analysis can absolutely prove things about individual programs in Turing-complete languages.


You can prove properties of some code in Turing-complete languages.

You can construct non-Turing-complete languages in which all code can be proven to terminate, and that is the point the parent is making.


Forbidding loops in a language does make all code written in that language provably terminating. But there’s not much else it lets you prove for all code, at least not if you want to get results before the heat death of the universe. For example, you can’t determine what inputs to a program will yield a given output: even classic BPF is probably expressive enough to implement a cryptographic hash, and eBPF definitely is. You can’t enumerate all possible paths through the program: the number of paths is exponential in the size of the program.
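The path-explosion point can be sketched concretely. In a loop-free control-flow graph, each two-way branch can double the number of paths, so the count is exponential in program size even though *counting* them (unlike enumerating them) is cheap with memoization. The node names and CFG encoding below are made up for illustration, not anything eBPF-specific:

```python
# Count root-to-exit paths in a loop-free control-flow graph (a DAG).
from functools import lru_cache

def count_paths(succ, entry, exit_node):
    """succ maps each node to its list of successors in a DAG."""
    @lru_cache(maxsize=None)
    def paths(v):
        if v == exit_node:
            return 1
        return sum(paths(s) for s in succ[v])
    return paths(entry)

# A chain of n independent if/else "diamonds": 2^n distinct paths,
# yet only 3n nodes -- enumeration is hopeless, counting is instant.
n = 20
succ = {}
for i in range(n):
    succ[("branch", i)] = [("then", i), ("else", i)]
    succ[("then", i)] = [("branch", i + 1)]
    succ[("else", i)] = [("branch", i + 1)]

assert count_paths(succ, ("branch", 0), ("branch", n)) == 2 ** n
```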

On the other hand, forbidding loops does make some properties easier to prove for restricted classes of programs. For instance, Linux’s BPF verifier tracks, for each instruction, the minimum and maximum value of each register at that point in the program. It uses that to determine whether array accesses in the program are bounds checked, and complains if not: that way, it doesn’t have to insert its own bounds check, which might be redundant. Doing the same in the presence of loops would require a more expensive algorithm, so forbidding them is a benefit.

Yet even so, Linux’s verifier is sound but not fully precise. Sound: it will forbid all programs that could possibly index out of bounds, at least barring bugs. Not fully precise: it does not pass all programs that have the property of never indexing out of bounds for any input. For example, you could have a program that takes a 256-bit input and indexes out of bounds only if its SHA-256 hash is a specific value. That program is safe iff there happen to be no 256-bit strings that hash to that value, something that you could theoretically verify – but only by going through all 2^256 possible strings and hashing each of them. Linux does not.
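The min/max tracking idea fits in a few lines. This is a toy sketch in the spirit of that description – NOT the real Linux verifier, and the instruction set is invented – but it shows how propagating a [min, max] interval per register lets a verifier reject any array access whose index range is not provably in bounds:

```python
class RejectedProgram(Exception):
    pass

def verify(program, array_len):
    """Track a [min, max] interval per register through straight-line code.
    Invented instructions for this sketch:
      ('const', r, n)   -> r = n        (n assumed non-negative here)
      ('add', rd, rs)   -> rd = rd + rs
      ('and', r, mask)  -> r = r & mask (clamps the max; a common BPF idiom)
      ('load', r)       -> array access at index r; must be within bounds
    """
    intervals = {f"r{i}": (0, 0) for i in range(10)}
    for op in program:
        if op[0] == "const":
            intervals[op[1]] = (op[2], op[2])
        elif op[0] == "add":
            lo_d, hi_d = intervals[op[1]]
            lo_s, hi_s = intervals[op[2]]
            intervals[op[1]] = (lo_d + lo_s, hi_d + hi_s)
        elif op[0] == "and":
            _, hi = intervals[op[1]]
            intervals[op[1]] = (0, min(hi, op[2]))
        elif op[0] == "load":
            lo, hi = intervals[op[1]]
            if lo < 0 or hi >= array_len:
                raise RejectedProgram(
                    f"{op[1]} may be in [{lo}, {hi}], "
                    f"outside [0, {array_len - 1}]")
    return intervals

# Masking proves the index fits an 8-entry array, so this passes:
verify([("const", "r1", 13), ("and", "r1", 7), ("load", "r1")], array_len=8)
```

Note this analysis is sound but imprecise in exactly the sense above: it reasons about intervals, not about which values are actually reachable.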

Nor would it be reasonable to. So why does that hypothetical matter? Because the ability to prove arbitrary properties about all programs, at the cost of arbitrarily long analysis time, is sort of the main mathematical benefit of non-Turing-completeness. But from a practical standpoint that’s useless. And if you don’t need that – if you only care about the ability to prove things about restricted classes of programs – well, you can achieve that even with loops. After all, that’s what a type system does, and there are plenty of Turing-complete languages with type systems. As I said, disallowing loops makes the job easier in some cases. But that’s a matter of degree, not the kind of hard theoretical barrier that “non-Turing-complete = analyzable” makes it sound like. That makes it a less convincing rationale for disallowing them.
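To make the "restricted classes, even with loops" point concrete, here is a minimal sketch of a checker for a Turing-complete language (Python) that proves termination for one restricted loop shape rather than banning loops outright. The checker and its accepted class are invented for illustration:

```python
# Accept a program only if every loop is of the shape
# "for ... in range(...)", which trivially terminates. while-loops
# and other iterables are conservatively rejected -- sound for this
# restricted class, not precise for all terminating programs.
import ast

def provably_terminating(source):
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.While):
            return False  # could loop forever; reject conservatively
        if isinstance(node, ast.For):
            it = node.iter
            if not (isinstance(it, ast.Call)
                    and isinstance(it.func, ast.Name)
                    and it.func.id == "range"):
                return False  # unknown iterable; reject conservatively
    return True

assert provably_terminating("for i in range(10):\n    total = i")
assert not provably_terminating("while True:\n    pass")
```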


Of course one can prove properties of programs written in Turing-complete languages.



