Can you guard against forged outputs? If you give me a program ciphertext and the public keys, I can simply encrypt a bogus answer and hand it back to you...
...but if you interleave your Brainfuck program with, say, a (classical) encryption of a canary constant, then you can check the canary when you decrypt the result, and you have high assurance that the computation is genuine - an attacker couldn't distinguish canary operations from real code, though you'd have to scatter the canary bytes uniformly through the output. This scheme is a bit like obfuscation inside FHE.
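The client-side half of that scheme might look something like this (a toy sketch; the helper names and layout are made up, and the homomorphic evaluation that actually writes the canary bytes into the hidden positions is elided):

```python
import random

CANARY = bytes.fromhex("deadbeef")  # constant known only to the client

def canary_layout(out_len, seed):
    # Secret positions where the canary bytes get scattered in the output;
    # the client keeps both the seed and the positions to itself.
    rng = random.Random(seed)
    return sorted(rng.sample(range(out_len), len(CANARY)))

def verify_and_strip(decrypted, positions):
    # After decryption, check the canary survived; any mismatch means the
    # ciphertext was tampered with or the whole result was forged.
    if bytes(decrypted[i] for i in positions) != CANARY:
        raise ValueError("canary mismatch: result cannot be trusted")
    return bytes(b for i, b in enumerate(decrypted) if i not in positions)
```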
- Is there a more efficient way to securely tamper-proof your results?
- How robust is this (or your own) scheme against a small number of adversarial bit flippings?
(Note that adversaries can mutate arbitrary bits via BlindRotate() in NuFHE and other TLWE schemes; they just can't extract the plaintext value, though they can set it, for example by NANDing.)
"Fully homomorphic encryption" and "Brainfuck" in one sentence. I guess now I've seen everything :P but nice idea (obviously the language doesn't matter that much if it's a VM). Did some fully homomorphic $stuff for my bsc thesis,... Was this for a thesis as well, or just for fun? I guess next step is to use BF instead of Python? ;)
Just for fun. I am taking it seriously, but I have no motive other than to learn. This tech has serious potential to fix a lot of privacy problems. I thought it'd be good to have this out there.
Could you elaborate on the practical applications? I find this sort of thing very interesting, but it seems to get computationally infeasible once you apply it to any sort of real-world problem.
Well, this project as it currently is makes no sense to use for anything real. What this project is trying to show is that ANY computation is possible with this.
So we don’t necessarily have to create an entire VM to, say, run an FHE machine learning model. That can be built directly and specifically optimized for the task. I think that’s where this is currently useful for real-world stuff.
I went and read the linked intro to Fully Homomorphic Encryption [1]
>This means you can multiply together any two Elgamal ciphertexts, and upon decryption you’ll find that the (single) resulting ciphertext now embeds the product of the two original plaintexts. Neat!
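That property is easy to check with toy numbers (nothing here is secure; the parameters are tiny on purpose):

```python
p, g = 1019, 2                    # public prime modulus and generator
x = 77; h = pow(g, x, p)          # secret key x, public key h = g^x

def enc(m, r):
    # ElGamal ciphertext: (g^r, m * h^r) mod p
    return (pow(g, r, p), m * pow(h, r, p) % p)

def dec(c1, c2):
    # recover m = c2 / c1^x  (inverse via Fermat's little theorem)
    return c2 * pow(c1, p - 1 - x, p) % p

a, b = enc(6, 3), enc(7, 5)
prod = (a[0] * b[0] % p, a[1] * b[1] % p)   # multiply the two ciphertexts
assert dec(*prod) == 42                     # decrypts to 6 * 7
```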
> Since the program essentially has zero context on what it is computing, it has to run through every possible branch of the program. On top of this, the underlying encryption schemes that make FHE possible are very computationally expensive.
So... efficient computing with homomorphic encryption should focus on branch-free programming on CISC architectures with a huge number of instructions?
If I understand homomorphic encryption right, in the CISC case it will go through all possible instructions instead of all possible branches, so there's no advantage here.
From repo:
> This is because at every cycle, the VM has to go through every possible instruction on every possible data cell.
So cool / futuristic. A guy once wrote an open source JVM in VHDL. I've always wondered how hard it would be to port something like that on top of an underlying FHE NAND gate implementation. Edit: Link - https://www.jopdesign.com/
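It's plausible in principle, since every other gate can be built from NAND alone; here's a plaintext sketch of the standard reductions, with each NAND call standing in for one homomorphic gate evaluation:

```python
def NAND(a, b): return 1 - (a & b)
def NOT(a):     return NAND(a, a)
def AND(a, b):  return NOT(NAND(a, b))
def OR(a, b):   return NAND(NOT(a), NOT(b))
def XOR(a, b):
    t = NAND(a, b)                 # classic 4-NAND XOR construction
    return NAND(NAND(a, t), NAND(b, t))

assert [XOR(a, b) for a, b in [(0,0), (0,1), (1,0), (1,1)]] == [0, 1, 1, 0]
```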
I have been thinking if this type of technology could be used to implement a mix mailer for messages, that is user aware so that online / offline correlation attacks, or compromised nodes in overlay networks (TOR etc) can be prevented.
How is homomoprhic encryption secure? If you can do math on the data you can easily figure out an 8 bit ascii character especially if it's a prime number. I'm probably not understanding it properly.
The "run" function creates a key, encrypts the program, passes the encrypted VM state to the VM (which has no knowledge of the key), and finally decrypts the result.
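Just to show who holds what, here is the shape of that flow with a one-time pad standing in for the FHE scheme (a XOR pad is of course not homomorphic; in the real project NuFHE/TFHE-style ciphertexts and keys take its place):

```python
import secrets

def keygen(n):          return secrets.token_bytes(n)
def encrypt(key, data): return bytes(k ^ d for k, d in zip(key, data))
decrypt = encrypt       # a XOR pad is its own inverse

state = b"vm-state"     # imagine: program + tape + pointers, serialized
key = keygen(len(state))
enc_state = encrypt(key, state)
# ... enc_state goes to the VM, which never sees `key` ...
assert decrypt(key, enc_state) == state
```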
Wow, thanks for the response, that's really neat. I'm still confused though. Is this VM running every possible program ever (branching N times every step, given an instruction set of size N?) And then the run function just picks the right result? Thanks again.
We have a couple of variables: the data pointer and the instruction pointer.
The values of these are encrypted BUT we can change them by performing logical operations (AND, OR, XOR, etc) on them.
So a cycle of the VM asks:
1. Is the current instruction the correct instruction?
2. Is the current data cell the correct data cell?
Obviously, if we knew the value of the instruction pointer we could go straight to the correct instruction, and if we knew the index of the data cell we could go straight to it.
The problem is that we don't know either of these things. So we have to go to every instruction and every data cell and check if it's the correct one. If it is then we can perform the instruction, otherwise we move on.
You can see now why this would be slow too. It's even slower though because in order to do this "check" we have to execute logic gates and the gates can take ~0.8 seconds (on a somewhat modern CPU).
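To make that concrete, here's a plaintext model of one cycle. The "encrypted" values are plain ints here, but every operation below corresponds to a homomorphic gate, so there are no data-dependent branches anywhere (a hypothetical sketch, not the repo's actual code):

```python
def AND(a, b):  return a & b          # stands in for a homomorphic AND gate
def XNOR(a, b): return 1 - (a ^ b)    # homomorphic XNOR
def MUX(s, a, b):
    # select a when s == 1, else b; done bit-by-bit with gates in the real VM
    return s * a + (1 - s) * b

def bits(n, width=8):
    return [(n >> k) & 1 for k in range(width)]

def eq(a, b):
    # encrypted equality test: AND together the bitwise XNORs
    acc = 1
    for x, y in zip(bits(a), bits(b)):
        acc = AND(acc, XNOR(x, y))
    return acc

def step(ip, dp, tape, program):
    new_tape = list(tape)
    for i, op in enumerate(program):       # every instruction slot...
        at_ip = eq(ip, i)                  # encrypted "is ip == i?"
        # the opcode is encrypted too, so test it against each possibility
        inc = AND(at_ip, eq(op, ord("+")))
        dec = AND(at_ip, eq(op, ord("-")))
        for j in range(len(tape)):         # ...on every data cell
            here = eq(dp, j)               # encrypted "is dp == j?"
            cell = new_tape[j]
            # candidates are computed unconditionally; the mux keeps the old
            # value everywhere except the one (hidden) live cell
            cell = MUX(AND(inc, here), (cell + 1) % 256, cell)
            cell = MUX(AND(dec, here), (cell - 1) % 256, cell)
            new_tape[j] = cell
    return new_tape

print(step(ip=0, dp=1, tape=[0, 41, 0], program=[ord("+")]))  # [0, 42, 0]
```

Every one of those AND/XNOR/MUX calls is a bootstrapped gate in the real VM, which is where the ~0.8 seconds per gate bites.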
Arbitrary programs that terminate on every input (no infinite loops), with a few important caveats:
1. Random access (reading input-dependent memory address) must be simulated with linear scans of the array.
2. Input-dependent branches must be transformed as follows: both branches are evaluated, and the side effects are combined using the branch condition c (e.g. x <- x*c + new_x*(1 - c); see the sketch after this list).
3. Loops with input-dependent bounds must be unrolled to their maximum number of iterations (this is not always easy to compute).
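Here's caveat 2 in a few lines of Python: both branches run, and the (encrypted) condition blends the results (a plaintext model; under FHE the blend is a gate-level mux):

```python
def select(c, when_true, when_false):
    # x <- when_true*c + when_false*(1 - c), with c in {0, 1}
    return c * when_true + (1 - c) * when_false

# "if x < 0: x = -x" becomes a branch-free blend:
def oblivious_abs(x):
    c = 1 if x < 0 else 0   # under FHE this comparison is itself a circuit
    return select(c, -x, x)

assert oblivious_abs(-5) == 5 and oblivious_abs(5) == 5
```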
In general you would probably write your programs in a special-purpose language that is easy to compile for this model of computation. Efficiency can suffer for some algorithms -- sorting winds up being O(n lg^2 n) (e.g. via a bitonic sorting network) instead of O(n lg n), for example.
Also the best FHE schemes are still nowhere near practical for the majority of computations people care about.
I didn't realize HE was still so primitive. Whenever there is a privacy kerfuffle for thing X, HN commenters say "X should be implemented with homomorphic encryption instead!"
Yeah, it has a very small number of instructions and is still Turing-complete. The Brainfuck code gets compiled to bytecode, so I guess technically I could have used anything. The underlying VM was created with BF in mind though.