Something I don’t understand about homomorphic encryption

Groxx · on July 31, 2011

1) computing arbitrary polynomials is all you can do to data. Everything you do is somehow adding two numbers together, subtracting, multiplying, or dividing. At the very very worst, this is all your computer is doing - you could implement a computer on top of the encryption scheme. It would be horrendously wasteful, but it would work, and that's what really matters in all this. Improvements will come, or computers will speed up.

2) >The result I get is still encrypted, but since there are only 128 possible values for an ascii character, I can encrypt each of these values and compare the result to the result of my computation.

And that's why keys of one byte are a bad idea. Bad in symmetric, bad in asymmetric, just plain bad. This same attack is true for any block-level encryption - it doesn't make them insecure. And homomorphic encryption needs, if I remember right, enormous relatively-prime numbers. Like, gigabytes worth. We're not talking small keys here.

3) The most dangerous part about using homomorphic encryption is that, necessarily, how you operate on the data reveals something about the data (and the search). Say you're Google, and you repeatedly see sequential access on blocks 123-345, 346-356, and 357-789. You can make a pretty strong claim that those are separate blocks of data that have been encrypted. The more you do to your encrypted data, the more someone else gets a peek at what you're doing. A very obfuscated peek, but a peek nonetheless. Add enough peeks, and you can get quite a bit of information. Or try something like a search through a binary search tree - the early locations you access will be touched by many processes, in a linear fashion, always with branching behavior. What kind of data does it look like?

But there are ways to combat this as well. If you're going to access 123-345, grab pieces out of it and others at random, overlap the edges, and split it across multiple operations. On the receiving edge, just toss out everything extra. All that can be seen is that you accessed a lot of data, but they don't know which ones were important (unless you were truly random, and then statistically the edges of that block can be seen).

bdhe · on July 31, 2011

And that's why keys of one byte are a bad idea. Bad in symmetric, bad in asymmetric, just plain bad. This same attack is true for any block-level encryption - it doesn't make them insecure. And homomorphic encryption needs, if I remember right, enormous relatively-prime numbers. Like, gigabytes worth. We're not talking small keys here.

This is incorrect. The key size has nothing to do with extracting one byte. As mentioned later in the thread, if the encryption is not randomized (and Craig's scheme is, like all "semantically-secure" encryption schemes) then the ciphertext leaks no information about the plaintext even if it were only one bit. However, if the encryption is not randomized then trivially, your encryption is only as secure as your plaintext space.

Secondly, you say homomorphic encryption needs enormous relatively-prime numbers. This is not correct. Craig's sceme uses lattice-based cryptography. The keys are long because we don't have efficient reductions to hard problems and the best algorithms to solve lattice problems currently known are better than brute force (so we need to increase parameters to make the best algorithms known today take at least 2^80 time, if not more).

Groxx · on Aug 1, 2011

I'll have to look into lattice-based crypto some time, I know essentially nothing about how it works. Thanks for the corrections!

tedunangst · on July 31, 2011

It's not the key that's one byte, it's the plaintext. But it makes no difference. The author/attacker doesn't have the key. He can encrypt all the letters he likes, but getting identical ciphertexts doesn't mean anything unless he used the same key.

emily37 · on July 31, 2011

Why doesn't the attacker have the key? It's a public key cryptosystem, so we should assume the attacker has the public key.

The attacker can encrypt all 128 ascii characters under the public key that was used to encrypt the whole string. The fact that the author was missing (which was pointed out later) was that two encryptions of the same byte under the same key will produce different ciphertexts.

tedunangst · on Aug 1, 2011

I think I may have failed to understand something important. :(

tzs · on July 31, 2011

> computing arbitrary polynomials is all you can do to data. Everything you do is somehow adding two numbers together, subtracting, multiplying, or dividing.

What about conditional branching?

archgoon · on July 31, 2011

Don't think in terms of assembly instructions, think in terms of logic gates.

A nand gate can be represented as a polynomial (over a binary ring) as

result = (x+y)*(1-xy) = x+y-xy

Groxx · on Aug 1, 2011

Apologies all, replace key size with block size in the 2nd point. Not sure why I was thinking this is key related.

andrewcooke · on July 31, 2011

am i being dumb? what does the size of the key have to do with knowing that there's only a small range of possible values (and then encoding each and checking for equality)? isn't this more likely addressed by random ivs (or whatever the equivalent is)?

simcop2387 · on July 31, 2011

A larger key size will mean that you'd have to look at more than just one byte to begin saying anything about what the data could be. This means that even though there are only 128 (normally far far lower for a password) values that each byte COULD be, there would be something like 2keysize-7 possible representations of each of the possible values. This means that an attack like the one described in the blog article would be significantly more difficult.

andrewcooke · on July 31, 2011

i don't think so. the attack is extracting a single byte. so without any random iv you will get one of 256 values spread across whatever the block size is. that size doesn't matter - you still have just 256 values to test for. your argument only applies if the rest of the data are appended to the plaintext. [disclaimer: i'm no cryptographer, just using simple logic here]

simcop2387 · on July 31, 2011

My understanding is that along with the addendum added to the post (having more than one possible representation of the same plain text for a given cypher), you also wouldn't be able to perform a re-encryption attack similar to how he describes. This is simply because to do that you would have to have the encryption key on the untrusted data center. Homomorphic encryption instead allows you to operate on the data without having either the encryption key or the decryption key (if it's an asymmetric cypher).

robinhouston · on July 31, 2011

If you can perform arbitrary computations on the encrypted data (yielding encrypted results) then I think you can take advantage of that to encrypt whatever you like.

The idea is to evaluate a constant-valued function “inside the encryption”, e.g. the function that ignores its input and always returns ‘A’. The result of that will be an encrypted version of ‘A’.

Is there something wrong with that argument?

pelotom · on July 31, 2011

This argument is valid... the flaw in the author's logic is rather the assumption that encryption is a function, i.e. that it is deterministic. In fact decryption is deterministic, but encryption is not, and that's what prevents this kind of attack. Yes, you can obtain an encryption of any given ASCII character using a constant-valued function "inside the encryption", but the important point is that it's just an encryption, not the encryption, of that character. The decryption function, which you do not have access to, will send all possible encrypted representations back to the same plaintext.

tedunangst · on July 31, 2011

You, as the person/computer performing the computation, do not get to write your own functions to run "inside the encryption".

ColinWright · on July 31, 2011

Please note that I submitted this, I am not the author. Please address comment to the author, not me. I get disturbed when people think that items I submit were written by me, and address their comments to me.

I just think this is an interesting question, and I thought the HN audience/community would also find it - and any ensuing discussion - interesting.