Hacker News
CryptGPT: A Simple Approach to Privacy-Preserving LLMs Using Vigenere Cipher (huggingface.co)
10 points by diwank 3 months ago | hide | past | favorite | 13 comments



Great article.

Can someone smarter than me explain why an embedding is or isn't a poor man's form of encryption? You can retrieve the closeness to something using cosine similarity, but you can't go back to the original source (unless you store the embedding side by side, which is what most people do with those vectors).

But, isn't using an embedding model a good way to cloak your original data and still make it comparable?
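For intuition, here's a toy comparison in Python (the vectors are made up stand-ins; a real pipeline would get them from an embedding model):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up stand-ins; a real pipeline would embed each document with a model.
doc_a = np.array([0.9, 0.1, 0.3])
doc_b = np.array([0.8, 0.2, 0.4])    # points in a similar direction
doc_c = np.array([-0.5, 0.9, -0.2])  # points elsewhere

assert cosine_similarity(doc_a, doc_b) > cosine_similarity(doc_a, doc_c)
```

So you can rank similarity without ever looking at the original text — but as the replies point out, that is not the same as the text being unrecoverable.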


Text embeddings should not be used as a form of encryption. They are optimized to contain as much information as possible from the text they represent and can often be decoded into their input: https://arxiv.org/abs/2310.06816


Great answer! And your blog is great, too!


Yep, and embeddings can be decoded back into meaningful text if you have the model weights and the tokenizer.


Seems like a better comparison would be to a one-way hash function.

Given a set of vectors resulting from an embedding model, it’s cheap & easy to check if a certain document is similar to the original source, as you just run the embedding on the comparison document and choose your favorite similarity metric.

However it’s very hard to recreate the source itself — as far as I can tell, you’d basically have to run a very expensive form of blind gradient descent: generate multiple texts, run embedding on them, check similarity, pick the closest one, and repeat with that text as the starting point.

Maybe someone can correct me if there exists an efficient way of reconstructing the original text. I would be very interested to know.

Edit: I appear to have just described a super naive and slow implementation of vec2text (see sibling comment). Will have to read that paper in more detail but it looks really cool.
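The naive loop described above might look something like this (`embed` here is a toy stand-in based on character bigrams; an actual attack would query the real embedding model):

```python
import random
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def embed(text: str) -> np.ndarray:
    """Toy embedding: hashed character-bigram counts, L2-normalized.
    A real attack would query the actual embedding model instead."""
    v = np.zeros(64)
    for a, b in zip(text, text[1:]):
        v[hash(a + b) % 64] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def mutate(text: str) -> str:
    """Replace one random character."""
    i = random.randrange(len(text))
    return text[:i] + random.choice(ALPHABET) + text[i + 1:]

def naive_invert(target_vec: np.ndarray, length: int, steps: int = 2000) -> str:
    """Hill-climb toward the target embedding: propose an edit,
    keep it only if the similarity to the target improves."""
    best = "".join(random.choice(ALPHABET) for _ in range(length))
    best_sim = float(np.dot(embed(best), target_vec))
    for _ in range(steps):
        cand = mutate(best)
        sim = float(np.dot(embed(cand), target_vec))
        if sim > best_sim:
            best, best_sim = cand, sim
    return best
```

This blind search is hopeless at realistic scale, which is what makes the hash comparison tempting — the vec2text paper's trick is to train a model to do the inversion directly instead.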


It is possible to reconstruct the text from an embedding, but more importantly, I believe that by "embedding" the OP means sending the computed embedding matrix instead of the input — which is a simple matrix-inversion problem.

Computing output embeddings is different.
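If what gets sent is just rows looked up from the model's token-embedding matrix, inversion is a matching search over that matrix (sketch with a random stand-in matrix; in practice the rows would come from the released checkpoint's embedding weights):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in token-embedding matrix (vocab_size x dim). In practice these
# would be the embedding weights from the released model checkpoint.
vocab_size, dim = 1000, 64
E = rng.normal(size=(vocab_size, dim))

token_ids = np.array([5, 42, 7])   # the "hidden" input tokens
sent = E[token_ids]                # what gets sent instead of the text

# Inversion: for each received vector, the row of E with the highest
# dot product — the exact matching row wins easily here.
recovered = np.argmax(sent @ E.T, axis=1)
assert (recovered == token_ids).all()
```

From the recovered token ids, the tokenizer maps straight back to the original text.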


But this cipher is easily defeated — the Wikipedia article even explains how.

I suspect it wouldn't work with anything that actually prevents people from "decrypting" it.


Yes, of course, but this was just to validate the hypothesis that FMs and their tokenizers can generalize on a fully encrypted dataset.

Now that the approach is clearly viable, I am going to extend it to use a modern cipher like XChaCha20, which is a gold standard.
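For anyone unfamiliar, a Vigenère cipher just adds a repeating key to the text position by position. A minimal sketch over printable ASCII (the exact alphabet and how it interacts with CryptGPT's tokenization may differ):

```python
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Vigenere over printable ASCII (codes 32-126); the alphabet
    choice here is an assumption, not CryptGPT's exact scheme."""
    lo, span = 32, 95  # 95 printable ASCII characters
    out = []
    for i, ch in enumerate(text):
        k = ord(key[i % len(key)]) - lo
        if decrypt:
            k = -k
        out.append(chr((ord(ch) - lo + k) % span + lo))
    return "".join(out)

ct = vigenere("attack at dawn", "lemon")
assert vigenere(ct, "lemon", decrypt=True) == "attack at dawn"
```

Because the key repeats, letter-frequency statistics leak through at the key period — that's the classical break the Wikipedia article describes, and why a modern stream cipher is the natural next step.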


Just to add — it's not easily defeated. I am even hosting a bounty for whoever can break this one (the model weights and tokenizer are available on Hugging Face) :D


How does this preserve privacy when everything can be decrypted?


Also to clarify — privacy-preserving here means protecting against the model inference provider. Sort of like: if OpenAI trained a GPT-4 using this scheme and gave the government the key, then the government could use it safely even while it's hosted on OpenAI's servers, and OpenAI, for its part, would not need to share the model weights with the government.


Thanks for the clarification, this makes a lot more sense now!


It can only be decrypted with the encryption key. The inputs are encrypted and look like gibberish, and so do the outputs.

It preserves privacy by essentially making the inputs and outputs unreadable to anyone without the key.
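A sketch of the client-side flow this describes (`hosted_generate` is a stand-in that just echoes ciphertext, not the actual API — the real hosted model would produce a ciphertext continuation, since it was trained entirely on encrypted text):

```python
def shift(text: str, key: str, sign: int) -> str:
    """Vigenere-style shift over printable ASCII; the alphabet is an
    assumption for illustration, not CryptGPT's exact scheme."""
    lo, span = 32, 95
    return "".join(
        chr((ord(c) - lo + sign * (ord(key[i % len(key)]) - lo)) % span + lo)
        for i, c in enumerate(text)
    )

def hosted_generate(encrypted_prompt: str) -> str:
    """Stand-in for the provider's model: it only ever sees ciphertext.
    Here it simply echoes its input."""
    return encrypted_prompt

key = "secret key"
prompt = "what is the capital of france?"
ciphertext = shift(prompt, key, +1)   # provider receives only this
reply = hosted_generate(ciphertext)   # provider computes on ciphertext
plaintext = shift(reply, key, -1)     # only the key holder can read it
assert plaintext == prompt
```

The provider handles nothing but ciphertext end to end; only the key holder can read either side of the exchange.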



