I wonder if we can make an encryption protocol that cryptographically securely encrypts an arbitrary English sentence into an English passage that sounds like it makes sense but is unrelated to the plaintext.
2. Given a language model which can generate a choice of multiple possible next symbols given what has already been written, use bits (or bytes) from the cypher text to choose between the available options (sketched in code below).
For example, using the predictive text options on my iPhone, treating 0=left and 1=right, with the cypher text 011100 and the starting symbol “Hi”, I get:
“Hi I have heard from the other”
(Note: I’m fairly sure the iPhone predictive text system is personalised and therefore time-variable, but the general idea still applies if you are in full control of the system).
3. If the other party knows the model and the initial word, they can use an equivalent process to recover the cypher text and put that into the normal decryption routine.
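A minimal sketch of that encode/decode round trip (purely illustrative, not anyone's actual implementation; the hard-coded word table below is invented just to reproduce the 011100 example, where a real system would use a shared language model):

```python
# Toy "language model": for each word, two possible continuations.
# The table and the alternative words in it are made up for illustration.
CHOICES = {
    "Hi":    ["I", "there"],
    "I":     ["am", "have"],
    "have":  ["been", "heard"],
    "heard": ["that", "from"],
    "from":  ["the", "my"],
    "the":   ["other", "same"],
}

def encode(bits: str, start: str = "Hi") -> str:
    """Turn a bit string like '011100' into an English-looking sentence."""
    words = [start]
    for b in bits:
        options = CHOICES[words[-1]]        # 0 = left option, 1 = right option
        words.append(options[int(b)])
    return " ".join(words)

def decode(sentence: str) -> str:
    """Recover the bit string by replaying the same model."""
    words = sentence.split()
    return "".join(str(CHOICES[prev].index(chosen))
                   for prev, chosen in zip(words, words[1:]))

assert encode("011100") == "Hi I have heard from the other"
assert decode("Hi I have heard from the other") == "011100"
```

With more than two options per step you can consume more than one bit at a time (256 options per step would consume a whole byte), at the cost of a larger shared model.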
This is the correct answer, although it's worth thinking about what the threat model is.
If the government is just going to force specific companies to add backdoors, then the process above isn't really necessary; you just need a way to install a client that isn't backdoored. If, however, the government is banning the sending of encrypted messages, then you have to hope that a jury doesn't see your long pointless messages as strong evidence of using encryption.
To improve slightly upon the language model example given above, though, I suggest something like this:
> then you have to hope that a jury doesn't see your long pointless messages as strong evidence of using encryption
Don't legal people routinely just take few-word sentences and rewrite them into long paragraphs of aforementioned hereinafter notwithstanding including but not limited to senseless nonsense?
So long as it looks normal for you, I suspect you’d be fine.
If I started writing long paragraphs of aforementioned hereinafter notwithstanding including but not limited to senseless nonsense, I’d be really obvious — at least to a human, not sure if current AI would notice me yet.
Hmm, I think a better strategy might be embedding the ciphertext in the fur of cat pictures. Sending lots of cat pictures seems pretty normal for anyone. Might be possible to create a GAN that outputs a synthesized cat picture constrained so that some ciphertext can be decoded from it later. Or simple modulation might just work, if I can convince JPEG not to wreck it.
That is higher bandwidth, but for normal chat apps I would expect randomly applied compression in transit to break things. Email could work though? And if you’re generating the JPEG or PNG yourself, you can put the cypher text in at whatever level you like, including the highest-entropy bits of the compressed data.
You’d have to be very careful to seem “normal”, as carelessly doing that can change the entropy in a detectable way, even for the least significant bits: the least significant bits saved in something like JPEG aren’t sensor noise, they’re the smallest details that humans still pay attention to.
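For the lossless end of that spectrum, the naive least-significant-bit version looks roughly like this (purely illustrative sketch, assuming the Pillow library; it makes no attempt to hide the statistical fingerprint just described, and it would not survive JPEG re-compression):

```python
from PIL import Image

def embed(cover_path: str, payload: bytes, out_path: str) -> None:
    # Naive LSB embed: overwrite the lowest bit of each RGB channel in pixel order.
    img = Image.open(cover_path).convert("RGB")
    bits = "".join(f"{byte:08b}" for byte in payload)
    flat = [channel for pixel in img.getdata() for channel in pixel]
    if len(bits) > len(flat):
        raise ValueError("payload too large for this image")
    for i, bit in enumerate(bits):
        flat[i] = (flat[i] & ~1) | int(bit)
    img.putdata(list(zip(flat[0::3], flat[1::3], flat[2::3])))
    img.save(out_path, "PNG")  # must stay lossless; lossy re-compression wrecks it

def extract(stego_path: str, num_bytes: int) -> bytes:
    # Read the payload back out of the lowest bit of each channel.
    img = Image.open(stego_path).convert("RGB")
    flat = [channel for pixel in img.getdata() for channel in pixel]
    bits = "".join(str(v & 1) for v in flat[:num_bytes * 8])
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

A more careful version would spread the payload pseudo-randomly across the image and try to match the cover image's noise statistics, which is where it gets hard.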
Maybe something that's aided by an AI that generates English text (a little similar to your iPhone auto-suggest, but more advanced), so that the sentences are valid and coherent. The recipient would need to know some sort of key/"seed" for the AI, which you'd give them over another channel. I bet something like that would be possible, but the ciphertext would be much larger than the plaintext. Still a fun idea.
I have seen a few hacky implementations of this, many years back. Essentially you use a dumb secrecy technique (every second letter of every third word). Then use your (securely) encrypted message as the payload.
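A toy reading of that extraction rule (my guess at the exact convention; the real implementations would also have generated the cover text):

```python
def extract_payload(cover_text: str) -> str:
    # Assumed convention: take every third word, then every second letter of it.
    words = cover_text.split()
    return "".join(word[1::2] for word in words[2::3])
```

Extraction is the easy half; generating cover text that naturally has the right letters in the right slots is the hard part, which is exactly the density question below.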
The question of interest is "how to generate sentences that allow the most dense insertion of data?".
The best two I saw were:
* Used a copy-paste (with link) of tweets / jokes / song lyrics, with trite comments around them.
* Used an html formatted email with images embedded. The images were fiddled to hold the bulk of the payload and the surrounding sentences were just to describe the image to give it authenticity.
The funniest was a dirty poem generator based on a Monte Carlo sim that was oracled to inject the payload. It used historic dirty letters and all sorts of poem formats.