Attacks on Encrypted Databases [pdf] (iacr.org)
106 points by blacksmythe on June 16, 2017 | 26 comments



Any good recommendations for running Postgres transparently as a fully encrypted database (not just on a table by table basis)?

Or alternatively is this a problem that's better / just as viably solved with an encrypted filesystem?


> Or alternatively is this a problem that's better / just as viably solved with an encrypted filesystem?

Yes and no. They discuss this on page 2.

> An oft-cited threat to DB security is disk theft, i.e., theft of persistent storage [2,16,20,32,37,42,47,50]. Full-disk encryption (FDE) can mitigate this threat, but EDBs aim to protect data even in the absence of FDE. Without FDE, this attack yields the persistent OS and DB state, but not any volatile state.

Even with FDE, an attacker who can do some sort of software exploitation can still grab all the data they need out of the filesystem.

It's worth reading the whole paper if you're interested in the topic. It's only 6.5 pages of content, and it's written in a way that makes the technical stuff pretty accessible to a general CS audience.

> Any good recommendations for running Postgres transparently as a fully encrypted database (not just on a table by table basis)?

Hard to tell what you're asking here. You can't just encrypt all the data inside Postgres, or it won't be able to perform queries anymore. You can read the CryptDB paper cited here to learn about the trade-offs that some encrypted databases have made in order to re-enable queries on encrypted data. In a nutshell, they use much weaker encryption that reveals a lot more information about the contents of the DB. Then read up on the attacks also cited here to see why that's generally a bad idea.

(Full disclosure: I'm an author of one of the attack papers.)
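
A minimal sketch of what "reveals a lot more information" can mean for equality-searchable columns. It assumes a deterministic scheme, modeled here with HMAC-SHA256 as a stand-in (a keyed function, not CryptDB's actual construction and not decryptable); `det_token`, the key, and the column values are hypothetical:

```python
import hashlib
import hmac
from collections import Counter


def det_token(key: bytes, value: str) -> str:
    """Stand-in for deterministic encryption: same key + same value -> same token."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]


key = b"hypothetical-demo-key"
column = ["flu", "flu", "cold", "flu", "hiv", "cold"]

# What the server stores: one opaque token per row.
encrypted = [det_token(key, v) for v in column]

# Equal plaintexts map to equal tokens, so the server can answer equality
# queries, but it also learns the exact histogram of the column.
plain_hist = sorted(Counter(column).values(), reverse=True)
cipher_hist = sorted(Counter(encrypted).values(), reverse=True)
```

The values themselves are hidden, but `cipher_hist` matches `plain_hist`, and that histogram is what the inference attacks start from.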


And don't forget the theft of back-ups. Back-ups tend to be secured much less well than the originals and are a frequent cause of data landing where it shouldn't.

Another way in which disks end up in places where they shouldn't be is faulty decommissioning processes.


I'm with zAy0LfpBZLC8mAC: what question are you attempting to answer? If you want to secure the database at runtime, your best answer is to have it locked down at the ports and the access IDs. If you want to protect it at rest, yes, an encrypted drive would prevent someone from stealing the drive and reading the data. It won't help against anything else.

This might help as a primer for this type of question: https://security.stackexchange.com/questions/16939/is-it-gen...

Edit: grammar.


> Or alternatively is this a problem that's better / just as viably solved with an encrypted filesystem?

Which problem?


A few years back, Google worked through their attempts to encrypt MySQL from end to end; it's a damned hard problem when you consider all the bits which would need to be encrypted to protect against someone with access to the OS or hypervisor.

I haven't heard that they were successful.


> protect against someone with access to the OS or hypervisor.

> I haven't heard that they were successful.

Isn't it rather trivial to prove that a database can't be secure if the OS or hypervisor is compromised? (I mean, how could it be, since the OS/hypervisor can read/write all memory, pause execution, alter execution, etc. by its very nature?)


The original goal of the academic "encrypted database" proposals like CryptDB was security against this kind of adversary. The idea was that if all the data is encrypted even when you're querying it, you get some security against a compromised OS. Unfortunately, in this paper we showed even this "always encrypted" approach has many subtle flaws that leave the data vulnerable to inference attacks.


It's much better than what is out there in industry, however: plaintext data. When you upload data to the cloud, the attacker usually isn't the cloud provider but an outside intruder.

I feel it's not constructive to claim that solutions that provide intermediate security are useless; schemes that provide strong security (e.g. ORAM-based schemes) are far from practical, and so these intermediate security solutions are the best we've got.


To be clear: nowhere in this paper did we claim any particular solution is useless. However, the degree to which these systems are useful, and what situations they are useful for, is not well-understood. Prior work has shown that the encryption used in many of these systems is breakable (i.e. the plaintext is recoverable with near-perfect accuracy) with simple attacks. See, for example, this recent paper (https://eprint.iacr.org/2016/895) on cryptanalysis of order-revealing encryption.
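
A toy illustration of why leaking order can be enough. This is a simple sorting-style attack on a dense column, not the attacks from the linked paper; `toy_ope_table`, `sorting_attack`, the age domain, and the seeds are all hypothetical, and the sketch assumes every value in the domain appears at least once:

```python
import random


def toy_ope_table(domain, seed):
    """Toy order-preserving 'encryption': a random strictly increasing map."""
    rng = random.Random(seed)
    codes = sorted(rng.sample(range(10**6), len(domain)))
    return dict(zip(sorted(domain), codes))


def sorting_attack(ciphertexts, domain):
    """Guess plaintexts by rank; exact when the column covers the whole domain."""
    distinct = sorted(set(ciphertexts))
    if len(distinct) != len(domain):
        raise ValueError("column is not dense; the rank trick is not exact")
    return dict(zip(distinct, sorted(domain)))


domain = list(range(18, 66))          # e.g. an "age" column
table = toy_ope_table(domain, seed=7)

column = domain * 2
random.Random(1).shuffle(column)
encrypted = [table[v] for v in column]

# The attacker sees only `encrypted` and knows the plausible domain.
guess = sorting_attack(encrypted, domain)
recovered = [guess[c] for c in encrypted]
```

No key is involved in the attack: the i-th smallest ciphertext must be the i-th smallest value, so `recovered` equals `column`.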

Respectfully, I find this "more secure stuff is slow so we have to live with what we've got" argument to be specious. There simply is no evidence that a fast encrypted database must also provide very weak confidentiality guarantees.


The evidence lies in the failure of the cryptographic community to provide a solution with strong security properties that is performant. For example, nobody has even attempted an ORAM-based database system. We also do not have schemes that can efficiently provide an intermediate level of security, between "weak" and "strong" systems.

Either way, as I said earlier, it's a question of threat models. Most cloud users trust Google and Amazon. These companies also have strong intrusion detection capabilities, so with non-negligible probability an outside attacker would be detected within a reasonable amount of time. In such a scenario, it is better to have some protection than none at all.


The failures of the academic cryptography community to provide solutions to real-world problems in general are well-documented, and I will not belabor them here (q.v. Rogaway's "The Moral Character of Cryptographic Work").

I will only say that I don't think this problem (strong security + performance) has been on the minds of very many people for very long, and this work is really still in its infancy.


Not without CPU support. Intel SGX, AMD SEV (sort of), and ARM TrustZone (if you want to do a lot of work) make this possible.


It's worth pointing out that trusted execution for individual queries does not, in general, rule out the attacks discussed in this paper. If the database collects (for example) frequency information about queries, inference attacks can still be used to recover plaintext.
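
A minimal sketch of the frequency idea, assuming the attacker has some auxiliary estimate of how common each plaintext is (public statistics, an older leak). The function name, tokens, and numbers are hypothetical:

```python
from collections import Counter


def frequency_attack(observed_tokens, aux_frequencies):
    """Match the most frequent token to the most frequent plaintext, and so on."""
    tokens_by_rank = [t for t, _ in Counter(observed_tokens).most_common()]
    plaintexts_by_rank = sorted(
        aux_frequencies, key=aux_frequencies.get, reverse=True
    )
    return dict(zip(tokens_by_rank, plaintexts_by_rank))


# Opaque tokens the attacker observed in a column or a query log.
observed = ["c7", "c7", "a1", "c7", "f3", "a1"]

# Attacker's prior about the underlying values (assumed, not exact).
aux = {"flu": 0.5, "cold": 0.3, "hiv": 0.2}

guess = frequency_attack(observed, aux)
```

In this toy the ranks are unambiguous, so every token is matched. With real data, ties and a poor prior make the guess noisier, but the attack never needs the key.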

EDIT: The fundamental problem is that trusted hardware doesn't hide the access pattern by itself. Trusted hardware can be used to hide some kinds of access patterns, but it's highly nontrivial and has only been demonstrated in some limited settings. For example, there was a paper at NSDI this year called "Opaque" which showed how to use SGX to hide access patterns for some kinds of Spark queries.
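
A toy model of access-pattern leakage, assuming the observer (OS, hypervisor, or storage layer) cannot read queries or rows but can see which row IDs each query touches. This is not SGX-specific, and the log below is made up:

```python
from collections import Counter


def summarize_access_patterns(query_log):
    """Count how often each distinct set of touched row IDs shows up."""
    return Counter(frozenset(rows) for rows in query_log)


# Each entry: the row IDs touched by one (encrypted, unreadable) query.
query_log = [{1, 4}, {2}, {4, 1}, {1, 4}, {3, 5}]

patterns = summarize_access_patterns(query_log)
most_common_rows, times_seen = patterns.most_common(1)[0]
```

Repeated queries show up as repeated access sets, which hands the observer the same kind of frequency information as above without breaking any encryption.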


I should really read that paper, since I'm sort of confused by the threat model. Arbitrary queries seem like they would defeat the point. So I'm assuming this "using a secure, authenticated channel to communicate out, while still being monitored by the OS" model. That's a high bar for software not designed for SGX.

I presume it's relying on the paging behavior of SGX? (Either page faults or dirty bits).


Better (any) in-memory encryption would help (especially against swap or hibernation attacks), but it won't fully mitigate the risk, because the encryption key is stored in memory as well.


Since maybe the '90s, the view of many in high-assurance security has been that maintaining security properties on a compromised OS or hypervisor is best done with a hardware architecture that does the crypto at the CPU level on individual processes, pages, and so on, with the chip as the trust boundary. They've spent a while trying to improve on security and performance. Edmison's design was the last one I saw that I liked, with a great section on prior work:

https://theses.lib.vt.edu/theses/available/etd-10112006-2048...


Whoa, really? Is this analysis available publicly somewhere? It would be really useful for further research on this topic.



Are there caveats for encrypting Redis databases at rest? The main attack vector I'm trying to thwart is becoming a pastebin dump.


A pastebin dump isn't an attack vector, it's what an attacker might do with the loot after compromising you.

Or are you referring to one of your personal passwords ending up on Pastebin? Use unique passwords, set up SSH keys, and set "PasswordAuthentication no" in your SSHD config.


If your threat model is having a hard disk stolen, simply using full-disk encryption or whatever Redis offers is probably fine. Fair warning, though: this is a really weak threat model and won't do anything to protect the data against any stronger attacks.


Is SQLCipher (SQLite) also vulnerable to the techniques shown in the paper?


This paper highlights various possible attack vectors. SQLCipher would still be vulnerable to scenarios such as an active attacker on device/machine where the key is resident in RAM. If you are interested in the design features of SQLCipher, I would recommend reading this: https://www.zetetic.net/sqlcipher/design/


> would still be vulnerable to scenarios such as an active attacker on device/machine where the key is resident in RAM

Doesn't this go without saying? Seems kind of ridiculous to try to protect against that attack vector at least on today's hardware.


To protect against an attacker with physical access to a machine may indeed be futile, but it should be possible to provide some measure of security against a malicious process running on the machine.

Defending sensitive data if the DB process itself is compromised, again, seems pretty difficult. That was the original goal of the academic proposals like CryptDB or Cipherbase - defense even against a fully malicious database server.



