
Help me understand this...

It says it will support prefix search, substring search, and the like. Can anyone point me in the right direction on what the algorithm may be here? I don't get how you could do those things without making the encryption less secure and/or decrypting every record on the fly.

Another interesting use case I found that isn't mentioned here is sort. I've had customers ask me to be able to sort the results by PII and we tell them... no, we can't do that because the field is encrypted.




These things are indeed possible while maintaining fully semantically secure encryption. Recent, mostly theoretical work shows that this is possible using fully homomorphic encryption. The basic idea is, the client can encrypt its query, the server can process the encrypted query and produce an encrypted result, and send this back to the client. It sounds impossible, but it isn’t! Very cool stuff. There are actually also some practical implementations that work… so it’s gradually exiting the “theoretical only” stage.

MongoDB is very short on details, and I suspect they do something weaker than homomorphic encryption, something that does make a compromise between privacy and convenience.


Yeah, they contrast their method with homomorphic encryption, which makes me share your suspicion


Searchable encryption (SE) trades privacy for efficiency. However, the privacy loss can be tuned. For example, SE constructions will specify whether they leak the search pattern (whether, and how often, a client repeats the same query), the access pattern (which items each query touches, and how often) or other things. Usually, a client can pay in storage/bandwidth to mitigate these leakages.


Yeah, I've been looking for more information and I can't really see any indication as to how they are planning on implementing it. The whole thing seems more like marketing than actual innovation: searching encrypted data isn't that complicated if you are always dealing with the entire ciphertext, it's just another string in that use case.


> searching encrypted data isn't that complicated if you are always dealing with the entire ciphertext, it's just another string in that use case.

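This isn't really true, because any modern encryption algorithm is randomized: the same plaintext encrypts to a different ciphertext each time, so many ciphertexts decode to the same plaintext and you can't just compare strings. If you drop that property you weaken the encryption (it is no longer secure against chosen-plaintext attacks).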
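To make that concrete, here is a toy sketch of randomized encryption using only the Python standard library. Everything in it (`toy_encrypt`, `toy_decrypt`, the HMAC-based keystream) is my own illustration, not anything MongoDB has described, and it has no authentication, so please don't use it for real data. It only shows why comparing ciphertexts as strings stops working once encryption is randomized.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Expand (key, nonce) into `length` bytes with HMAC-SHA256 in counter mode."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(4, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy randomized encryption: fresh random nonce, then XOR with a keystream."""
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Split off the nonce, rebuild the keystream, and XOR it back out."""
    nonce, body = ciphertext[:NONCE_LEN], ciphertext[NONCE_LEN:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


key = os.urandom(32)
c1 = toy_encrypt(key, b"alice@example.com")
c2 = toy_encrypt(key, b"alice@example.com")

# Same plaintext, same key, different ciphertexts: an equality match
# on the stored ciphertext would miss one of these two rows.
print(c1 == c2)  # False, barring a negligible-probability nonce collision
print(toy_decrypt(key, c1) == toy_decrypt(key, c2))  # True
```

So "it's just another string" only holds if the scheme is made deterministic, which is exactly the property that gets given up.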


It's not complicated if they are using deterministic encryption - which brings its own issues.


It is less secure than your standard symmetric encryption. I guess they would use deterministic encryption, in which two entries with the same email address produce the same ciphertext (this leaks information to an attacker). Prefix search and sort can be achieved by using order-preserving encryption. Not really sure about substring search, though.
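A rough sketch of the deterministic idea, purely as a guess at the general technique and not at MongoDB's actual scheme: I'm using an HMAC as a stand-in for deterministic encryption (it is a keyed tag, so it can't be decrypted, but it has the same "equal input, equal output" property). The names `equality_token` and `find_by_email` and the sample rows are made up for illustration.

```python
import hashlib
import hmac
from collections import Counter


def equality_token(key: bytes, value: str) -> str:
    """Deterministic keyed token: equal inputs always give equal tokens."""
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def find_by_email(rows, key: bytes, email: str):
    """Equality search: the server only compares opaque tokens."""
    wanted = equality_token(key, email)
    return [row["id"] for row in rows if row["email_token"] == wanted]


key = b"demo-key-not-for-production"  # hypothetical key, for illustration only
rows = [
    {"id": 1, "email_token": equality_token(key, "alice@example.com")},
    {"id": 2, "email_token": equality_token(key, "bob@example.com")},
    {"id": 3, "email_token": equality_token(key, "alice@example.com")},
]

print(find_by_email(rows, key, "Alice@Example.com"))  # [1, 3]

# The leak: without the key, anyone who can read the table still sees
# that two rows share a value, and how often each value occurs.
token_counts = Counter(row["email_token"] for row in rows)
print(sorted(token_counts.values()))  # [1, 2]
```

With a skewed column (say, country or last name), those frequency counts alone can be enough to guess the plaintexts.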


I've researched order-preserving encryption before, but the tradeoffs (mainly that the attacker can tell the order and use that to narrow the search space) always seemed high-risk.
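For anyone who hasn't seen it, here is a toy illustration of that order leak. It is not a real OPE construction: it just builds a secret, strictly increasing lookup table from a seeded (non-cryptographic) `random.Random`, and the names `make_toy_ope_table` and `ope_encrypt` are my own. The point is only that sorting works on ciphertexts, and that it works equally well for someone without the key.

```python
import random


def make_toy_ope_table(key: int, domain_size: int = 256, range_size: int = 2**16) -> list:
    """Secret strictly increasing map from 0..domain_size-1 into a larger range."""
    rng = random.Random(key)  # NOT cryptographically secure; illustration only
    return sorted(rng.sample(range(range_size), domain_size))


def ope_encrypt(table: list, value: int) -> int:
    """Encrypt by table lookup; a < b implies ope_encrypt(a) < ope_encrypt(b)."""
    return table[value]


table = make_toy_ope_table(key=1234)  # hypothetical key
ages = [42, 17, 65, 30]
ciphertexts = [ope_encrypt(table, age) for age in ages]

# The server (or an attacker with a dump of the column) can sort rows
# by ciphertext alone, with no access to the key or the table.
order = sorted(range(len(ciphertexts)), key=ciphertexts.__getitem__)
print(order)  # [1, 3, 0, 2]: row indices from youngest to oldest
```

Once the ranks are known, a guess at the distribution of the column (ages, salaries, dates) narrows each value to a small interval.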


High risk compared to what? The alternative is absolutely no privacy (the status quo) or no/limited functionality (not very useful). This seems strictly better than having no privacy.


Depending on your compliance needs and the sensitivity of your data, "limited functionality" may be a reasonable tradeoff, though.


Using fake encryption is much riskier than no encryption, because if you think you are safe you will do unsafe things with your data. If you know you are unsafe then you will take appropriate precautions.


Related video explaining encryption schemes to make encrypted data in a DB queryable:

CryptDB: Processing Queries on an Encrypted Database

https://youtu.be/xsaXMUelOEA?t=807


I was under the impression that CryptDB's "encryption" was thoroughly broken. Am I mistaken?

E.g., googling, I found http://cs.brown.edu/people/seny/pubs/edb.pdf


Not broken according to the response to that paper:

> the conclusions drawn by this paper with regard to CryptDB's guarantees for medical applications are incorrect: had the guidelines been followed, none of the claimed attacks would have been possible. [1]

[1] https://css.csail.mit.edu/cryptdb/response.html



