Encoding of a digital movie into the genomes of a population of living bacteria (nature.com)
91 points by skosuri on July 12, 2017 | 38 comments



There are a number of problems with this, the big one being the difficulty of reading and writing DNA. In addition, while storage may be very dense, it is somewhat environmentally sensitive; i.e., a single cosmic ray can cause DNA breaks. This principle has been proposed as the basis for a dark matter detector built from sheets of synthetic DNA on gold[0]. It has also been determined that DNA has a half-life of about 521 years[1]. Of course, DNA's overlap helps with this, as does the fact that we can massively replicate DNA: we can copy the same data an inconceivably huge number of times and so get a huge amount of redundancy. This ability has also let us sequence DNA from Neanderthal bones from ~50,000 years ago[2].
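To put the half-life figure in perspective, here is a rough sketch that treats decay as a simple exponential with a 521-year half-life. That model, and the function name `fraction_intact`, are assumptions for illustration only; real decay rates depend heavily on temperature and storage conditions.

```python
HALF_LIFE_YEARS = 521  # figure quoted above; assumed constant here


def fraction_intact(years, half_life=HALF_LIFE_YEARS):
    """Fraction of bonds still intact after `years`, assuming pure exponential decay."""
    return 0.5 ** (years / half_life)


for years in (521, 5_000, 50_000):
    print(f"{years:>6} years: {fraction_intact(years):.3e} intact")
```

Under this simplified model almost nothing survives 50,000 years in any single molecule, which is why the sheer number of copies matters so much for recovery.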

But we might be able to do better using more stable molecules than DNA. Peptide nucleic acid, or PNA[3], is an interesting option for this. It binds together more strongly than DNA while keeping a lot of DNA's benefits. Now, if we're willing to throw away complementarity, we could potentially store information in peptide sequences (proteins). We have recovered peptide sequences, albeit short fragments that were repeated numerous times, from dinosaur bones[4] (although there are some worries about contamination here). Peptide sequences might get tangled up and will almost certainly be more difficult to read.

We could potentially use a similar chemistry to that used in plastics for even more stability, although synthesis of large amounts of long sequences will be extremely difficult.

[0] https://www.technologyreview.com/s/428391/revolutionary-dna-...
[1] http://www.nature.com/news/dna-has-a-521-year-half-life-1.11...
[2] http://www.nature.com/nature/journal/v505/n7481/full/nature1...
[3] https://en.wikipedia.org/wiki/Peptide_nucleic_acid
[4] http://www.nature.com/news/2007/070409/full/news070409-11.ht...


So, the Matrix was wrong: we won't be batteries. We'll be data storage.


Interestingly enough, humans were never supposed to be used as batteries in the original script of the Matrix. Humans were supposed to be CPUs for the machines.

https://scifi.stackexchange.com/questions/19817/was-executiv...


Most of that idea comes straight out of the book Neuromancer.


I don't recall any reference to using brains as CPUs in the book.


I think it was just humans who were wrong. Every time (IIRC) that it's been indicated that humans were used as an energy source, said indication has come from a human. Human batteries are a "simpler" explanation for why the Machines want to keep humanity enslaved, and it's unlikely that any of the humans are really mentally prepared to recognize that they're actually the CPUs (and possibly even the data storage).


On a side tangent but loosely related to the mutation subject, does anyone remember songs that were deliberately designed to destroy music players? I ask because this would be an interesting attack vector. You could deliberately design a file that when compressed and stored in DNA would code for anything in the biosphere. Just an initial thought.


http://rateyourmusic.com/artist/ryoji_ikeda

He iirc has made music that is intentionally damaged.


You forgot to say "asking for a friend". :)


Digital but physical viruses? Yeesh!


So if I encode Metallica into my dog's cells will the RIAA come after me?


Only if you let your dog breed?


No, but the Sandman might.




Does sci-hub give you the supplements? Most of the info is in the supplements for papers like this. That link is to the supplement.


Most publishers don't put supplements behind their paywall. Scroll down for the supplement.


Yes, I go here: http://www.nature.com/nature/journal/vaop/ncurrent/full/natu...

Then I click on "Supplementary Information (324 KB)", which leads here (which is dead for me): http://www.nature.com/nature/journal/vaop/ncurrent/extref/na...

So it is impossible to see the real methods for this paper.


Free to Read Albeit Horrible Interface Version: http://rdcu.be/t9oS


Oops honey, by screwing with these genomes I've accidentally created an unstoppable flesh-eating bacterium.


Obligatory Dresden Codak: http://dresdencodak.com/2009/07/12/fabulous-prizes/

edit: Sheesh, tough crowd.


While I like the comic, I really wanted to give up before I got through the actually relevant/interesting bottom. The whole setup was really bad for someone who doesn't know and doesn't care about the characters. The author really could have just skipped directly to the "I'm going to go write over my junk DNA" bit and it would have been fine.

Edit: So I assume people gave up and downvoted for the seemingly random comic.


Thank you for sharing that, I thought it was awesome.

I had no problems with the scroll of the site, so have an upvote so that others may see the link.


I love Dresden Codak and this one is spot-on. I don't get the downvotes.


The site hijacks scroll which is pretty upsetting.


Does that warrant downvotes, though?


That you can store large amounts of data in DNA, and recover it, should be surprising to nobody. It should also be unsurprising that no matter how well it works, it won't ever become a viable product.


There's no reason storing information in a population of DNA molecules is fundamentally unstable.

Yes, mutations occur. Randomly. This means that if you sequence a set of DNA molecules the consensus sequence can still render the proper data string despite flaws in each individual copy.

In the context of generating organisms like this, so long as your stored information is not in a locus where a mutation could speed up proliferation, mutations in your storage string wouldn't propagate into the consensus (in some organisms, it's hard to make them grow faster; E. coli, for instance, has already been optimized by evolution to divide pretty darn quickly).

The same is true for populations of DNA molecules rather than organisms (are any organisms much more than DNA storage and maintenance machines :-D?). Random mutations in one molecule in a tube would be different for each neighboring molecule, but the consensus sequence is still valid.
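The consensus idea can be sketched as a per-position majority vote. This is a minimal illustration that assumes the reads are already aligned and of equal length and contain only substitutions; the function name `consensus` and the toy reads are made up for the example, and real pipelines also have to handle insertions, deletions and alignment.

```python
from collections import Counter


def consensus(reads):
    """Majority vote at each position across aligned, equal-length reads."""
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must be aligned to the same length")
    # zip(*reads) yields one column of bases per position.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


reads = [
    "ACGTACGT",
    "ACGTACGA",  # random substitution at the last position
    "ACCTACGT",  # random substitution at position 2
    "ACGTACGT",
    "TCGTACGT",  # random substitution at position 0
]
print(consensus(reads))
```

Because each copy is damaged at a different position, every column still has a clear majority and the original string comes back intact.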

This all makes perfect intuitive sense when you recognize that even some DNA sequences with no positive effect on growth are fairly well conserved on evolutionary timescales. https://www.wikiwand.com/en/Long_terminal_repeat https://www.wikiwand.com/en/Alu_element


My criticism has nothing to do with the reliability of the storage mechanism.


why not? what if it is programmed to replicate itself to provide redundancy?


Actually, you'd start with an erasure code (as the authors of these various papers typically do); the redundancy there is more effective than mirror replication. Copying is error-prone (10^-6 to 10^-9) and repair isn't that much better (10^-9 to 10^-12).
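As a toy illustration of why an erasure code beats mirroring on overhead, here is a single-parity XOR code: three data blocks plus one parity block survive the loss of any one block at 1.33x storage, where a mirror needs 2x. This is only a sketch with made-up helper names (`xor_blocks`, `encode`, `recover`); it is not the scheme used in the papers, and practical designs use stronger codes such as Reed-Solomon or fountain codes.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stored):
    """Rebuild the single missing block (marked None) from the survivors."""
    missing = [i for i, block in enumerate(stored) if block is None]
    if len(missing) != 1:
        raise ValueError("a single-parity code repairs exactly one erasure")
    survivors = [block for block in stored if block is not None]
    repaired = list(stored)
    repaired[missing[0]] = xor_blocks(survivors)
    return repaired


data = [b"ACGT", b"TTGA", b"CCAT"]
stored = encode(data)
stored[1] = None  # simulate losing one block
print(recover(stored)[:3] == data)
```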


Mutations would occur, yeah? Mutant files, lol.


Every medium decays; if you don't estimate longevity and perform error correction/migration/etc. accordingly, you won't be immune to data loss, just blindsided by it.


People who do this use erasure codes.

Also, DNA kept in a dry, cold environment has an effectively infinite lifespan. Now, the interesting question becomes "for what fraction of the lifespan of the universe can I maintain my data with arbitrary accuracy?"


the biggest issue my non-geneticist mind sees with this is the issue of mutations.


No, the reasons are about business and need. What does this product provide that anybody needs right now? Storage providers are not limited by storage device availability: storage production is driven by demand from the providers, and the producers will make more if demand goes up.

Next, existing technologies are more mature. They provide off-the-shelf products that integrate into a large ecosystem. So, DNA would have to provide some niche value that hard drives or flash drives don't.

Finally, with durable storage mechanisms, you don't really know how well they work until you deploy many instances of the mechanism for their lifetime; for DNA storage, that would be billions of cells (or more likely, passive storage containers) for decades or more.

So, I'll go with tapes, hard drives and flash drives, and everybody else will too. MSFT made some claims about rolling out DNA-based storage within the next decade, but they're blowing smoke.


Your reasoning is flawed (and frankly, phrased slightly annoyingly in your other comments, in a way where you omit your core arguments in what seems like baiting for replies). DNA storage provides durability [1] and information density (per weight) [2] beyond any current technology.

1: http://onlinelibrary.wiley.com/doi/10.1002/anie.201411378/ab...

2: http://science.sciencemag.org/content/355/6328/950


That's not higher density because nobody actually made something that stores a unique terabyte in DNA. The Science paper stored a megabyte. Ewan Birney's paper stored a concatenation of the same string over and over again.

Showing you can store a tiny amount of data and then claiming you have a density higher than hard drives isn't just poor marketing, it's outright lying.

To show durability, you need to actually test things for the planned lifespan, or use intelligent mechanisms to test in a shorter lifespan. Then you can claim durability.



