A programming language for living cells (phys.org)
216 points by ctoth on April 3, 2016 | 47 comments



> "You could be a student in high school and go onto the Web-based server and type out the program you want, and it spits back the DNA sequence."

Woah, hold your horses right there.

One of the first things I did when I learned to program in high school was to write a virus.

So I'd think twice before making a "Web-based server" which spits out DNA and letting everyone use it.

It's not just about high schoolers or terrorists.

"The only way to solve the issues that the world is facing is for the population to drop by 60%".

These were the words of a successful tech entrepreneur I had dinner with a couple of days ago: fairly young, open-minded, intelligent, and someone who has lived in a dozen countries.

And I see his point - the overpopulation issue is real - but I'm scared that this idea is taking root in our minds, even in the minds of people who should know better.

This was spoken in the context of a discussion about war, but why wage war when one person can design an organism that wipes out 60% of humans?

Our consciousness is not evolved enough to handle the ability of designing random organisms and viruses.

I've voiced this opinion before, but the community seemed to not agree with it much. But I'm still standing by it - be very careful with this tech.

In fact, I think this shouldn't be allowed to get out of the lab. Not yet.


A design tool like Cello produces a string of characters representing the sequence of a DNA molecule, not a physical molecule of DNA. To actually do something with this in the real world, you would need to place an order with a DNA synthesis company (assuming you don't own a DNA synthesizer yourself). Fortunately, these companies screen sequences for things like genes encoding toxins, and work with law enforcement agencies. See for example Marcus Graf's talk at SB6.0: http://sb6.biobricks.org/session/day-2/assessing-risk-managi...

Also, the capabilities of this tool are rather limited. It is not clear to me how being able to design a small combinatorial logic circuit would directly help a potential bioterrorist.


I'd guess this type of screening only works because we're currently very bad at programming cells. Checking whether a DNA sequence is malicious can't be simpler than checking that a computer program is malicious. And we all know how well that works.


It's arguably easier to check for malicious DNA than malicious computer programs. The motifs for pathogens are pretty common. It's also highly unlikely that someone will create something totally novel, and the experiments required to test malicious properties of a bioagent would be somewhat detectable.


> It is not clear to me how being able to design a small combinatorial logic circuit would directly help a potential bioterrorist.

By advancing this technology.

I assume (hope) we're not yet capable of spitting out DNA out of a USB synthesizer. But given the speed with which technology evolves, it wouldn't be too wild to speculate that 10 or 20 years from now this might become possible.


"I assume (hope) we're not yet capable of spitting out DNA out of a USB synthesizer."

Nope. They're off-the-shelf devices. There's even a used market: http://www.labx.com/dna-synthesizers


It would be really hard to synthesize something of any significant size using these instruments.


Not only are the capabilities of this tool limited - so is our understanding of many of the fundamental concepts that might give us control over the biofactories that are living cells, e.g. protein folding.


Given the state of the software industry, your fear of any TDH gaining the ability to create literal bugs by typing in a program is indeed well founded.


> In fact, I think this shouldn't be allowed to get out of the lab. Not yet.

I'd rather have a million bio-hackers who can design an organism that attacks your hypothetical organism than only a few scientists working in a lab who get orders from a not necessarily well-meaning, humanitarian government.

And I don't want to live in a society where a small group of people can decide which scientific knowledge is allowed to spread and which is not.


The atomic bomb is the sort of scientific knowledge that many regret having created, including the people who were in charge of its creation: Oppenheimer in the US and Sakharov in the USSR both spent their lives fighting for nuclear non-proliferation, after having seen the bomb in action.

A much wiser approach would have been to not create these weapons in the first place, but that's a long and complicated discussion.

Now we have books like "Unmaking the Bomb", which are interesting to read but are impossible to implement, because scientists are no longer in control of this knowledge.

And this is the point I was trying to make (probably in vain, I know), that it is wiser to not open some boxes, than to squeeze the monsters that come out of them back inside.


No, I got your point.

But if we want to limit the dissemination of potentially harmful knowledge we'd have to stop teaching basic physics, chemistry and biology. The fact that we do not have daily TATP explosions or anthrax attacks, and that there is a strong movement for nuclear non-proliferation, shows us that we are not too bad at handling the responsibility that comes with knowledge.

And not everything about the early nuclear research was bad - without it, we'd not be where we are in space exploration or medicine.


The box, once found, will be opened. It's only a matter of time.


> "The damn program should just work."

> "Cytoplasm isn't exactly a eutactic environment. Certain operations just fail."

-- Deus Ex (2000)


but the organic molecules that make up the organic compounds that make up that cytoplasm sure are

i would suppose one could extrapolate then that the cytoplasm must be too

just significantly more multidimensional

i'd also add it is my intended inference that you can also go the other direction, and say the elements and atoms and on that make up those organic molecules are as well



The drift from predicted in three component systems is telling.


Here's a GitHub repo, apparently: https://github.com/CIDARLAB/cello


Does this look a bit like a flux simulator? I haven't got access right now, but from the abstract and the pictures it sounds a bit like one: basically, model and simulate the flux of metabolites via connected functions (linear, step, etc.). It looks a bit like Matlab's Simulink? Will read tomorrow and amend if necessary.


What exactly does it mean when one of these "circuits" "responds to conditions"?


The circuits are combinatorial logic circuits implemented as transcriptional circuits: they produce a particular protein (in this case YFP, yellow fluorescent protein) when they receive an appropriate combination of inputs (the presence or absence of the small molecules IPTG, aTC, and L-arabinose).


Ah, so these circuits are useful because if you're trying to construct cell organelles you need to be able to trigger the release of different proteins at different stages of development?

Is any of that protein folding stuff useful for this?


Seems to be real, despite the date: http://www.cellocad.org/about.html.


For anyone trying to access the site, you annoyingly have to register before going to anything other than the index.


Is this a joke?


No, but it's not as interesting as it sounds.


Why not?


Two reasons I can think of: first, biochemical reaction pathways can be tremendously complex, with subtle feedback mechanisms that are non-obvious; second, DNA encodes mRNA that encodes proteins, and determining protein folding from the sequence of DNA is exceedingly difficult, so just because you can 'write DNA programs' doesn't mean you can actually do anything with the result. Both of these factors taken together mean that, with current technology, it's a really hard problem to modify DNA to influence existing biochemical processes or to create new ones. I studied molecular biology in the '90s, so I'm a bit out of date, but I believe this is still the state of affairs.


Engineering new circuits isn't the same kind of problem as understanding natural ones, just as designing a protein to fold predictably came many years ahead of predicting the folding of natural proteins. (The former started in the 80s, and the latter is afaik still very hard, as you say.) http://www.dna.caltech.edu/courses/cs191/index.html gets into some of the recent progress (under "Nucleic acid circuits").

That said, it's not my field and I get the impression they're talking more about in vitro work on the page I linked. Even if your circuits work all the time in the lab, in a cell I'd expect all kinds of things to mess with them.


'Nucleic acid circuits'/'DNA circuits' are conceptually entirely different to 'genetic circuits'.

In the former, you design sequences of DNA such that complementary base pairing means they can displace each other in clever ways. This lets you create some interesting things, like oscillators [1], amongst others [2]. These do not need any of the apparatus of the cell to function, so they work in solution; indeed, if they were inside cells they would get digested by nucleases. The thermodynamics of DNA/RNA folding is fairly well understood, and the range of structures is much more limited than that of proteins. A major drawback of these circuits is that they function very slowly.

By 'genetic circuits', people usually mean a genetic regulatory network [3] - essentially you combine existing genes in new ways, by changing the regulatory sequences before each gene. For example, you can construct an oscillator from three genes by having the first repress the second, which represses the third, which represses the first [4]. Here you aren't designing new proteins (which is extremely hard), but rather modifying existing ones. Since these circuits require producing new proteins from DNA, they require RNA polymerase, ribosomes, ATP, the necessary monomers etc., so they can only function inside a cell (or a cell-free expression system containing these components).

[1]: https://www.researchgate.net/publication/50304896_Programmin...

[2]: http://research.microsoft.com/en-us/projects/dna/

[3]: https://en.wikipedia.org/wiki/Gene_regulatory_network

[4]: https://en.wikipedia.org/wiki/Repressilator
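
To make the three-gene ring described above concrete, here is a rough Python sketch of a repressilator-style oscillator. It is a toy under stated assumptions, not the model from the linked article: it tracks protein levels only, uses a dimensionless Hill repression term, and the names `simulate_repressilator`, `alpha`, and `hill`, as well as their default values, are made up for illustration.

```python
def simulate_repressilator(alpha=20.0, hill=4.0, dt=0.01, steps=10_000,
                           start=(1.0, 2.0, 3.0)):
    """Euler-integrate a protein-only three-gene repression ring.

    alpha, hill, dt, steps and start are illustrative assumptions,
    not parameters taken from any published model.
    Returns a list of (p0, p1, p2) protein levels, one per time step.
    """
    p = list(start)
    history = [tuple(p)]
    for _ in range(steps):
        # Gene i is repressed by gene i-1; index -1 wraps around,
        # so gene 2 represses gene 0 and the ring is closed.
        rates = [alpha / (1.0 + p[i - 1] ** hill) - p[i] for i in range(3)]
        p = [p[i] + dt * rates[i] for i in range(3)]
        history.append(tuple(p))
    return history


if __name__ == "__main__":
    trace = simulate_repressilator()
    # Print a coarse sample of the first protein's level over time.
    for step in range(0, len(trace), 1000):
        print(step, round(trace[step][0], 3))
```

Plotting `[state[0] for state in simulate_repressilator()]` should show the proteins taking turns being high. Whether a real construct behaves this way inside a cell is a separate question, as the rest of the thread points out.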


Nest three while loops and five if...then conditionals and see if you get what you expect.


It just automates what was previously something done manually. But it only works 75% of the time. It's a nice tool to have if this is your domain but it's not like you get any guarantees about success. Biology is still stubbornly complex.


Being able to run the same program on a billion bacteria should considerably improve the odds of getting the job done.

Of course, one also has to account for other possible "jobs" that get done due to interference.

However, if your goal is to automate processes rather than develop cures to "run" in the human body then this is a very interesting alternative to using silicon, the parallel pipeline potential is enormous.

EDIT: Would it be possible to develop a biological CPU this way? I.e. having "instruction sensors" and a Turing-machine-like DNA-robot that can execute externally supplied instructions? Putting that into a bacterium that can clone itself would surely cut down on costs of computing.


>Would it be possible to develop a biological CPU this way? I.e. having "instruction sensors" and a Turing-machine-like DNA-robot that can execute externally supplied instructions?

No, it is not possible (not this way). TL;DR: how do you plan on storing information on the Turing machine tape? If you're happy doing computation with a relatively high stochastic failure rate things look better, but I wouldn't count on it.


75% success rate for systems with this complexity (>10 elements) is probably just as good if not better than what would be expected by designing “by hand,” and the latter approach does not scale past the current order of magnitude (to 100+ component designs). I think that’s their main argument.


A major problem with further scale-up is the availability of parts. In an electrical circuit the signals are voltages that are constrained to wires; in a genetic circuit the signals are concentrations of proteins/compounds that are diffusing around the cell. This gives a problem: to have independent logic gates you need transcription factors that will bind to distinct promoter sequences, without crosstalk. If you're doing this in a cell rather than a cell-free system you also need to avoid crosstalk with the host cell.

The problem isn't so much in designing the circuit abstractly as finding specific parts with which to construct it. One approach is to partition the circuit across multiple cells [0, 1].

[0]: http://www.nature.com/nature/journal/v469/n7329/full/nature0...

[1]: http://journals.plos.org/ploscompbiol/article?id=10.1371/jou...
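
As a loose illustration of the "distinct promoter sequences, without crosstalk" requirement above, the check below flags any transcription factor that binds a promoter other than the one it is meant to regulate. Everything here is hypothetical: the function name `find_crosstalk`, the part names `tf_a`/`p_a`, and the idea that binding is a simple yes/no set are assumptions made for the sketch, whereas real binding is graded and measured experimentally.

```python
def find_crosstalk(intended, observed):
    """Report off-target binding for each transcription factor.

    intended: dict mapping a TF name to the single promoter it should bind.
    observed: dict mapping a TF name to the set of promoters it actually binds.
    Returns a dict of TF name -> sorted list of off-target promoters.
    Both inputs are hypothetical yes/no simplifications of real binding data.
    """
    conflicts = {}
    for tf, promoter in intended.items():
        off_target = observed.get(tf, set()) - {promoter}
        if off_target:
            conflicts[tf] = sorted(off_target)
    return conflicts


if __name__ == "__main__":
    intended = {"tf_a": "p_a", "tf_b": "p_b", "tf_c": "p_c"}
    observed = {
        "tf_a": {"p_a"},
        "tf_b": {"p_b", "p_c"},  # tf_b also binds tf_c's promoter
        "tf_c": {"p_c"},
    }
    print(find_crosstalk(intended, observed))
```

A parts library in which this returns an empty dict for every gate is what "independent logic gates" would mean under this simplified picture.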


Eukaryotic cells solve the "crosstalk problem" by building the control regions that regulate gene expression out of modular, hierarchically-organized binding sites for multiple transcription factors (TFs).

In prokaryotic systems there is (in a very approximate, generic sense) a one-to-one correspondence between the concentration of a particular transcription factor and the expression (or repression) of the genes downstream of the binding site for that transcription factor.

The control regions in eukaryotic genomes have binding sites for multiple transcription factors, combinations of which may become binding sites for other transcription factors (larger TFs which bind to certain combinations of smaller TFs), etc.

In this way, the specific sequence of TF binding domains in the regulatory region of a eukaryotic gene provides a particular and potentially unique "address" in "Transcription Factor State Space" by which the gene can be controlled.

For more information on this amazing topic, check out "The Regulatory Genome" by Eric H. Davidson. Here is an excerpt from the first page of chapter 4:

"Whatever their extent, however, development gene regulatory networks have an internal structure, in that they are composed of diverse kinds of modular parts and connections among these parts. Here 'modular' takes on a simple functional meaning: it is used to denote small subsets of genes within the overall network that together execute given 'jobs,' e.g., to operate a certain differentiation gene battery, or to transduce an extracellular signal into a certain regulatory state.

In what follows, sets of regulatory genes that execute modular functions are usually referred to as constituting 'subcircuits' of the network, because as we shall shortly see they are 'wired together' within the subcircuit by their gene regulatory interactions. Just as the target site inputs of an individual cis-regulatory [note: cis- prefix in this context indicates gene regulation via non-expressed sequences of DNA proximal to a gene in the genome] module are integrated to generate novel outputs according to its genomic design, so the outputs of these subcircuits are integrated to generate logic outputs which depend on their organization, that is, their wiring architecture."

- https://books.google.com/books?id=F2ibJj1LHGEC&pg=PA126

[edit 1: added link to google books & excerpt]


How is that not interesting? This discovery can speed up engineering new bacteria that help us cure allergies, produce energy, etc. We still need to go through the usual trial and error, but this makes it significantly easier and faster.


The most intricate April Fools joke this year


Yes.


Yeah, but we're actually not too far from this.


why verilog?


EDIT: I am assuming you meant "Why verilog as opposed to go/ruby/c/etc?" rather than "Why verilog as opposed to system verilog / VHDL / etc." If not, hopefully someone else finds the explanation useful anyway.

Because at the physical layer everything "executes" simultaneously, all the time, in parallel, and verilog/vhdl/etc embrace that type/level of abstraction while sequential languages don't. Sequential execution of "instructions" is one level higher than the "everything is parallel" layer -- you put it in place if it makes sense to impose strict reliability requirements and sacrifice raw parallel computational power for the simplicity and versatility of your favorite Turing machine. Unfortunately, biology isn't nearly as reliable or as fast as silicon, even "early" silicon, so the tradeoff won't make sense for some time to come if ever.


I DID mean why verilog as opposed to system verilog, VHDL, “etc.,” or whatever the verilog of 2020 is.


It's currently only feasible to produce very small circuits (the paper goes up to 8 gates and 3 inputs), and Cello is currently focused on a narrow class of circuits ("asynchronous combinatorial logic without feedback"). The first step of their algorithm is to convert the Verilog code into a truth table.

When your circuits are so simple, I suppose it doesn't really matter what language you use to specify them, and the authors just went for whatever they thought was most convenient to parse.
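
For a sense of what "convert the code into a truth table" amounts to for a circuit this small, here is a Python sketch that enumerates every input combination of a three-input function. It is not Cello's code and does not parse Verilog: the helper `truth_table` and the example function `example_circuit` are hypothetical, and the input names are simply borrowed from the inducers mentioned elsewhere in the thread.

```python
from itertools import product


def truth_table(fn, n_inputs):
    """Evaluate fn on every combination of n_inputs boolean inputs.

    Returns a list of (inputs, output) pairs in counting order,
    from all-False to all-True.
    """
    rows = []
    for values in product([False, True], repeat=n_inputs):
        rows.append((values, bool(fn(*values))))
    return rows


def example_circuit(iptg, atc, arabinose):
    # Hypothetical logic, chosen only for illustration:
    # output is on when IPTG and aTC are present and arabinose is absent.
    return (iptg and atc) and not arabinose


if __name__ == "__main__":
    for inputs, output in truth_table(example_circuit, 3):
        print(inputs, "->", output)
```

With three inputs there are only eight rows, which is part of why the choice of specification language matters little at this scale.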


The authors want to create Electronic Design Automation (EDA) tools to facilitate engineering genetic systems at scale. Is verilog a reasonably modern foundation from which to draw inspiration, or are there "better" EDA tools nowadays? Maybe one of the resident HN semiconductor experts can weigh in on this.


"Anything can go wrong?"

"Don't be a buzzkill, man!"



