Detailed simulation of complete minimal cellular life (cell.com)
155 points by akimball on Feb 26, 2022 | 31 comments



This seems like really interesting work in terms of synthesizing and systematizing the known reactions/fluxes in a minimal cell. However, the paper seems to be like many other modeling studies in the field of microbiology: it makes either no, or very few, testable predictions. It's hard to accept the model of metabolism this paper presents as valid without at least some idea of its predictive accuracy. The concordance with previously published results does not seem like very strong evidence.


It's wonderful that the entire paper is available freely from the journal. Most journals these days are pay-for-access (and none of the payment makes it to the authors).


The code is on Github:

https://github.com/Luthey-Schulten-Lab/Minimal_Cell

https://github.com/Luthey-Schulten-Lab/Lattice_Microbes

This is an impressive result. They've found a nice level at which to simulate. Down at the atom level would take too long. They got this to work at the level of chemical reactions in 3D space, low enough that the model maps directly onto the biochemistry.

They don't seem to say how many GPUs they used, but the article seems to indicate small numbers, not entire data centers. Current speed is about 1/30 of real time for one minimal cell, which is pretty good.


Looks like they used up to 1 GPU per simulation:

> The GPUs used for spatial simulations included NVIDIA Titan V and NVIDIA Tesla Volta V100 GPUs, which took 10 h and 8 h to simulate 20 min of cell time, respectively.

So apparently they tried 3 different GPU configurations:

1. A single Nvidia Titan V, for a non-mixed model, which took 10 hours to simulate 20 minutes.

2. A single Nvidia Tesla Volta V100, for a non-mixed model, which took 8 hours to simulate 20 minutes.

3. No GPUs, for a well-mixed model.

Regarding mixing: a well-mixed model is one where every chemical is assumed to have a spatially uniform concentration throughout a phase. By contrast, a non-mixed ("spatially resolved", in the paper) model lets concentrations vary from location to location, which is more realistic but also far more expensive.
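To make the distinction concrete, here's a toy sketch (not from the paper; the species and rate constant are invented) of what the two representations look like:

```python
import numpy as np

# Well-mixed: one number per species for the whole cell.
k, dt = 0.5, 0.01                # invented first-order decay rate and time step
A = 1.0
A += dt * (-k * A)               # one update per step

# Spatially resolved: one number per *voxel* per species, plus diffusion
# between neighboring voxels (omitted here), so the cost explodes.
A_field = np.full((64, 64, 64), 1.0)
A_field += dt * (-k * A_field)   # the same reaction now runs in 64^3 voxels
```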

For anyone surprised that they only needed up to 1 GPU per simulation: their models appeared to be largely based on rate kinetics; it's not like they were simulating atoms or really even molecules, but rather concentrations of molecules.

Worth noting: they probably couldn't get a linear speedup by using more GPUs. The thing is that they needed the GPUs to solve interactions over a space; with multiple GPUs, they'd have needed to actively connect them on each iteration, slowing down the simulations. For a small number of GPUs (maybe 2 to 4), they might get a near-linear speedup if they fudged the boundaries a bit (allowing computational artifacts at the edges between subdomains).
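A rough sketch of why, using a 1D toy domain split across two "devices" (plain numpy arrays standing in for GPUs; all constants invented): each iteration has to copy edge ("ghost") cells between subdomains before the update can proceed, and that per-step exchange is what limits scaling.

```python
import numpy as np

# Toy 1D diffusion domain split across two subdomains, each with one
# ghost cell per side. On real multi-GPU setups the ghost copies are
# device-to-device transfers that must happen every iteration.
n = 100
left = np.zeros(n + 2)              # interior cells 1..n, ghosts at 0 and n+1
right = np.zeros(n + 2)
left[n // 2] = 1.0                  # initial blob in the left subdomain

D, dt, dx = 0.1, 0.01, 1.0
for _ in range(1000):
    left[n + 1] = right[1]          # halo exchange: copy neighbor edge values
    right[0] = left[n]
    for u in (left, right):         # explicit diffusion on interior cells
        u[1:-1] += D * dt / dx**2 * (u[2:] - 2.0 * u[1:-1] + u[:-2])
```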


That sounds good. So single-cell simulation needs only a modest level of hardware. When simulation progresses to multicellular organisms, inter-cell connections could probably be done less frequently, over a network. C. elegans, the nematode worm that's the minimal viable heavily studied organism, has 959 cells. A thousand GPUs is not an out-of-reach number. Just a big cloud bill.


I really enjoyed the paper, though I'd probably characterize it as primarily academic.

This paper has a ton of background research that gathers up a lot of relevant data and models. The computational results are nice, too, in providing a perspective on simulation costs.

Once there are viable cell simulations, we can do things like have computers predict, say, medicines, by simulating candidate molecules and optimizing them for a desired effect. A really powerful tool to look forward to!


>For anyone surprised that they only needed up to 1 GPU per simulation: their models appeared to be largely based on rate kinetics; it's not like they were simulating atoms or really even molecules, but rather concentrations of molecules.

Ah, so it's sort of like CFD but with chemical interactions thrown into the mix?


Yup, you're right: their more complex model was largely based on [continuum mechanics](https://en.wikipedia.org/wiki/Continuum_mechanics), just like CFD.

Without specifically checking, I'd guess that they probably ignored some of the things involved in normal CFD, e.g. pressure-driven flows. In fact, their well-mixed model (the simpler model that didn't need a GPU) basically ignores fluid dynamics entirely: since it ignores spatial variation, there are no fluid dynamics to model. Their non-mixed model (the more complex model that did use a GPU) probably relied on diffusive transport alone, without regard for things like pressure-driven flows.

So, yeah, CFD -- lighter on the mechanical dynamics (like spatial flows) and heavier on the chemical dynamics (like reactions).
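In update form, that guess amounts to something like this toy 1D reaction-diffusion step (constants invented; note what's missing relative to full CFD):

```python
import numpy as np

# Reaction-diffusion, 1D: dC/dt = D * d2C/dx2 - k*C.
# Full CFD would also carry an advection term (-v * dC/dx) driven by
# pressure and momentum; here transport is diffusion only.
D, k, dt, dx = 0.1, 0.05, 0.01, 1.0     # invented constants
C = np.zeros(200)
C[100] = 1.0                            # point source mid-domain
for _ in range(500):
    lap = (np.roll(C, 1) - 2 * C + np.roll(C, -1)) / dx**2  # periodic BCs
    C += dt * (D * lap - k * C)         # diffuse and react
```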


Could be wrong, but it's possible the authors had to pay tens of thousands to make it free.


NVIDIA published a headline on their site. I could imagine them bankrolling the open publication. Good PR, good science: it's a win-win and a trivial expense.

https://blogs.nvidia.com/blog/2022/01/20/living-cell-simulat...


NVIDIA has GPU grants for research.


Schulten, the author of the paper, has been well funded at UIUC for decades. These sorts of simulations are Schulten's area (her husband is also there and is also well funded, with near-infinite compute resources available). It's unlikely that Schulten would bother with a small grant, leaving those more for early-career investigators who can't afford more hardware.


I mean, I applied for one despite my lab being very well funded; it was maybe 30 minutes' worth of work?

But that was merely a note on how this could have gotten onto Nvidia's site, not a claim that Nvidia paid for it all.


Looks like about $10K.

https://www.cell.com/open-access


Simulation seems like a step that requires knowing the initial state of a system. I wish there were more efforts focused on finding ways to scan the constituents of a cell all the way down to near the atomic level. As far as I know, the closest things we have are atomic force microscopy and scanning electron microscopy, both several-decades-old technologies. They're also pretty invasive and require access to the surface of the object being scanned.

Knowing the position and charge density of atoms in a cell is required to simulate it.


Nice! This looks like really important work. As we get closer to being able to completely simulate more complex cells, it becomes possible to do the sorts of analysis that lead not only to understanding the cell, but to learning how to tune cells to do exactly what we need them to do. I wish I had a broader bio background while reading the paper!


Can't wait until we get fast enough computers to simulate this interactively. Biology education would be so much more accessible for everyone.


Computers being fast enough to do this in biology education is quite a long way out. They probably used a farm of Nvidia racks for several months to obtain the simulation.

Personally, I doubt that an atom-scale simulation of a cell is very useful for education. Rather, I'd approach it in a hierarchical fashion, where progressively coarser scales are used to model progressively larger systems.

This way one learns not only about a cell, but also about principles of modeling, and that each model comes with assumptions, simplifications, and errors.

Choosing the right scale of modeling for the problem is crucial. In this case they have chosen a very fine scale to model a rather large system. But what’s the scientific insight gained?


> a farm of Nvidia racks for several months

Ok, so what my phone GPU will do in a day, less than ten years from now.


On what level was this simulated? Atomic?


It's a phase-level simulation.

Like, in high-school/undergraduate chemistry classes, students sometimes calculate chemical concentrations for [equilibrium reactions](https://en.wikipedia.org/wiki/Determination_of_equilibrium_c... ). This paper's basically at the same level.

Unlike high-school modeling, though, this paper:

1. Uses dynamic models, where things evolve over time, rather than everything being at equilibrium (see the toy example at the end of this comment).

2. Involves a large number of interacting systems (rather than just one system by itself).

3. Involves a lot more species, reactions, and complexity than one would generally see in a classroom.

4. Involves spatially-variable concentrations, where chemicals have different concentrations at different points in space, rather than being the same throughout (well-mixed). (Actually, they did this both ways: they did the simpler well-mixed model without needing a GPU at all, then they used a GPU in their non-mixed models.)

Each one of those generalizations can make the problem, say, an order-of-magnitude more complex, if not more. Combined, they create a much larger model.

However, it's still largely based around that phase-level scale that students might be familiar with.
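To make point 1 concrete: the classroom (equilibrium) calculation next to the dynamic one, for a toy A ⇌ B reaction with invented rate constants:

```python
# Toy reversible reaction A <=> B, forward rate k1, reverse rate k2.
k1, k2 = 2.0, 1.0
A, B = 1.0, 0.0

# Classroom version: at equilibrium, K = [B]/[A] = k1/k2, solved directly.
B_eq = (A + B) * k1 / (k1 + k2)     # = 2/3

# Dynamic version: integrate d[B]/dt = k1*[A] - k2*[B] over time and
# watch it relax toward the same value.
dt = 0.001
for _ in range(10_000):
    dB = k1 * A - k2 * B
    A, B = A - dt * dB, B + dt * dB

print(B_eq, B)                      # both ≈ 0.667
```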


From the paper:

“Because of the large variation in timescales and concentrations, developing a whole-cell model that treats metabolism, genetic information processes, and growth can, at the moment, only be achieved by hybrid stochastic and deterministic simulations. Kinetics of the essential metabolic network (Breuer et al., 2019) are handled deterministically via ordinary differential equations (ODEs), and the kinetics of the genetic information processes are handled with stochastic simulations.”
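A toy sketch of what such a hybrid scheme can look like (nothing here is from the paper; species and rates are invented, and the stochastic side is simplified to a constant-propensity event stream rather than a full state-dependent SSA):

```python
import numpy as np

rng = np.random.default_rng(0)

# Deterministic side: a high-copy-number metabolite M, handled as an ODE.
# Stochastic side: a protein copy number P, bumped by discrete random
# expression events (Gillespie-style, with a fixed propensity for brevity).
M, P = 0.0, 0
k_prod, k_deg = 1.0, 0.5            # invented metabolic rates
k_express = 0.2                     # invented expression propensity

t, t_end, dt = 0.0, 100.0, 0.01
next_event = rng.exponential(1.0 / k_express)

while t < t_end:
    M += dt * (k_prod - k_deg * M)  # explicit Euler step for the ODE part
    t += dt
    while next_event <= t:          # fire stochastic events inside this step
        P += 1
        next_event += rng.exponential(1.0 / k_express)

print(f"M ~ {M:.2f} (settles near k_prod/k_deg = 2), P = {P}")
```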

This reminds me of a description of how different levels of realism were chosen and combined to simulate a whole person in a Greg Egan science fiction book (Permutation City, I think). IIRC there is a scene where a character spends some time in the shower, knowing that the partial differential equations simulating the water will be driving up the cost of running the simulation.

Edit: added the bit about Greg Egan.



It is what I would call the biochemical level, not the quantum chemical level. It is sufficiently detailed that the functions of genes can be observed and analyzed.


This thing still has billions of atoms in it, so that's not even close to possible.

It's also telling how far this is from something that could plausibly originate from a pre-biotic soup. There's enormous complexity here. Origin of life is very far from a solved problem.


Why is it impossible?

I tried to calculate how many molecules of water can fit into a sphere with a radius of 500 nm, and I got 17 billion. That doesn't look like a big number for modern hardware.

And if modeling billions of atoms is impossible, can't we model separate reactions? Put several virtual proteins and RNAs near each other and see how they interact?


To verify your work, btw: water is about 55 mol/liter (1 liter = 1 cubic decimeter).
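For what it's worth, the arithmetic does check out at roughly 55 mol/L:

```python
import math

r = 500e-9                              # 500 nm radius, in meters
V_L = (4 / 3) * math.pi * r**3 * 1000   # m^3 -> liters: ~5.2e-16 L
molecules = V_L * 55.5 * 6.022e23       # mol/L * Avogadro: ~1.75e10
print(f"{molecules:.2e}")               # about 17.5 billion water molecules
```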

Modelling billions of atoms is possible, but not scientifically fruitful.


I don't think we can accurately simulate even a single enzyme.


No, several levels higher than that.


How do you define one level?


However they see fit? You just hope they were reasonable in their assumptions. At minimum it's not atomic: simulating the behavior of all the atoms in even a single protein molecule is quite hard (hence AlphaFold). So it's not at that level, or even at the individual-protein-molecule level. They're mostly using equations that try to match the observed reaction rates of all the pathways in their minimal cell.
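For a concrete example of "equations that try to match observed reaction rates": enzyme kinetics in models like this are commonly described by a Michaelis-Menten rate law with fitted parameters, something like this sketch (numbers invented):

```python
# Michaelis-Menten rate law: v = Vmax * [S] / (Km + [S]).
# Vmax and Km are fit to measured rates rather than derived from atoms;
# that's roughly the level these models operate at.
def rate(S, Vmax=10.0, Km=0.5):
    return Vmax * S / (Km + S)

for S in (0.1, 0.5, 5.0):
    print(S, rate(S))               # saturates toward Vmax at high [S]
```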



