Deep learning sharpens views of cells and genes (nature.com)
89 points by lainon on Jan 4, 2018 | 12 comments



The “DeepVariant” example illustrates the problem with deep learning in biology: some exceptions (when actual imaging is involved) notwithstanding, we still haven’t found a good use for it. Most problems we can formulate in biology are solvable with linear classifiers, because the models are purely additive. The nonlinearity of deep learning offers no benefit.
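To make “purely additive” concrete, here is a toy sketch in Python (nothing from the article; every feature and number below is invented for illustration): when per-site features combine additively, a plain logistic regression captures the signal by construction, and there is nothing left for a nonlinear model to find.

    # Toy sketch: labels generated by a purely additive rule, so a
    # linear classifier suffices. All features/data are hypothetical.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 1000
    # Hypothetical per-site features: read depth, mean base quality
    # (Phred), and fraction of reads supporting the alternate allele.
    X = np.column_stack([
        rng.poisson(30, n).astype(float),
        rng.normal(30, 5, n),
        rng.uniform(0, 1, n),
    ])
    # Additive ground truth: a weighted sum of features plus noise.
    logits = 0.05 * X[:, 0] + 0.1 * X[:, 1] + 4.0 * (X[:, 2] - 0.5)
    y = (logits + rng.normal(0, 1, n) > 4.5).astype(int)

    clf = LogisticRegression().fit(X, y)
    print(f"training accuracy: {clf.score(X, y):.3f}")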

The article says that

> In tests, DeepVariant performed at least as well as conventional tools.

This is technically correct but highly misleading. A more honest way of saying it would be:

> DeepVariant performs no better than conventional tools, and given the nature of the problem there is no reason to expect further boosts in the future from this method.

This is pretty much what the rest of the field thinks, anyway (example of this view in [1]).

From a pure science point of view, DeepVariant is interesting: it applies a new technique to an old problem and shows that it works. This alone is exciting (to me personally at least). But in practical terms it’s useless; it does no better than existing methods and is far more complex and orders of magnitude less efficient.

[1] https://www.forbes.com/sites/stevensalzberg/2017/12/11/no-go...


>This is pretty much what the rest of the field thinks, anyway (example of this view in [1]).

Yup. I'm not a CV engineer myself, but I work in between our CV folks and pure software engineering. I've worked on automatic labeling and classification of circulating tumor cells (focusing primarily on the imaging system(s)) for the last five years. Every experienced CV person I've worked with or spoken with dismisses DL out of hand for what we're doing. The less experienced folks jump right to a DL solution without understanding the problem domain.


The biggest complaint about DeepVariant is that it isn't even a CV problem. I mean, you start with a bunch of images, but those get processed and turned into basecalls, which get aligned to the genome, and then turned into an image again (why, though?) for DeepVariant. An interesting approach, to say the least.
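For anyone curious what that re-encoding step looks like, here is a loose Python sketch (the channel layout is invented for illustration; DeepVariant's real encoding packs in more information, e.g. strand and mapping quality):

    # Sketch of rasterizing a read pileup into an image-like tensor,
    # loosely in the spirit of DeepVariant. The encoding here is made
    # up for illustration, not the tool's actual format.
    import numpy as np

    BASES = {"A": 0, "C": 1, "G": 2, "T": 3}

    def pileup_to_tensor(reads, quals, width):
        """reads: aligned read strings; quals: per-base Phred scores.
        Returns (n_reads, width, 2): channel 0 = base identity,
        channel 1 = base quality, both scaled to [0, 1]."""
        img = np.zeros((len(reads), width, 2), dtype=np.float32)
        for i, (seq, q) in enumerate(zip(reads, quals)):
            for j, base in enumerate(seq[:width]):
                if base in BASES:
                    img[i, j, 0] = (BASES[base] + 1) / 4.0
                    img[i, j, 1] = min(q[j], 40) / 40.0
        return img

    reads = ["ACGTAC", "ACGAAC", "ACGTAC"]
    quals = [[30] * 6, [12] * 6, [35] * 6]
    print(pileup_to_tensor(reads, quals, width=6).shape)  # (3, 6, 2)

A tensor like this is what gets fed to a stock image-classification CNN, which is the part people find odd: the row ordering of reads in the "image" carries no real spatial meaning.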


The advances in this field will probably come from people like the biologists quoted at the end of the article, who are using these new techniques to improve workflows and find new ways to explore biology.

The Google stuff, as others have said, seems...useless. DeepVariant performs just as well as, but no better than, conventional methods. As for being able to tell someone's age, smoking status, or blood pressure: that is already done pretty well by asking people, looking at birth certificates, or using a blood pressure monitor.

A few years ago, Google publicized its ability to detect macular degeneration or some such disease better than humans using deep learning. But it was only marginally better, not enough to change clinical decisions. And actually implementing that tech would be almost impossible given existing healthcare workflows, treatments, and economics. The ability to predict heart attacks from eye images is cool in theory, but they probably can't do that yet with good enough specificity, and how would you get eye images on a regular enough basis for it to be useful?


> how would you get eye images on a regular enough basis for it to be useful

Could they be collected from all those devices with front-facing cameras people spend their lives staring at?


I don't know enough about those cameras to say, but I'm not sure they'd be able to get retinal images of sufficient quality. My guess is they can't, but I dunno.


Even Nature is hyping the Verily paper now?

It's like I'm taking crazy pills. Variant calling was already very, very good and grounded in statistics. Why ON EARTH do we need to convert it to AN IMAGE and give it 1000x more compute? The gains are like 1%, too. That's barely above the error rate of sequencing.
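For reference, the statistical core of a conventional caller is small. Here is a heavily simplified Python sketch of the standard diploid genotype-likelihood calculation (real callers like GATK or samtools add priors, realignment, and much more):

    # Heavily simplified sketch of diploid genotype likelihoods from
    # Phred base qualities; real callers layer a lot on top of this.
    import math

    def genotype_log_likelihoods(bases, quals, ref="A", alt="G"):
        """Log-likelihoods of genotypes ref/ref, ref/alt, alt/alt."""
        ll = {"RR": 0.0, "RA": 0.0, "AA": 0.0}
        for b, q in zip(bases, quals):
            e = 10 ** (-q / 10)                     # error prob from Phred
            p_ref = (1 - e) if b == ref else e / 3  # P(base | true = ref)
            p_alt = (1 - e) if b == alt else e / 3  # P(base | true = alt)
            ll["RR"] += math.log(p_ref)
            ll["AA"] += math.log(p_alt)
            ll["RA"] += math.log(0.5 * (p_ref + p_alt))  # either chromosome
        return ll

    # Six Q30 reads, 3 ref and 3 alt: a clear heterozygote.
    print(genotype_log_likelihoods("AAAGGG", [30] * 6))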


> The gains are like 1% too

If that. I’m sceptical (and so are others). Given that all methods are imprecise, it’s not terribly hard to find a special case on which method X outperforms all the others. But on average?


Isn't this the software that turns alignments into images to run through an image analysis pipeline? They just jerry-rigged alignment data onto an image analysis system?

Does it offer speed/memory advantages over traditional variant calling?


It's worse than that. It turns imaging data (from the Illumina basecaller) into sequences, aligns them, turns them back into images (nucleotides go into the same channel for some reason), THEN uses 100x more compute to get a questionable 1% gain on indels.

Why.


Disappointed by the lack of images.


The cited paper "Building a 3D Integrated Cell" (https://www.biorxiv.org/content/biorxiv/early/2017/12/21/238...) has images.



