
Emergency doctor here, and I have to say that diagnosing a sick child is one of the hardest things you can do in medicine (aside from figuring out why someone is feeling 'dizzy'). The limiting factor is that we can't do blood tests and x-ray/CT on everyone who walks in the door (which itself can lead to harm: https://emergencymedicinecases.com/overinvestigation-emergen...)

I work in one of the busiest ERs in Canada, and maybe one child out of hundreds will have something hidden... Many illnesses, particularly meningitis, can kill so rapidly that you may miss the early symptoms (which are often mild).

What many posters have alluded to about 'gut feeling' is well described in the medical education literature, where 'system 1' is fast pattern recognition built on experience, and 'system 2' is slower, deliberate, algorithmic reasoning. Many cognitive biases can affect both modes of thinking (described well in this paper: https://www.ncbi.nlm.nih.gov/m/pubmed/12915363/).

Your gut feeling can be triggered by anything from unusual vital signs, to the way a patient talks, to beads of sweat on their forehead, and a million other things that take years and years of training to catch.

Anyone who can build an ML model to catch these clues is going to make billions.
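To be concrete about how hard that is to featurize, here's a toy sketch of the obvious first attempt: a classifier over structured triage vitals. Everything in it (features, numbers, labels) is invented for illustration; the whole point of the comment above is that the real signal lives in things that don't fit a feature vector.

    # Toy sketch, not a clinical model: logistic regression over a few
    # hypothetical triage features. All data here is made up.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Columns: heart rate, respiratory rate, temperature (C), capillary refill (s)
    X = np.array([
        [110, 22, 38.1, 2.0],   # routine-looking febrile child
        [165, 45, 39.8, 4.5],   # abnormal vitals across the board
        [120, 24, 38.4, 2.2],
        [158, 50, 40.1, 5.0],
    ])
    y = np.array([0, 1, 0, 1])  # 1 = serious illness confirmed later

    model = LogisticRegression().fit(X, y)
    print(model.predict_proba([[150, 40, 39.5, 4.0]])[:, 1])  # estimated risk

The hard part isn't the model; it's that 'the way a patient talks' and 'beads of sweat' never show up as columns in the chart.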




> Anyone who can build an ML model to catch these clues is going to make billions.

You'd be surprised. IBM poured billions into Watson and appears to have been pretty successful in nearly reaching parity with certified oncologists, but the results were dismissed because it didn't outperform them.

> At first, Manipal used Watson to recommend treatment options for all cancer patients, said oncologist S.P. Somashekhar. It found the software agreed with doctors most of the time, so Manipal stopped using Watson on every patient, he said.

https://www.wsj.com/articles/ibm-bet-billions-that-watson-co...


That's not a great definition of parity. We'd want accuracy and specificity numbers linked to outcomes, not concurrence. The cases where Watson agrees with the doctors are effectively irrelevant - the results would be the same whether or not it was added. What we need to know is whether, given a disagreement, Watson was better or worse for outcomes.
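A toy calculation (numbers invented purely for illustration) shows why concurrence alone says so little:

    # Suppose Watson agreed with the oncologist on 950 of 1000 cases.
    # Those 950 cases carry no information about relative skill; only
    # the 50 disagreements do (assume exactly one party is right in each).
    disagreements = 50
    watson_right = 10   # hypothetical
    doctor_right = 40   # hypothetical

    # "Agreed with doctors most of the time" is fully consistent with
    # Watson being much worse exactly where it matters:
    print(watson_right / disagreements)  # 0.2
    print(doctor_right / disagreements)  # 0.8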


The trick is whether the cases of disagreement were themselves predictable!

Given a fixed number of oncologists, and deploying Watson only to support those oncologists, yes, you're correct that it's only useful if it outperforms them. But I think of Watson more as a single hive-mind team of a thousand residents near the end of training: they get most things right, and there are a few places where more experienced doctors will do better, but the hive mind scales far beyond any individual.

You can have one of two reactions to that: 1) hire fewer senior oncologists, have them focus on the more difficult cases, and leave the hive mind to deal with thousands of routine cases, or 2) ignore the hive mind until it's literally better than a typical senior oncologist.

The medical profession seems to repeat this cycle, from "only full doctors can do anything, because even seemingly routine cases might be hiding something more serious" to "maybe some routine things can be done by people with less training, and full doctors should focus on the more difficult cases". See nurse practitioners, dental assistants, and, in my mind at least, where we are going with things like Watson.


I think part of the problem for Watson is that it needs someone to gather the data, which is usually a doctor. So if you're pairing each patient with an oncologist for intake anyway, it's not clear that "examination plus enter all data into Watson" is a benefit over "examination plus make a decision".

I guess the ideal outcome for Watson (if it doesn't outpace expert oncologists) would be something like "an experienced nurse practitioner does an exam and enters data into Watson", or maybe even "a specially oncology-trained NP does an exam with Watson".

The other part I don't know is what oncology accuracy rates look like. If the reason not to majorly expand screening is cost and availability, Watson could be huge. If it's false positives at our existing accuracy rates, there's a lot less value.


Let's look specifically at mammograms. (Stats from CTFPHC, a division of PHAC, part of the Canadian government.)

https://canadiantaskforce.ca/tools-resources/breast-cancer-2...

Wider screening isn't great. Essentially, people with medical problems already self-select pretty well, and wide early-screening initiatives for most cancers introduce as many or more false positives (that persist through follow-up!) as actual cases of cancer they catch. Moreover, it's not even clear that early screening is effective: the false-negative rate is high enough that the overall incidence of advanced cancer is unchanged even with early screening (Bleyer and Welch, NEJM, 2012).

Basically, we need better screening, not more screening at our current accuracy levels, and it's not clear whether Watson can provide that.


Precisely - since mammography is the standard "counterintuitive Bayes rule" primer, I remembered those numbers don't support wider screening.
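For anyone who hasn't seen that primer, the arithmetic goes like this, with round illustrative numbers (not the CTFPHC figures): roughly 1% prevalence, 80% sensitivity, and a 10% false-positive rate.

    # Bayes' rule for screening: even a decent test yields mostly false
    # positives when the base rate is low. Numbers are illustrative only.
    prevalence = 0.01
    sensitivity = 0.80
    false_positive_rate = 0.10

    p_positive = (sensitivity * prevalence
                  + false_positive_rate * (1 - prevalence))
    ppv = sensitivity * prevalence / p_positive
    print(round(ppv, 3))  # ~0.075: over 90% of positives are false alarms

Screening a wider, lower-risk population pushes the prevalence down and makes that ratio even worse, which is the core argument against indiscriminate expansion.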

I'm not discounting the possibility of a useful role for Watson in wider screening, but it's not clear to me where it would be. If it happens after any kind of extensive examination, doctor-hours are being committed regardless and there's little gain. If it happens at a population-level screen like mammograms and colonoscopies, "almost as good as oncologists" isn't enough to add any value.


> The cases where Watson agrees with the doctors are effectively irrelevant

If your question is "can Watson improve treatment over an oncologist?" (ignoring issues of expense and availability), then the cases where they agree don't matter.

When you're talking about how close it is to "parity", it matters a lot.

If Watson exactly matched the oncologist in all but one out of a million cases, and in that millionth case caused the patient to explode, that would be extremely impressive, and Watson would be extremely useful. It would be nonsense to exclude all the matching cases.


> When you're talking about how close it is to "parity", it matters a lot.

I guess the question there is how valuable 'close to parity' is. If you can treat more patients, or treat them faster, or even much cheaper, then close to parity is a big deal. If examining patients and avoiding destructive false positives are the limiting factors, it's not clear that parity-level decisions are a significant benefit.


I think better diagnostic testing is one way to solve this. Oncology is in some ways easy, since the data input is usually a set of biomarkers particular to that tumour (e.g. estrogen receptor, HER2/neu, etc.). It's probably the most rapidly advancing field in medicine. Same with radiology and dermatology, where the input can be standardized images. Dealing with analog humans is another challenge entirely!

Better biomarkers for acute illnesses with fewer false positives would help you better test 'accuracy' versus an ML model.


A recent post here suggested that Watson was recommending highly disadvantageous treatments, and that MDs became reluctant to keep using it because it wasn't working.


It is used to recommend treatment options, not to make diagnoses


Treatment oriented medicine is a problem in and of itself.

There is no treatment for medically diagnosed celiac disease other than not eating wheat for the rest of your life; it's hard to profit off that in the medical biz. So, there being no treatment, mental gymnastics to rule out anything treatable were apparently mandatory across the several pediatricians, gastroenterologists, and oncologists we talked to for a year or so WRT our son. Finally, after ruling out child abuse and cancer, almost out of desperation, they did a quick blood antibody test, leading to an intestinal villi biopsy, and they're like "sorry, it's not good news like treatable cancer, it's untreatable gluten allergy". Somehow the poor little guy must have given blood for a hundred tests before the antibody test came back with higher levels than the ENT had ever seen in a test result, and the biopsy confirmed it. We removed wheat from his diet, and in about a week he was the healthiest kid ever.

The insane part is that about one percent of the population is somewhere along the celiac spectrum of symptoms; my kid is near the extreme edge, most are not so bad. But docs will do anything to avoid a simple blood test / biopsy diagnosis, because there's no profitable treatment or pill to push.

Somehow, almost accidentally or unintentionally, western medicine in the USA can provide good outcomes, but the average citizen playing the game doesn't understand that it's primarily a legal CYA game, secondarily a commissioned-salesman treatment game, also a money-making racket, and only distantly, as a minor goal, about actually helping people live better lives. If people are helped by the medical establishment, it's almost by accident, as a side effect; helping people is definitely not the primary purpose of the establishment.


That’s just liability-related verbiage


>...Anyone who can build an ML model to catch these clues is going to make billions.

Yeah, you would think so... Unfortunately, in medicine, a computer program being better than a doctor at diagnosing patients is no guarantee it will be used. The classic example here is the MYCIN expert system, developed in the 1970s. MYCIN was shown to outperform infectious disease experts in a blinded evaluation by 1979:

>... Eight independent evaluators with special expertise in the management of meningitis compared MYCIN's choice of antimicrobials with the choices of nine human prescribers for ten test cases of meningitis. MYCIN received an acceptability rating of 65% by the evaluators; the corresponding ratings for acceptability of the regimen prescribed by the five faculty specialists ranged from 42.5% to 62.5%. The system never failed to cover a treatable pathogen while demonstrating efficiency in minimizing the number of antimicrobials prescribed.

https://jamanetwork.com/journals/jama/article-abstract/36660...

https://en.wikipedia.org/wiki/Mycin

If MYCIN hadn't been rejected by the medical profession, I am sure that by now we would have developed software to assist doctors in all areas of medicine...


Someone else in the comments made a good point about capturing the words patients actually use with providers and being able to parse them. If you could transform what a patient is telling you into a concise set of standardized symptoms, you would essentially be doing what any provider does when obtaining a focused history. That could let a model prompt for the specific follow-up questions needed, or ask the provider to look for pertinent physical exam findings.
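A minimal sketch of that idea, with the mapping hand-rolled here purely for illustration (real systems normalize to a clinical vocabulary such as SNOMED CT, and the hard parts are negation, context, and ambiguity):

    # Toy symptom normalizer: map lay phrases to standardized symptom terms.
    # The phrase list is invented; a real system would use a clinical
    # vocabulary and an NLP pipeline, not substring matching.
    SYMPTOM_MAP = {
        "throwing up": "vomiting",
        "can't keep anything down": "vomiting",
        "burning when i pee": "dysuria",
        "room is spinning": "vertigo",
    }

    def extract_symptoms(patient_text):
        text = patient_text.lower()
        return sorted({term for phrase, term in SYMPTOM_MAP.items()
                       if phrase in text})

    print(extract_symptoms("He keeps throwing up and the room is spinning"))
    # ['vertigo', 'vomiting']

Once complaints land in a standardized form like that, suggesting the next question or exam finding becomes a tractable modeling problem.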

On the other hand (maybe easier?), I think we could also expand the use of some of our less costly and non-harmful techniques, such as ultrasound, to improve our screening.



