Google Med-Palm M: Towards Generalist Biomedical AI (arxiv.org)
110 points by panabee on July 27, 2023 | 87 comments



Having spent some time with a hallucinating ChatGPT, and some time with doctors over the course of my life, at this point, in my humble opinion, it should be made illegal for a doctor to make a diagnosis without consulting an LLM fine-tuned on all the medical research and literature available.

Ah, but copyright/patents/IP. Well, IP was created to foster production of useful immaterial stuff. If you now want to use it to hinder production of useful immaterial stuff, you can go f*ck yourself, if you ask me.

Ah, but lawyers and liability. I propose only that the doctor is required to consult the LLM. Easy to log and verify. All liability stays at the doctor who makes the final diagnosis.


I'm a physician and use chatGPT extensively for coding, writing, and general knowledge inquiry.

With 60-70% correct rates on most training sets and 0.63 critical errors per report, for any physician not very well-versed with the limitations of LLMs, this is more of a liability than an asset. Some of the biggest barriers to care are cognitive, such as anchoring or availability biases. LLMs in their current state will only muddy the water.

Good physicians already know about and use these tools; bad ones will only get worse. A legal mandate will not benefit care.

Doubtless these models will progress to where this calculus will change. The only benefit from a mandate now that I can foresee is to accelerate fine-tuning by forcing widespread reinforcement learning by physicians, but that is a different discussion.


> it should be made illegal for a doctor to make a diagnosis without consulting an LLM

> All liability stays at the doctor who makes the final diagnosis

Sorry, what? You’d force clinicians to use a specific technology (a specific “how” for finding their answer) and also make them liable for the correctness of that answer?

You seem to have a strange idea of the law if, in the same breath as making something illegal, you sigh with exasperation at lawyers and liability.


Peak SV techbro moment.


I feel like this is unfair, because Epic Systems could recommend ICD10 codes based on the listed symptoms. The doctor would pick the final diagnosis code from the recommended list or their own. That doesn't mean Epic shoulders the liability.


What does that mean?


> Sorry, what?

I did not say the doctor would need to trust the answer. Only that they should be required to ask.


What if they ask another doctor instead?

What if the case is so straightforward (you've seen thousands of these, hundreds a year, the entire system is built around them) that you know the diagnosis in less than the blink of an eye?

What if it's emergent, and you have no time to think, like a major hemorrhage? Not only is it obvious, but you must act now, right now?

What if there is a highly studied, routinized process (e.g. cardiac arrest) where you're managing a team going through the diagnostic procedure and treatment, which, over decades, has become a carefully interleaved dance performed at staccato pace, and, again, there is no time to consult an LLM?

What if? What if? What if?

Are you so certain?


Ask an LLM unless the time it would take to ask would risk significant harm.

That seems to cover all your cases?


> That seems to cover all your cases?

It doesn't. LLMs are trained on literature. Women and people of color are severely underrepresented in medical literature.

E.g., patients of color are rarely selected for clinical trials etc.


We should require software engineers to do the same. So much garbage code I've reviewed that would have easily been resolved had the SE just "asked an LLM".

Maybe we can legislate this into existence as well?


The problem is that it is a short trip from a requirement like that to a required inquiry for those that ignore the answer.

Probably most of the time when they ignore it, they'd be right. After all, neither the NN nor they are right 100% of the time.

But is that other case a lawsuit risk? And we've developed into a very risk averse society.


A bit of a blunt way to word it (and you’re going to get a lot of pushback), but overall I agree with the sentiment. The medical field, due to various regulations and special interests, is quite likely the field of study where humanity is most behind where it could be, given the technology we currently have.

I’m a data scientist, and it stuns me the way in which diagnoses are made compared to how they could be made if we had a large worldwide dataset of symptoms and other observations to draw correlations from. Especially with regard to preventative medicine.

Not only that, but there are a lot of bad doctors out there. If you go to four different doctors for an even slightly obscure problem, there’s a good chance you will get four different diagnoses. If we applied rigorous statistical tests to the assessments made in the medical industry, I think everyone would be unsettled at how inconsistent and irreproducible everything is (as applied to the medical practice—not necessarily academic medical research).


Insurance reimbursement rates already force doctors to spend too little time treating each patient.

I'm concerned that misusing LLMs would allow even shorter consultations, and that would become enshrined in reimbursement rates.


This is peak Silicon Valley hubris.

Because the parent commenter "spent some time" with ChatGPT and doctors, we should change our entire paradigm of modern medical care, carefully refined and honed over 500 years.

Yeah bro, you know better than all of modern medical science because you played around with ChatGPT for an afternoon.


For someone who's "been around doctors" and "used ChatGPT", your opinion is anything but humble.


I feel “require” is a bit strong. But, I think there are some interesting possibilities.

Advertising your medical practice as an “LLM-consulting” one (better name needed), the same way there are “Montessori” schools, could be interesting.

Another option I think that would be interesting would be making the LLM patient-facing and required as one of the check-in “docs”. And then attach the results to the patient’s file for the medical professional to view.

Could also be great for pre-screening and/or suggesting a virtual visit if appropriate.


I think it'd be challenging to force anything without unwanted side effects, but there's a lot of merit to this.

Generalists tend to lack the specialized knowledge to diagnose certain things even when the evidence would be clear to a specialist.

NN based approaches have the potential to bridge a real gap here.


I really wish people would use terms like "unjustified false statement" rather than "hallucination", which connotes perception that doesn't exist.


Or just “random sample conditioned on the prompt.” How did we get to the point where we started calling random samples hallucinations?


Great: we already have issues with diffusing responsibility into opaque algorithms, and now we've got people cheering a full-speed run into it.

We've sure come a long way from "a computer cannot be held accountable, therefore a computer must never make a management decision".


Utility here depends on what type of practice it is. ED doc on no sleep and only 10 minutes for a patient, maybe. But it would be useless in something like psychiatry


you have no idea what you are talking about


It is still not quite at a human level (radiologists prefer human reports in 60% of cases in blinded trials against the best of the three models, corresponding to an Elo difference of 67 points; the worst model has an Elo difference of 130 compared to humans).
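
For reference, under the standard Elo model a rating gap of d points implies an expected preference rate of 1/(1 + 10^(-d/400)). A quick back-of-the-envelope check (my own arithmetic, not a figure from the paper):

    # Preference rate implied by an Elo rating gap (standard Elo expectation)
    def elo_expected(diff_points):
        return 1 / (1 + 10 ** (-diff_points / 400))

    print(round(elo_expected(67), 3))   # ~0.595, i.e. humans preferred ~60% of the time
    print(round(elo_expected(130), 3))  # ~0.679 against the worst model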

The evaluation was done on 246 X-rays, which is good; it would be better if they were not all chest X-rays (to see generality), and if there were more than only four radiologists from the same country.

The achieved rate of 0.25 clinically significant errors per report is impressive, although it is only for the best of the three models, which can introduce bias; averaged across models it is 0.27, a bit worse than human error. Additionally, I am wondering where they get the human baseline; they state:

> These results are on par with human baselines from prior work [14]

but the citation[1] doesn’t give data in the same format (and its format is honestly better: it indicates that humans make no urgent errors or worse in 64% of reports).

Surprisingly, there is no improvement with model size; the largest model performs the worst.

[1]: https://arxiv.org/pdf/2303.17579


I talked to a radiologist about it. The big deal for him was more about voice-to-text transcription (Whisper) and being able to rework the report using GPT.


I've posted this story before, but AI has diagnosed a problem patient in our clinic (and has now diagnosed a second problem patient).

Basically no one could figure out what was wrong with the patient despite 2 years of seeing GP + Specialist + strong drugs. In about 5 minutes GPT-3 gave my wife 10 possible diagnoses; she looked up the ones she wasn't familiar with, found one she thought matched, and later did the confirmation test: Yes, he had it. No more medicine needed... (Surgery was needed).

The insane part to me was that when I later changed the prompt to ask for the single most likely diagnosis, it got it correct.

The update on this story, we had another patient with the same issue come in, wife knew the symptoms at this point and knocked out another diagnosis.

The GP should be diagnosing this; my wife is in a specialty that doesn't deal with this. That means multiple physicians are missing this relatively common diagnosis. I hope that medical software takes note pages and automatically places possible diagnoses at the top of the page. Given how many of my tech friends are resistant to AI, I imagine the healthcare field is way worse.


I assume you and your wife are not in the US. Stateside, you would need a Business Associate Agreement to send protected health information to a third party.


Not always. It is not necessary to have a BAA to look up diagnosis information using patient data, so long as the data used maintains the patient’s anonymity.

For example, a doctor googling your symptoms doesn’t require a BAA with Google.


If you properly deidentify the patient information, then yes, you are not sending PHI to Google. Proper deidentification is tricky though: https://www.hhs.gov/hipaa/for-professionals/privacy/special-...
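
A minimal sketch of the structured-field part, with hypothetical field names and nothing resembling a real compliance tool; in practice the hard part is scrubbing identifiers out of free-text notes:

    # Illustrative only: drop obvious Safe Harbor-style identifiers from a
    # hypothetical record before building an LLM prompt. Real de-identification
    # (expert determination, or the full Safe Harbor list applied to free text)
    # is considerably harder than this.
    SAFE_HARBOR_IDENTIFIERS = {
        "name", "address", "dates", "phone", "email", "ssn",
        "mrn", "account_number", "ip_address", "photo",
    }

    def deidentify(record: dict) -> dict:
        return {k: v for k, v in record.items() if k not in SAFE_HARBOR_IDENTIFIERS}

    record = {
        "name": "Jane Doe",   # identifier: dropped
        "mrn": "123456",      # identifier: dropped
        "age": 47,            # ages under 90 are permitted under Safe Harbor
        "symptoms": "fatigue, joint pain, photosensitive rash",
        "labs": "positive ANA, low complement",
    }

    prompt = "List possible diagnoses for: " + str(deidentify(record))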


Google's black box ad optimization stuff probably links you with the doctor via your searches for them and then links the doctor's searches about your conditions to you based on your own related searches, or do they have safeguards for this?


If GPT-3 was trained on WebMD, 9 out of the 10 would have been a cancer diagnosis.


Model performance doesn't seem to be monotonically increasing with size. The 84B parameter model is the best on most tasks.

What happened to "you just train it on more data and performance goes up exponentially"? There were so many charts proving this and we could even project at what level of data/compute world-conquering superintelligence would inevitably emerge.


Maybe there is simply not enough medical data to properly train a 500B model.


There is text under the table.


They didn't try particularly hard to find out why the biggest model didn't perform as well. Or at all, as far as they report. They give us one sentence speculating "maybe not enough training".

Come on. This is Google. They have unlimited compute resources, biomedical AI is a core strategic objective, and they've spent untold $$$ getting data and working on this for years. There are news articles on their efforts going back a decade. And all they could do is wave their hands at "maybe not enough training".


I won't take any of this seriously until they make their model and weights available.

This type of PR research is what is really holding back AI for medical images.


My impression of what's holding back improvements to CV and ML for medical images is that close to none of the research in the field will ever be put into practice. There's a huge amount of research done, and then the companies actually producing the software used by technicians and doctors pick up a tiny number of things to incorporate, test, and get certified, and it takes years. People working in the field have admitted to me they have few to no hopes.


I work in this field - IMO, your assessment is correct insofar as research absolutely needs to be commercialized to matter. However, we haven't gotten to super-human performance for medical images yet. We don't even have the ImageNet equivalent, nor do we have any open-source (or just source- and weight-available) models like the ones Google and other big corps claim to be so wonderful. Most people in the field, rightly so, then dismiss anything like Med-Palm M and other PR-like papers from these groups. Why base your career (both academically and product-wise) on this?


> "we haven't gotten to the super-human performance for medical images yet."

I guess that, at least in France, most people living outside large towns do not need that.

What they need (IMO) is competent radiologists in a time of medical deserts.

Because even when there is staff, the staff do not spend enough time analyzing images while being quite expensive. Their main goal is to make money quickly.


A couple of years ago I built a product for health systems in the computer vision space and ran into the challenge of trying to commercialize it (with no luck). Even if the product/technology is excellent, the regulatory hurdles & red tape make it insanely difficult to get this sort of stuff commercialized in a healthcare context.


What were your GTM (go-to-market) learnings?


I wonder if these models can be utilized by poorer, under-served communities?

It could be a kind of force-multiplier for the few health professionals they have on hand.


Many people in the field are physicists, and they consistently state that even if the performance is better than human, they will not use ML in the field, simply due to explainability. Physicists like having very explainable analytical equations, so handing them a pile of matrices and rules for composing them is not convincing.

They may still lose out in time, but not without a battle over explainability.


"get certified" meaning hospitals can bill for this, which is what will actually get them to adopt something new.


Also, Google is constantly making announcements that Bard or whatever AI tech (MusicLM, etc.) is supposedly beating all the competitors, but nobody can use or see such models, only very inferior ones. Curious, right ( ͡° ͜ʖ ͡°)?


The only issue is that they are probably trained on the most common data. This includes the fact that the FDA and some other major organisations avoid certain kinds of diseases (parasites, which infect 3.5B people, Lyme disease, and some more).

Looking forward to updates.



I believe that's called an Xstorm now.


When all is said and done with respect to AI in medicine, doctors and medical professionals will be the very last people to be replaced. Care (both physical and mental) will be the last bastion of humanity to be disrupted by AI.


Why?

Medical practice is very knowledge-intensive, especially general practice. It is also very prone to human error; have you seen the stats on accidental deaths in hospitals? It is an excellent, high-value use case for (future) learning systems that can ingest and apply a large body of knowledge and reason in the face of a huge, poorly understood graph of causal factors.

No, the last people to be replaced will be those doing unpaid labour, such as parenting, household chores, unpaid elder care: there's the least economic incentive to do so. So I do agree in large part with your final sentence.


It is technically possible today for patients in hospital to have their food delivered by robot, yet most people would have a strong aversion to that idea: the presence of other humans helps us to have the will to get better.

I choose my GP based on my ability to trust and empathize with him, not because of his grades in medical school.

Medicine is an intrinsically human activity. It may be augmented and improved by AI, but people are still going to want human medics around, it’s just the nature of being sick.


That doesn’t mean it can’t be assisted by technology. A large part of medicine is ordering tests, interpreting results, analyzing imagery, proposing medication courses, records, pathology, etc. Anesthesiologists provide crucial care but basically do their jobs by monitoring signals and adjusting dosing accordingly. Those are ideal machine skills, and no one has a relationship with their anesthesiologist; they’re asleep.

I agree the direct interface to a lot of medicine needs to be a human, as we need human assurance when we are sick and scared. Bedside manner can’t be replaced or devalued. But we suffer from critical shortages of medical workers, many of whom can be augmented powerfully with knowledge machines.

My understanding is the real barrier is that providers themselves find the user interfaces counterintuitive, aren’t afforded time to train on new systems, and are under so much pressure to produce with such limited staff that they can’t spend the time and energy to work these tools into their workflows. Add on to that the regulatory hurdles to innovation, general risk aversion in the development field, the complexities of selling unproven tech into unsophisticated hospital network administration, etc., and it’s no wonder it’s a slow slog. It’ll take someone like Kaiser really committing material capital, time, and resources, and mandating adoption with a significant training program, to see any success.


> It is technically possible today for patients in hospital to have their food delivered by robot, yet most people would have a strong aversion to that idea

That's a matter of personal preference. I prefer self-service checkout in the grocery store even if there's an unused staffed checkout. Generally, I prefer robots and AIs wherever they are available.


That's because you have no way to evaluate the efficacy of different GPs. If you knew with certainty there were a superhuman AI that's performing 100x better than human doctors (and costs 100x less), you would absolutely choose the AI doctor over the human one every time. And if you don't, your kids (or grandkids) will.


You should reassess what is important to you.

Empathy isn't going to help diagnose.


I hope I never have to live in a world where I have to convince a united healthcare chat bot that I need pain medicine.


I wish to live in your beautiful country filled with loving doctors doting on you with motherly care, because I definitely would prefer asking a bot for meds.


> Why?

Regulatory arbitrage.


On the contrary, critical scenarios are exactly where tech and AI help a lot.

You'd think pilots would be the last type of 'vehicle driver' to be automated. On the contrary, they were the first to adopt almost 100% automation.

> mental

We are finding curious things like 'playing Tetris is the most effective way to avoid trauma'. Doctors will help in lab and diagnostic work, but the inference of what solution to administer might actually come from an AI. Of course, similar to flight, the most challenging situations will still be 'manned' (like fighter jets or non-routine surgery).

Healthcare is inaccessible and understaffed in most parts of the world. If AI doctors can be better than the bottom-10-percentile doctor coming out of a US med-school, then they should be adopted wholesale.


They won't be replaced, but doctors are becoming both more expensive and increasingly harder to get access to, so having a cheap and scalable baseline to diagnose the common problems seems like a good thing?

So far WebMD with its entire algorithm being if(symptom) return "cancer"; has somewhat destroyed all public faith in these sorts of systems, but that doesn't mean it's impossible to get done to a useful degree.


Doctors could be replaced before nurses and caretakers.


The notion of being replaced won't hold true for most professions affected by AI. The responsibilities will shift to higher-level work.


I wonder if it will be able to decipher doctors’ handwriting though


It's Lupus.


It's never lupus, except when it is.


[flagged]


The Google hate crowd is so strange to me. I just don't understand it when Google are probably the best data stewards in the current landscape. I trust Google more than I trust my own government with my data.

I guess their support is bad, is that why you're so against them?


On the spectrum of how companies and governments handle private data, they're pretty good actually. Just, the bar is so low you're going to need an archeologist to dig it out; Not hard to far exceed said bar when you can just stand at ground level.

But most fundamentally I am against them because they're an advertising company first and foremost, and if there's one thing I trust them to do, it's to stay an advertising company. That means that without laws with actual teeth to prevent misuse of data, they're always just a few decisions away from not being such good stewards of that data.

I'm not singling Google out much here though. My opinion would be the same with Meta, Apple, Microsoft..etc. I don't hate Google, I hate the whole damned system.


Alright, that's a fair opinion. It's ads all the way down with the large tech companies.


Better stewards than Apple?


Yeah I think so, although I think they're both very good.

I feel like Google has already established how they use data, whereas I'm not so sure with Apple.


And the tone deaf product cancellations.


Right, I think that's a fair criticism, but I don't see how it relates to their data privacy policies.


They're probably going to do their entry via B2B. It's not like other healthcare companies don't like to save a buck.


Ah yes, let's make sure to burn down Google for the evil they brought to the world, like Google Maps, Gmail, Android, a lot of AI research, Chrome, etc.

I have no clue what 'goodwill' you even mean.

Google is a company and always has been.

And yes, Google is one of the FEW that is even able to help the whole of society progress in this field, which will save millions of lives.

But sure, it would be great to burn this down due to some weird opinion you have about Google...


I am pretty sure they already had their hands on NHS data; that's about 67 million people in the UK. A bit late for that.


It was anonymised by an external company before it reached Google. I know one of the people who did it; if he was getting kickbacks, he'd have a nicer car.


[flagged]


Hacker News is for hackers, who can be of any political affiliation. It might be hosted by Y Combinator, but that certainly doesn't mean it is Y Combinator.


They should name it Dr House AI so that boomers get it


"We're sorry, the Google Critical Needs Patient Healthcare plan you are on has been terminated."

That's pretty much all we can expect from Google being involved in health care. Their entire business is built upon not caring about human lives.


From the abstract:

> Generalist biomedical artificial intelligence (AI) systems that flexibly encode, integrate, and interpret this data at scale can potentially enable impactful applications ranging from scientific discovery to care delivery.

The problem with AI models is that they may bring us medical innovations that are actually harmful to humanity, such as increased lifespans, which in turn will drive further overpopulation.


Too bad there are already millions of people working in medicine. Over the past 200 years they have consistently added about 3 months to average lifespan every year. It's too late doomer!!


You don't think they may also bring us transportation, agricultural and logistical innovations that will make increasing the population a non-issue?

If we actually reach a point of technological singularity, it may well be the case that we are living amongst the stars within 10 years. Nobody knows. I wouldn't worry too much.


An increasing population is already an issue. Perhaps not for humans, but for non-human life. More transportation and agriculture means more ways to get to new locations, which in turn means the eradication of the natural environment there. Of course, we could follow the advice of E.O. Wilson and preserve half the earth for other life-forms, but I have little trust that we could do that, especially in developing countries.

No, new developments in transportation are unlikely to solve anything, except give some people more space in the short-term, and cause even more habitat destruction. New efficiencies in agriculture will help in the short term, but then make producing new humans even more efficient, causing even more destruction.

As for your prediction that we will be living among the stars, that's depressing. On spaceships and barren planets? Think terraforming will help? We can't even take care of one of the easiest planets in the solar system, Earth.


Any AI worth its salt would not enable the proliferation of such puny intelligences as humans.


I know you're trolling, but don't worry, AI would have the answer to this too!! Don't you get it?


I am not trolling. I genuinely believe that AI will bring more harm than good to humanity and I am trying to point out the specific ways it will be whenever new AI developments come about. Surely to every new development there are also cons or negatives as well as positives?


> AI will bring more harm than good to humanity and I am trying to point out the specific ways it will be whenever new AI developments come about

And what you managed to come up with is... people will live longer! That's just hilarious :-D



