Deep learning on electronic medical records is doomed to fail (moderndescartes.com)
227 points by brilee on March 22, 2022 | 141 comments



Having worked with data from EMR systems and having worked at a large EMR software development shop myself, and now using deep learning at work quite a bit for the past few years, I'm inclined to agree.

This title is somewhat click bait though, because the fault is really with EMR systems and (esp) the American Healthcare system, not deep learning.

The entire system is designed around billing, and decisions are made by hospital and insurance executives who are generally not technical. There is no incentive to clean up the system or work on a well-structured open protocol for interop the same way there is in, say, banking. Plus, the author gives some good examples like pulse ox%: doctors and nurses are not at all concerned with or trained to record data in a way that makes sense to use programmatically. They're typically thinking only as if they're recording it for another human to read.

Deep learning could probably be quite useful in the medical field, but we won't know until someone comes along and disrupts the system top to bottom, similar to how Tesla disrupted not only manufacturing but also the sales process by shirking the dealership model. This would probably look something like Forward[1], but with a crazy amount of funding, so that insurance companies and billing codes could be ignored entirely.

[1]: https://goforward.com/p/home


Yes, exactly. Hospital data is atrocious, on every single level. Duplicate data are everywhere across multiple systems. Hospitals move excruciatingly slowly to do anything technical. And very few people seem to ever have any real understanding of what is going on with their systems; they tend to only know the user interface and have to rely on their vendor support (Epic, Cerner, etc.) for anything beyond that.

I work for a company that I was hoping would be such a disruption point for hospitals (at least in some small way), but instead they decided that it's just too difficult to get hospitals to do much of anything, so we effectively knuckle under -- creating numerous integration points, making more and more copies of data.

The only way this will change is if a large enough player creates direct competition in the EHR/EMR business, with these kinds of data-oriented models in mind, all the while creating a system that, top-to-bottom, is better for the user, the administrator, and the technical staff. Current players in this space have very little incentive to make their products better. And it reminds me of a quote from Tron Legacy: "Given the prices we charge to students and schools, what sort of improvements have been made in Flynn... I mean, um, ENCOM OS-12?"..."This year we put a "12" on the box."


It would take billions of dollars and many years to build a new general purpose EHR from scratch. This isn't an industry where a start-up can launch an MVP and get sales to early adopters looking for a competitive advantage. Every major provider organization already has an EHR, and usually an ONC certified one. A new offering has to be at least as good as existing products in every way in order to get any sales. It has to be a 100% solution out of the gate.

Outside of general purpose comprehensive EHRs there is still opportunity for new niche market offerings targeting limited medical specialties or types of facilities. For example if you wanted to target, let's say, just dialysis centers or just psychiatrists then the hurdle would be much lower.


There are several different open-source EHRs... OpenEMR and OpenVista come to mind. You wouldn't have to build one from scratch.


Sure but those existing open source EHRs aren't any better for users than the commercial alternatives. In fact they're mostly worse. OpenEMR is only suitable for small organizations delivering outpatient care, not for large hospital chains. OpenVista has a huge amount of technical debt and isn't getting much new adoption. Neither product has the kind of underlying data model that @ulkesh suggested above. So I think anyone looking to build a truly disruptive product would have to start almost from scratch.


A bit of inspired legislation could solve all of these problems.


Sure. That's where the first few billions of dollars and several years are going to have to go.


And would you trust anyone willing and able to go through that process? The lobbying and political game is filled with sociopaths that would sell out the public for a hot lunch.


Not sure if you are being sarcastic, but that's precisely what the ACA did. It created giant incentives for hospitals to modernize their medical record systems, which many did.


The parent's comment isn't about hospitals sending more money to Epic.


Part of the challenge might be the limited number of individuals with expertise in both medicine and technology. Each of those fields is quite specialized and requires time to build an understanding of the subject matter.


"This title is somewhat click bait though, because the fault is really with EMR systems".

The title is entirely accurate: it specifies "on electronic medical records", and the article effectively demonstrates the thesis stated in the title. And yes, you're right that the problem is in the data. No matter, it's still true.

Everything with an even mildly provocative title is accused of being "clickbait" on HN. I wish people would reserve that criticism for actively deceptive titles, or titles that promise something that is not delivered.


> Plus, the author gives some good examples like pulse ox%, doctors and nurses are not at all concerned with or trained to record data in a way that makes sense to use programmatically. They're typically thinking only as if they're recording it for another human to read.

The unstructured doctor's notes may be where the best signal lies. They should be mostly uncorrupted by the billing/process-related trash. Deep learning does not need data to be formatted to be read "programmatically"; in fact, it shines most on data that isn't.


Deep learning requires very well-curated and balanced datasets. At best, if your test data is also poorly curated and similarly unbalanced, that might hide from you the fact that it isn't working.


> The unstructured doctor's notes may be where the best signal lies.

Or where the raw data is corrupted ...


Doctor's notes: removed 2 polyps

Billing: unremarkable endoscopy

Which is true? (Based on a true story)


They're both true of course. Polyps get removed all the time, it is unremarkable in a medical sense, and probably not enough to even warrant its own billing code, otherwise it certainly would have been listed.


No, this is not true at all. Procedural documentation in the US is exquisitely detailed, both for medical-legal reasons and for billing. Any polyp in the colon will be grossly described, along with its approximate position and whether any intervention or send-out testing is being performed.

And as an internist - no colon polyp is medically unremarkable.


Ah yeah, that's a good point. In either case, the true story ended with no polyps and it was a transcription error.


>>This title is somewhat click bait though, because the fault is really with EMR systems and (esp) the American Healthcare system, not deep learning.

Right

Data-mush is the cause

That's my official term for it. Data that looks fantastic in format, but the slightest peek under the covers reveals that every data-entry person is either freelancing and entering whatever is easiest/sorta-makes-sense for them, or under pressure to skew things, or the definitions of each code (and the cases where it does/doesn't apply) are ill-defined, etc. etc. etc. And on top of this, we have the medical system's insurer-provider adversarial relationship covered so well in the article.

The result is a toxic brew of definition drift, unintentional errors, and intentional errors. It is not just the fringes and sub-one-percent of the edge cases, it is rotten to the core.

Basically the entire data set is a complete lie.

You may have the most perfect AI, but it literally has no chance against that toxic data swamp.

Once again, the only winning move is to not play.


Wouldn't health systems like Kaiser Permanente that are fully integrated (both insurer and provider) have the right incentives to disrupt their own processes in favor of something better?


What sort of incentive were you thinking of? KP is one of the largest Epic customers. They don't have claims as such, but still have similar internal processes to prevent waste and determine how much to charge their customers.


I was thinking about incentives around cost of care.

KP could switch from Epic to an alternative if it enables deep learning that reduces costs in the long run. Reducing costs is aligned with KP’s incentives because they are not only the provider but also the payer for medical care. This is different from most medical providers which OP suggests have an incentive to be “cost maximizers” to increase reimbursements from payers.


> KP is one of the largest Epic customers. They don't have claims as such

Yes, they do (whether you are talking about KP-as-provider-network submitting claims to non-KP insurers, or KP-as-insurer adjudicating claims from non-KP providers.)


I was addressing the comment from @divbzero about integrated systems. Out of network claims are a separate issue.


Having worked with otherwise decent Swedish public healthcare systems, I can't tell you if it's as bad, just that it is absolutely awful. I would never think to try and use anything they provided as the basis for a dataset.


> doctors and nurses are not at all concerned with or trained to record data in a way that makes sense to use programmatically. They're typically thinking only as if they're recording it for another human to read.

The flip side of this is that data entered for programmatic reading often isn't very useful for future humans to review. Automatically-generated doctor's notes obscure information in a way that "HEART ATTACK" circled in red ink on a paper chart does not.


Implementation guides built on HL7 FHIR are well structured open protocols for interoperability. Several of the largest EHR vendors actively participate in defining those standards, and have built them into products.

Of course that doesn't solve the garbage in / garbage out problem. If users configure the software incorrectly or don't enter the right data, then you don't have much to work with.

In some cases NLP technology can be used to convert unstructured chart notes into coded clinical concepts. That can work well enough for research and analytics, but isn't accurate enough to use for direct patient care without human review.
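As a toy illustration of what "chart notes into coded clinical concepts" means (the phrase list below is invented for the example; real systems use trained NLP models and licensed terminologies, and the codes should be verified before any real use):

    import re

    # Hypothetical phrase-to-concept lookup, just to show the shape of the task.
    PHRASE_TO_CONCEPT = {
        r"\bshortness of breath\b": ("267036007", "SNOMED CT: Dyspnea"),
        r"\btype 2 diabetes\b":     ("44054006",  "SNOMED CT: Diabetes mellitus type 2"),
        r"\bfatty liver\b":         ("197321007", "SNOMED CT: Steatosis of liver"),
    }

    def extract_concepts(note: str):
        """Return (code, label) pairs for phrases found in a free-text chart note."""
        return [concept for pattern, concept in PHRASE_TO_CONCEPT.items()
                if re.search(pattern, note, flags=re.IGNORECASE)]

    note = "Pt reports shortness of breath. Hx of type 2 diabetes, diet controlled."
    print(extract_concepts(note))
    # [('267036007', 'SNOMED CT: Dyspnea'), ('44054006', 'SNOMED CT: Diabetes mellitus type 2')]

Real pipelines also have to handle negation ("denies shortness of breath"), abbreviations, and specialty-specific phrasing, which is exactly why the human review step is still needed for patient care.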


HL7. Well-structured. Pick one. The HL7 spec might be a standard, with a specification, but it is not a thing for well-structured data.

For example, FHIR allows you to use a standardised shorthand, with a formal grammar. [0] That grammar is well defined... It also happens not to be a regular grammar, and it is entirely possible to construct something that will be both valid and undecidable, a la Perl. Infinite parsing and expansion on finite data.

[0] https://hl7.org/fhir/uv/shorthand/STU1/reference.html#append...


Yes, think of HL7 like XML or JSON. It's a grammar/syntax; it does provide some domain-specific conceptual mapping, but you can code the same thing lots of different ways. Emitting a well-formed HL7 message doesn't mean that anybody can actually ingest it.

There's also a fair amount of instance-specific coding and going to/from that is always a pain.


I wouldn't think of HL7 like XML or JSON, it's higher level than that. HL7 is a standards development organization which publishes multiple separate major standards. Those standards define data models which can be encoded in various different wire formats. V2 Messaging can be encoded as ER7 or (rarely) XML. CDA can only be encoded as XML. FHIR can be encoded as JSON or XML.

If you want recipients to be able to ingest your messages or documents then it's crucial for the sender and recipient to mutually agree to conform to a specific versioned Implementation Guide. If both parties actually follow the IG then it eliminates the need for instance specific coding, but of course some organizations still have quality problems.
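To make the "same data model, different wire formats" point concrete, here is roughly how a single pulse-ox reading might look as a V2 OBX segment versus a FHIR Observation (heavily simplified; real messages carry much more context, and the exact fields depend on the IG in play):

    # Roughly the same fact in two HL7 encodings (simplified sketch).

    # HL7 V2 Messaging, ER7 encoding: one pipe-delimited OBX segment.
    obx_segment = "OBX|1|NM|59408-5^Oxygen saturation^LN||97|%|94-100||||F"

    # FHIR R4 Observation, JSON encoding (shown here as a Python dict).
    fhir_observation = {
        "resourceType": "Observation",
        "status": "final",
        "code": {"coding": [{
            "system": "http://loinc.org",
            "code": "59408-5",
            "display": "Oxygen saturation in Arterial blood by Pulse oximetry",
        }]},
        "valueQuantity": {"value": 97, "unit": "%",
                          "system": "http://unitsofmeasure.org", "code": "%"},
    }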


That isn't an accurate description of HL7 standards. Everyone in the industry understands that the baseline standards (V2 Messaging, CDA, FHIR) are just construction kits. In order to achieve real world interoperability you need an Implementation Guide which profiles and constrains the baseline standard to clearly document the exact interactions and data structures (required data elements, cardinality, code system bindings, extensions). There are dozens of published IGs so find one that fits your use case, or get involved with an HL7 Work Group or FHIR Accelerator and help write a new IG. Anyone can participate and as ANSI accredited organization the process is completely open.

I don't understand your concern with the possibility of a FHIR Shorthand expression being undecidable. This just isn't a problem in any real world system. But if you know a better approach then formal proposals to improve the standard are always welcome and will be seriously considered.


> I don't understand your concern with the possibility of a FHIR Shorthand expression being undecidable. This just isn't a problem in any real world system.

My discovery of the format being undecidable came with the experience of building a system designed to ingest and interop data for an entire state (Victoria) during the construction of Australia's MyHealthRecord [0], which I would classify as a "real world system".

It was just one of the many edge cases I found with the standards, and led me to believe that no, it isn't possible to normalize the data that these standards are supposed to encode. And no, there is no way in hell that you can get many thousands of practitioners all to follow a single IG.

It's such a lovely set of standards that MHR developed the HIPS Middleware Standard [1] to sit between all HL7 documents and the developer interfaces of MHR.

[0] https://www.myhealthrecord.gov.au/

[1] https://developer.digitalhealth.gov.au/specifications/implem...


Just because it's possible in theory to construct an undecidable FHIR Shorthand expression doesn't mean that's a real problem in practice. Could you give us some specific details about how that issue impacted MyHealthRecord? Have you created a Jira issue and brought it up to the FHIR Infrastructure Work Group, or registered a negative on the latest IG ballot? I'm sure they'd like to know and would try to find a solution.

http://hl7.org/fhir/uv/shorthand/history.html

http://www.hl7.org/Special/committees/fiwg/index.cfm

A few edge cases are really not a serious concern. There are many organizations using these standards. In fact in the USA we're getting thousands of practitioners to follow a single IG right now because they're mandated to by federal regulation. Of course there will be some technical challenges and defects but those will be gradually resolved over time. And in practice there won't be thousands of independent implementations; instead a few vendors will implement the specifications and then the provider organizations will use that software.

HIPS appears to be a tool for doing patient index matching, and for converting between legacy HL7 V2 Messaging and CDA formats. That's useful functionality, but doesn't indicate any sort of problem with FHIR.

Instead of complaining about the standards I would encourage you to get actively involved in improving them. The standards development process is totally open and transparent, and the work groups welcome new members who want to contribute in a positive way.


Yeah... This kind of overly defensive and dismissive response is precisely the reason that HIPS exists, and why MHR gave up on speaking with the standards bodies. The complaints by a federal project are a serious concern, and so are edge cases that they hit, because HL7 seems to have more edges than a razor blade.

MHR is five years old. These problems do not get resolved over time. An undecidable FHIR is a problem in practice, because that's where it was discovered to be a problem. It's not a defect that gets resolved because a few vendors implement software that gets widely used - it's a defect because it'll get resolved differently between those vendors. A data format should be a regular language.

HIPS exists so that the MHR devs don't have to see a single HL7 document of any kind - because the formats are awful, in every possible way. I'm sorry, but I have nothing good to say about working with HL7, both the standards and the working groups.


Yup. After a few years of trying to follow the rules, I just decided to use data scraping techniques.


I designed, implemented, and supported 5 regional exchanges. Interop is always case-by-case.

Protocols like HL7 (2.x, 3.x, FHIR) give you shared syntax (lexing) and about 98% agreement on structure and semantics (parsing). Then you'll spend most of your time on the last 2%: excruciating details. Data quality on field values, matching units, mapping one taxonomy to another (and back).
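A minimal sketch of what that last 2% looks like in code (the local codes and conversion factors here are made up for the example; in practice these tables differ at every facility, and maintaining them is where the time goes):

    # Hypothetical local-code and unit mappings for normalizing lab results.
    LOCAL_TO_LOINC = {
        "GLU_SER": "2345-7",   # serum/plasma glucose
        "HGB":     "718-7",    # hemoglobin
    }
    UNIT_CONVERSIONS = {
        ("mmol/L", "mg/dL", "2345-7"): lambda v: v * 18.016,  # glucose, by molar mass
    }

    def normalize(local_code, value, unit, target_unit):
        loinc = LOCAL_TO_LOINC.get(local_code)
        if loinc is None:
            raise ValueError(f"no mapping for local code {local_code!r}")
        if unit != target_unit:
            convert = UNIT_CONVERSIONS.get((unit, target_unit, loinc))
            if convert is None:
                raise ValueError(f"no conversion {unit} -> {target_unit} for {loinc}")
            value = convert(value)
        return loinc, round(value, 1), target_unit

    print(normalize("GLU_SER", 5.5, "mmol/L", "mg/dL"))  # ('2345-7', 99.1, 'mg/dL')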

> If users configure the software incorrectly...

Based on my experience, incompatibility is a given. Regardless of the cause. Misconfiguration, mismatched versions, ignorance, artistic license, etc.

Plan for the worst, hope for the best.


So does that mean it might be of use to non-American healthcare systems, which have less billing to deal with?


The first point he raises is the most critical by far. The silverbacks of the industry deliberately stymie efforts for true interoperability because it goes directly against their primary goal, which is forcing everyone into their platform. Epic in particular has zero intention of allowing anyone else to take their market share by enabling easy sharing of data across platforms. It's far better for them from a business perspective to make interfacing so unreasonably difficult that you are forced to implement their full suite of applications, at which point they hold your organization's data hostage to induce other orgs to do the same. The larger their ecosystem grows, the less they need to worry about interoperability - improving patients' outcomes is not even an afterthought. Their vision of population health reporting is one in which every major healthcare org has been trapped inside their walled garden.


Some free advice from an ex-Epic: This is true when it's other vendors doing the data fetching. When it's a health system customer of Epic, they bend over backwards to help them extract the data properly and build cool clinical tools on top of the Epic platform. Health systems with big innovation arms like Atrium and Providence could be a good place to seek VC if your product idea relies on deep EMR access. Sometimes the left hand doesn't talk to the right in these health systems though - you'll need to get that innovation arm talking to the EMR analysts. Use the shibboleth "I want to talk to our Epic TS" for whatever speciality you work on.

As for things like the App Orchard and Epic on FHIR https://fhir.epic.com/ : Epic is smart enough to realize that their future lies as the platform of the health system IT stack, in the Ben Thompson sense of a platform / aggregator.

The hospitals are scared of open access, and Epic always does what's in the best interest of their customers, so they push against open access.


Epic on FHIR is missing a ton of data that is present in Epic, for example if I want to react to a dispensing event, that has to be custom built as far as I know. Even then, you're long polling an endpoint hoping you find work you need to react to.

At times it feels like it would be better to just get data straight from the database, as custom Epic implementation times are insanely long and costly.


I don't know about dispensing events specifically, but Epic supports CDS Hooks to allow for triggering custom code in response to certain events.

Getting data straight from an EHR's underlying database is risky. The vendors generally don't document or support this, and the schema could change at any time.


Well, you don't end up having to query their DB directly, you can do it via standardized IHE ITI XDS/XCA over a network like CareQuality. There are startups like Particle that can do it as a SaaS.


Have you used Particle? I find the concept really interesting, and it could help me in the future. Would love an end user opinion.


Full disclosure, I was an early employee and wrote the first integrations, so I'm likely not the right person to ask. https://www.particlehealth.com/ should have details.


Thanks for sharing. Can you please suggest some resources for further reading? Is particle.io the startup you mentioned?


https://www.particlehealth.com/ -- full disclosure, I was an early employee


Ehh, Epic cares about Epic. They lose their gatekeeping ability once TEFCA rolls out with teeth.

The P in HIPAA stands for portability, of course.


Yeah, ex-Epic here as well. Epic's data surfaces are byzantine almost by design, as a form of security or safety. It can only be construed as gatekeeping in that Epic has never prioritized making it better for non-customers.


Epic controls 55% of the EMR market, and that number is only growing. It won't be long until this isn't a problem because the majority of the population has all their records with Epic.


Yes, EHR vendors are really opponents/gatekeepers here. They don't benefit from you getting your data out of the EHR and in my experience they weren't really open to it.

Interoperability is really driven by either state/federal reporting requirements, or billing. And there is no incentive for EHR vendors or their client hospital systems to go beyond the exact minimum to get paid.


Epic has an extensive set of interoperability APIs, mostly based on open industry standards, which enable easy sharing of data across platforms. Is there something specific missing?

https://open.epic.com/


Pardon my bluntness but this is a bullshit brochure website.

I've worked with non-health system healthcare companies that have tried to work with Epic on interoperability. They are outright hostile to anyone being able to access any data in their systems unless it's the hospital customer (and even then they're not exactly helpful).


Could you be more specific about what part of the website is bullshit? I've used some of those APIs and they worked as documented.

Vendor hostility is only an issue if you actually need to work directly with the vendor. Most of them have no incentive to help other developers who aren't their customers. Why would they? If you're doing something new that requires active Epic cooperation then it's best to partner with a major Epic customer, and put a formal legal agreement in place.


> Life would be simpler if only these hospitals could set aside their arrogance and just go with the recommended workflow!

This would be like asking programmers to standardize on the recommended programming language.

We would love to just use the recommended workflow, if it worked for our hospital. There are differences in the patients, doctors, local regulations, existing systems, etc. between hospitals.

Patients: Top cancer hospital does a lot of clinical trials, so some of the forms require you to fill out clinical trial information for every patient. In a maternity ward, it would not be appropriate to ask about clinical trials for every patient.

Doctors: Hospitals are staffed differently. If the hospital has residents, some of the work can be delegated to residents. If not, someone else has to do it. The workflow needs to account for who is actually available to do the work.

Local regulations: Medicine is highly regulated, and each state and hospital has its own rules.

Existing systems: Hospital computer systems have been around for decades, and usually it's not possible to migrate everything to a new system, so the new system needs to integrate with the old systems that couldn't be upgraded.


I think the rest of the paragraph clarifies they are joking.


I happen to know a lot of doctors, including, as an example, an OBGYN. As it was explained to me, for vaginal births:

- at some point someone, without evidence, speculated that cervix dilation should proceed along some curve

- cervix dilation is actually measured by hand - literally inserting fingers and having the doctor practice "so many fingers = so many centimeters". There's plastic sheets with holes in them so they can practice measuring the size of holes with fingers.

- the OBGYN knows what the cervix dilation curve should look like, and kinda sorta maps their hand readings to what it should look like

- the OBGYN has a general sense as to how labor is going, and will game the cervix dilation stats to match their expectation, e.g. if labor is going well but the cervix hasn't dilated then they'll kinda sorta report progress anyway

Anyway, given the above it seems like the data around cervix dilation is suspect - the measurements are fitted to what the curve should look like, and then the data matches the curve, and that makes people more confident in the curve, and so on.

The point is, can you really apply ML to the EMR of cervix dilation? Does it make sense, could you really draw conclusions from this?


>> The point is, can you really apply ML to the EMR of cervix dilation?

Yes, it's a perfect fit. If I may.

Sorry, to clarify, the way most people do machine learning is what you describe: tweak a model until it fits the dataset. If the ability of the model to fit the dataset translates to anything beyond that, it's anybody's guess.

You just put me off being a parent for life, btw.


Just to clarify, this OB will report incorrect clinical data to support how they feel labor is going?


In cases of subjective data, you will always see variances in reporting that may be construed as misrepresentation. It's not at all easy. As "objective" as the example might seem, I'd argue the clinician has too much leeway to truly be objective. There's a whole ton of subjective data involved in every patient course of care, much more than objective in many cases.


In some situations yes, in others, if they read a 5.5 or a 6, then they will pick the one that fits.


>EMR software is widely hated by the nurses and doctors who have to use it. It’s slow, bloated, nonintuitive, requires workarounds, etc. etc. etc.. The root of this evil is that every hospital brings its own conceited and byzantine patchwork of procedures, checks, and rituals to the table.

You just described the admin software for every industry I've worked in.

The problem is individual orgs dictating software architecture. When each purchase is in the millions, you accommodate every whim no matter how absurd... and then you end up with these bloated, messy systems.

For software systems to REALLY, shockingly improve efficiency in an organization, all the processes in the org need to change to accommodate a new overarching system design. Tailoring software to mirror legacy processes defeats the purpose almost entirely.

I think there is a truly absurd competitive advantage in doing this right but you seldom see enough leverage to completely overhaul every department in order to implement software admin systems.


People don't hate Jira because of Jira, they hate the experience of using Jira with the rules that their organization has put into Jira. They hate the culture of how their org uses Jira.


There's a lot of truth to that, but Jira (or more accurately Atlassian) does things that are worthy of hate too.

- Some information is displayed in unexpected places

- The application can be slow to load

- Customizations are confusing (Do I need to edit the screen, the screen scheme, or the issue type screen scheme?)

- Confusing options (Field configurations have both a "Configure" action and an "Edit" action.)

- Basic features are often left out and require plugins to bridge the gap.

And much more!


Fair. I still think MOST of the hate is due to configuration and culture.


The article is simply wrong.

I know this because I worked as an ML engineer at an extremely successful company that automated medical coding using deep learning.

The confusion stems from conflating a "perfect solution" with a "human augmented" one.

90% of coding cases are trivial, have low value and can be done by a model. 10% are really subtle and need human expertise.

That's fine. You can make a billion dollar company on low hanging fruit. I think it's best not to conflate the perfect solution with a very good solution.


You've not refuted the article so much as pointed out a corner case the author didn't address in which ML is a good fit. Your example, using ML to perform the medical coding function, is using a data source (in this case the EMR) for one of the purposes for which it was explicitly designed and for which it is (arguably) non-deficient. That is a realm not doomed to failure.

The realm doomed to failure is using a data source for a completely oblique purpose for which it is horribly distorted. Namely, the purpose of optimizing individual and public health by discovering guidelines and treatments, diagnosing illness, and delivering optimal care.

(Of course medical billing as an enterprise shouldn't even exist, but that is another topic.)


Thanks for the nuance. I completely agree with how you've framed the situation.


Medical coding is just billing right? You match doctor notes to ICD-10 codes. Seems reasonable.


Medical coding is mainly billing with ICD-10 codes for diagnoses and CPT + HCPCS codes for procedures. However, there is also non-billing clinical coding for things like LOINC, SNOMED CT, and RxNorm.


What was the criterion that you were optimizing? Currently hospitals try to assign codes in a way that maximizes payouts from insurance companies while avoiding straight up lying in a way that could cause them problems. So they'll handle that 10% by choosing the codes with the bigger payout.


I agree with the conclusion. This is totally unsurprising to me as a ML engineer. If you put garbage data into the model, you get garbage predictions. That doesn’t strike me as particularly novel. The same is true for cooking, after all.

However- this has been truly shocking to all of the non-technical stakeholders I’ve worked with. They take the stance that any large amount of data can be used to do ML on, presumably because they don’t know too much about what doing ML is like.

So I’m convinced the author is right, and I’m also convinced that there will be many attempts to use ML on EMRs.


There is the old quote that we've all seen: "On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question." - Charles Babbage

I will say, I have some good news for the late great Charles Babbage. For the most part, people now do indeed understand that if you put small amounts of wrong figures into a machine, the wrong answers will come out. If nothing else, pocket calculators and math class have given them the direct experience of this.

However, it seems that people still expect that if you put gigabytes or petabytes worth of wrong figures into a machine that somehow the right answer will pop out.

Ah well. The road never ends, you know.


> However, it seems that people still expect that if you put gigabytes or petabytes worth of wrong figures into a machine that somehow the right answer will pop out.

The interesting fact is that if you put in lots of correct figures, and only one, slightly wrong figure, then the answer may be correct to an acceptable degree.

For example, take 100 values, all exactly one, and find the mean. You'll find 1.

If you take 99 values of exactly one and one value of two, the mean of your sample will be 1.01, which is close enough to still be useful. In some interpretations, it may even be rounded to 1, meaning that in some circumstances, incorrect figures can indeed sometimes lead to correct answers.

Or if you're trying to find out what adding 1 and 4 gives you, but accidentally you add 2 and 3, you get the correct answer despite incorrect inputs.
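Spelled out as code, just to make the two cases explicit:

    # One wrong value among many correct ones mostly washes out...
    values = [1.0] * 99 + [2.0]
    print(sum(values) / len(values))           # 1.01

    # ...but a systematic error does not, no matter how much data you add.
    biased = [1.0 + 0.5] * 100
    print(sum(biased) / len(biased))           # 1.5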

I think people are assuming that if you put gigabytes or petabytes worth of data into a machine, the number of 'wrong figures' will be lost as noise.


the problem is that for medical coding, this translates to "a small number of procedures will be coded wrong", and that's not a meaningfully better situation than "a small number of procedures can't be coded", and in fact is probably worse. So you need a reasonably high confidence threshold, and really in most cases you probably want to have a human manually review the problem (or questionable) cases.


Garbage in -> Garbage out is basically a Newtonian law at this point haha


That's not entirely true. Neural networks are fairly robust to noisy training data (a.k.a. garbage).[0] Well, stochastic gradient descent has the noise in its name. More training data can compensate for noisy data to some extent.[1] I'm not sure if model size can also compensate for noisy data, but I would not be surprised if it did.

[0] https://arxiv.org/abs/1705.10694

[1] https://arxiv.org/abs/2202.01994
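A quick synthetic check of the "more data can offset random label noise" claim (a toy sketch with a linear model, not a reproduction of the cited papers):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    def make_data(n, flip_rate):
        X = rng.normal(size=(n, 10))
        y = (X[:, 0] + X[:, 1] > 0).astype(int)      # true decision rule
        flips = rng.random(n) < flip_rate
        return X, np.where(flips, 1 - y, y)          # corrupt a fraction of the labels

    X_test, y_test = make_data(5_000, flip_rate=0.0)   # clean test set

    for n in (200, 2_000, 20_000):
        X_train, y_train = make_data(n, flip_rate=0.3)  # 30% of labels flipped
        acc = LogisticRegression().fit(X_train, y_train).score(X_test, y_test)
        print(f"n={n:>6}  test accuracy={acc:.3f}")

Accuracy recovers as n grows, but only because the noise here is random; systematic errors of the kind the article describes don't average away with more data.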


There are very specific conditions for this to hold, mostly that the incorrect sample is surrounded by correct ones, and that the model is small enough or the error vanishingly rare. Notably, the reference you gave also shows horrendous generalization performance, so it's really just showing how easy it is to overparametrize. Input errors can be accounted for to some extent, but also under specific circumstances, e.g.:

https://openaccess.thecvf.com/content_CVPR_2020/html/Eldesok...


I mean the argument there is basically if there is enough good quality data, the bad data is somewhat (or mostly) compensated for. To which I would argue that is no longer a “garbage in garbage out” situation as most people use it.


While a lot of the post is good info, there are some upsides, if we can get the EHR situation worked out.

Many years ago, prior to anything like ML, Canada figured out that cystic fibrosis patients whose weight is higher than the 50th percentile had significantly better lung function. Nobody really understood why, but the correlation was so strong (.85 or something like that) it could not be ignored. Treatment protocols for CF changed to encourage weight gain, and lifespan outcomes have steadily improved over the years.

What other oddball correlations are hiding in the depths of bloodwork, weight/height, etc. for patients? We've teased out all the easy ones, the ones that are left are combinations nobody thought to even measure.

Regarding the EHR debacle, I'm optimistic that something could be worked out as a standard and implemented across the board. Expensive? Sure, but it's an investment that pays off pretty quickly, I think.


We're never going to get every provider organization using the same EHR, nor would that even be desirable. But almost all of them have passed ONC Health IT certification so they have similar functionality, including exposing at least some data in open industry standard formats.


Better title, "ML on EMRs is very difficult". The article is much more reasonable than the clickbait title:

"I don’t think deep learning models on EMRs are going to be useful any time soon .. Clinical expertise is absolutely necessary to ask the right questions, to set up the inputs to the model, and to sift through the findings. Following that, clinical research will be necessary to validate the discovery"


I previously worked for a startup that did denormalization of healthcare data for the ostensible purpose of data "freedom" and interoperability, with a future focus on ML funsies. All the issues we had (besides ones we created) were around the healthcare providers fear of pissing off EPIC, Cerner, Meditech, Allscripts etc. They didn't like their EMRs---in fact they often hated them. The fact is though that there really is no viable alternative, and the data is kept behind a gate and essentially "owned" by the EMR. FHIR was supposed to solve the interoperability bit, but all the EMRs would still own the data, and their lawyers aren't keen to share.


Leaving aside the well-known dumpster fire of the healthcare system's operational incompetence, none of these are particularly novel, most of them are the author discovering basic best practices in statistics, and they don't come close to implying "DL for EMR is doomed".

A health outcome is correlated with age? Ya don't say... This is only a problem that literally every single economist and sociologist checks for _first_. Agents in causal graphs respond to their inputs? Shocker, that's only... the definition of an agent.

There are a ton of applications out there where a fairly naive model built by someone who took a Pytorch Coursera course can provide decent improvements. Healthcare is not one of them, and nobody serious ever thought it was. Bringing modern learning tools into healthcare is going to require a lot of smart people who know what they're doing, working cautiously to introduce these improvements into an operational quagmire with high stakes.

But this article reads like: "I tried making healthcare 'smart' in a Jupyter notebook over a weekend and it didn't work: the effort is doomed".


I've worked in healthcare IT for my entire professional career - it's A LOT more complicated than most people think. For the last 5 years I've focused on the data side of healthcare, and I think that deep learning is 100% possible - it's just not achievable by a single person and it's likely VERY expensive. There are so many facets to healthcare data that it's just impossible for a single individual to achieve something meaningful by themselves without the help of teams of doctors, data analysts, data engineers and data scientists. Just dealing with data quality issues (such as the ones called out in this essay) requires a team of people to determine if the metrics you are trying to measure are legit or not.

On billing: I'm convinced that the primary reason why healthcare (at least in the US) is so complex is the dichotomy of saving people at all costs while doing so fiscally responsibly. It is fairly common for large healthcare organizations to have ACTING doctors in their c-suite, whose primary goal is not to make money - it's to save lives. The people who care about saving money, reducing cost and increasing efficiency have no control over the organization. I'm not saying this is a bad thing, but IMO it's the largest contributing factor as to why healthcare billing is so complex, and healthcare costs get as high as they do (at least in the US).


https://en.wikipedia.org/wiki/Carte_Vitale

France's system is universal and private (private docs, insurance, and reference pricing -- a service has a single fee across coherently regulated payers). They have had a standardized medical record for 20+ years. One system and one medical record for everyone. WHO ranked it the #1 healthcare system in the world (vs. the US at ~40th), at 1/2 the cost per capita of US healthcare. This is catastrophic legislative failure for a problem largely solved by lots of other people around the world.

The legislative failure has created vast administrative overhead (10, or more, staff per doc at a hospital) and corrupt insurance companies. When an insurance company has to pay a claim, they call it a "medical loss" (they had the money and they lost it). They make their money on poor service and deceit -- charging wildly different prices for the same product where they can get away with it. In France, an insurance company, by law, has to pay a claim to a practice in a few days. Imagine the decreased capital needs for running a medical practice or a hospital.

The hospitals are not blameless in all this, but the heart of it is the payer system/s.


There's actually been a pretty good report by the President's Council of Advisors on Science and Technology on this subject a while back:

https://www.healthit.gov/sites/default/files/pcast-health-it...

It did not place the blame on insurance companies. Which I think is correct. Insurance companies affect billing, but they do not drive health care records management. That's driven by an entirely different set of companies such as Epic and Meditech.

And the poor state of EHR in the USA is not entirely their fault, either. A lot of what went down is that universal standardization was not a major part of the requirements when the US government started requiring health IT. So health care providers decided to favor preserving their existing policies and procedures, which meant choosing extremely configurable EHR systems that allowed them to computerize their old paper-based systems with minimal modification. Since easy computer interoperability was never a design goal of those paper-based systems, it was not inherited by the EHR systems they grew into, either.

This creates a situation where even two Epic customers, despite being on the same EHR platform, are still hard-pressed to directly transfer records between their systems, and are as likely as not to still just fax or email each other printouts because it's easier.

And standardizing and perfecting insurance and billing would do nothing to fix this situation. The two systems are likely to be connected, but they are not the same thing.


> The legislative failure has created vast administrative overhead (10, or more, staff per doc at a hospital) and corrupt insurance companies.

Other way round: the corrupt insurance companies have created legislative failure, through entirely legal bribery and entirely legal lying to legislators and the public.

Mind you, the public are hardly blameless in this, given that detailed careful policy wonkery is boring and yelling about grand conspiracies is far more exciting, many have chosen to indulge in the latter.


> They make their money on poor service and deceit -- charging wildly different prices for the same product where they can get away with it).

You have that backward. Hospitals and doctors charge wildly different prices for the same product when they can get away with it. That's why you can... wink, wink, nudge, nudge... pay a lower price in cash. Because the hospital knows a cash price means that you are paying the bill directly, not a rich insurance company. In economic terms, hospitals have economic power to set prices.

The elephant in the room is that medical doctors in the United States earn between about 2x and 4x (depending on specialty) more than medical doctors in France or England. Yes, two to four times as much. Where, exactly, do you suppose that extra income comes from? And that's when they're honest--medical claims fraud is a major issue.

You know those "this is not a bill" explanations of benefits you receive when you go to the doctor? Well... it turns out that a fantastic lead for finding medical claims fraud is when a member phones their insurance company and asks "Hey, I don't remember this... why does it say ____?"

Removing insurance companies from the equation will not radically change the landscape. The ACA set a lower bound on the medical loss ratio at 80%. That is, insurance companies must pay out in claims at least 80% of what they charge in premiums. Yes, some of that 20% is profit, but some is also intrinsic overhead for storing medical records, validating claims, etc. Do you think AWS lets France store medical records for free?

So France pays 50% as much as the US. Let's say we take out insurance companies and pretend that you can administer claims with 0 overhead in fairy-tale land. Now USians pay 80% as much.

Where did the other 30% go?

At worst, insurance companies are a second-order effect. The first order effects are that we pay too much for doctors (because the AMA is a cartel that limits access to medical school) and too much for drugs. Just look at how forcefully doctors groups have pushed back at state attempts to expand the scope of work for PAs and NPs (whose licensing is not under the thumb of doctors, and thus can't be restricted in the same way).


Doctors have to be paid a shitload in the US because they come out of college with an average of $250k in student loan debt (for GPs, specialists are higher). Those other countries that pay doctors a half or a quarter of US salaries will generally be somewhere between "very small costs", "no costs", or "actually paying students to attend", leaning towards the latter two, so doctors don't graduate with a home-mortgage worth of debt.

However, a lot of people in the US are morally opposed to "giving someone else a free ride", you can see the current hubbub around student debt cancellation. And doctors heavily fall towards the higher-income side of the scale, so they are precisely the kinds of people that everyone points to as being undeserving of student debt relief.

It is what it is, Americans are a selfish (ahem, libertarians would say self-interested) people, but you can't make this cost go away. People may not explicitly state it, but their preferences are obvious, they would rather pay 4x the amount to a private actor than have 1x the cost in taxes, same as the rest of the problems with our health care system. People are more worried about micromanaging what everyone "deserves" than overall cost efficiency, and they vote accordingly.

It's rather sad, in a way, that Americans can't grasp that having a more highly educated, more skilled society benefits everyone in the long run. Those people go on to pay taxes, and educated people will contribute a lot more in taxes over their lifespan than they cost in education, it's a long term financial benefit, but people abhor the idea of someone getting "a free ride", despite it being the most financially vulnerable and unstable portion of people's adult lives. It's just straight-up "I chose to be a plumber, why should I have to pay for some fancy doctor's education!?".


Replying to my own post, but to the OP's point -- we can't do research very well on US healthcare data in aggregate because it is so error-prone and inexact to aggregate it. Data scientists should imagine what we could do with 20 years of carte vitale data... or better yet, data in a regulated medical record format that was designed to be researched in aggregate. That's the second-order failure here.


That is simply false and misinformation. A "loss" is just a technical term in the insurance industry generally. It doesn't mean that a medical insurer had money and then lost it. The Affordable Care Act (Obamacare) imposed a minimum 80% medical loss ratio.

On most policies the "insurance" companies aren't even providing insurance any more. They simply act as third-party administrators for self insured employers. So the insurance companies have no financial incentive to deny claims. In most cases where claim payment is delayed or denied it's because the provider organization failed to follow the rules for claim coding and attachments.


you're talking about singularized/centralized electronic medical records as if it will solve all the ills of the healthcare industry, especially the rampant corruption, but it certainly won't, and certainly isn't the reason for the differences with french healthcare. sunshine can only go so far. the root problem of healthcare, as with many american industries (e.g., education), is the lack of competition and (uncaptured) regulatory function.

and as with many gigantic, multi-faceted problems like this, band-aid solutions like centralizing medical records will do nothing (e.g., banning plastic bags) but distract us from the core hard problem. we need medical insurance, regulatory agencies, emergency services, care providers, medical training, pharma, medical devices, and a whole host of other interlocked industries to be subject to the pressures of a fair competitive environment, and we need to allow them to fail (and even be punished) without that being life/career destroying (a crucial precept of our bankruptcy laws, incidentally).

spiraling costs and flagging care are a result of systemic rot, and it needs systemic solutions.


I read metehack's comment completely the opposite way: that a working records system is part of a functioning medical system and a consequence of a decision to have such a system.

In other words I interpreted the comment as making the point you make.

Having benefited from the French medical system (and having it treat my dad when he was visiting us) it's really quite good. People in the US (where I live) would laugh if they heard the way French people complain about their health care system: it's a dream by comparison. French complaints are like an American complaining that there is "too much room"


i mean, they focused on medical records as the core difference and referenced metrics to support that position.

it's true that functioning medical records could be a (small) part of an overall better and more efficient medical system, but it was positioned as the core solution that would unlock a better system overall. perhaps they had a different positioning in mind, but that wasn't the argument presented.


> They make their money on poor service and deceit -- charging wildly different prices for the same product where they can get away with it

This is the same thing medical providers and drug companies do. It’s price discrimination all the way down.

I often wonder what would happen if medical providers (doctors, drugs, etc.) were REQUIRED to have standard published rates that didn't fluctuate by insurance network or type of insurance plan.


Hospitals would not get paid enough and many would go broke.

Now if you coupled this with a single payer then it could work.


A rational strategy would be for America to fund ML medical research in France.


as someone who worked in this area (not ML on records, but aggregating and ingesting EHR/ELR data) the problem is not just "the standardized record" but getting providers and hospital systems to use the record in standardized ways.

HL7 and FHIR are sort of like XML or CSV - they're just formats that define fields and delimiters. You can still emit HL7 or FHIR that can't really be consumed by anything else, and there's a huge amount of work to getting it "right". One of our perennial projects was a validator tool to help onboard facilities to produce data that was actually compliant with state/CDC systems. Unsurprisingly (that wasn't really my direct fief), basically every single system - from memory, something like 80% of attempts - failed its first couple of times and needed hand-holding to get it right.

https://fhirblog.com/2014/03/28/pictorial-representation-of-...

One of my projects was integrating a clinical recommendation tool with various EHR systems. That project didn't really end up going anywhere, but even just from the sample data being used in the various sandboxes I could tell that it was gonna be a massive slog actually getting it onto client systems, because every single client system was coded differently and there were different "quirks" to the data/etc. Fixing that wasn't really my task, just dealing with it, but the point is that even if you define a system that allows these relationships/etc to be expressed, there's no guarantee that a client system is outputting well-formed, properly normalized/denormalized data. It's rough.

And unfortunately it's 1000% an XKCD "there are 14 standards and systems still can't intercommunicate, we need a 15th standard" effect. There is already so, so much work mapping around between the various editions of ICD, CPT (procedures), and usually there are instance-specific (specific to the hospital system, usually) coding systems underneath that (since in many cases e.g. CPT does not really convey enough information about the exact specific procedure - it's enough for billing but not enough for a radiologist to actually know what scan to perform in a medical sense). And the existing coding systems are already super generic and can express basically anything in multiple ways, which just feeds into the "it's possible to emit valid records that nobody else can really consume" problem.
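The onboarding validator was essentially a large pile of checks like this, one per IG rule (segment and field numbers simplified here; the real rules come from the relevant implementation guide):

    # Sketch of the structural checks a V2 onboarding validator runs.
    REQUIRED_SEGMENTS = ["MSH", "PID", "OBR", "OBX"]

    def validate(message: str):
        errors = []
        segments = {line.split("|")[0]: line for line in message.strip().split("\n")}
        for seg in REQUIRED_SEGMENTS:
            if seg not in segments:
                errors.append(f"missing required segment {seg}")
        obx_fields = segments.get("OBX", "").split("|")
        if len(obx_fields) > 6 and obx_fields[6] == "":
            errors.append("OBX-6 (units) is empty")
        return errors

    msg = "MSH|^~\\&|LAB|HOSP\nPID|1||12345\nOBX|1|NM|718-7^Hemoglobin^LN||13.2||"
    print(validate(msg))  # ['missing required segment OBR', 'OBX-6 (units) is empty']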


Most of the points are naive and confuse predictive power with interpretability (the latter is harder). For example:

> Did you know that a blood oxygen saturation of 0% is highly correlated with healthy outcomes? No, I didn’t get the percentage backwards. A 0% reading is what you get when the nurse looks at you and decides you’re too obviously healthy to bother with putting the pulse oximeter on your finger. The empty field value gets saved as a 0, of course.

Well, statistical methods (no matter if classical, Bayesian, Deep Learning, or anything, as long as it goes beyond linear methods) will perfectly capture the special case of 0% and predict accordingly. These methods are free of our biases and will take consistent approaches (e.g. empty columns, columns that de facto mean something else, common typos, etc).

Sure, interpretability is problematic, and we often need to consider the knowledge of physicians and medical institutions rather than be based on raw data.
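For what it's worth, the "0 means not measured" case can also be made explicit at preprocessing time rather than left for the model to discover; a minimal sketch (column name invented for the example):

    import numpy as np
    import pandas as pd

    vitals = pd.DataFrame({"spo2_pct": [97, 0, 99, 0, 92]})

    # A pulse-ox reading of 0% is not physiological; treat it as "not measured"
    # and keep the missingness itself as a feature.
    vitals["spo2_pct"] = vitals["spo2_pct"].replace(0, np.nan)
    vitals["spo2_missing"] = vitals["spo2_pct"].isna()

    print(vitals)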


Deep learning has great potential in medicine, in particular in radiology and tissue classification. Creating the datasets from scratch will take decades of careful, deliberate, highly costly effort, however, and the current crap hospitals call records is utterly useless. It will truly have to be a bottom-up approach, and in the process systematic studies to actually verify a lot of bullshit medical ideas will also have to be done. Basic questions like how many kinds of tissues there are have very dubious answers which are known to be coarse approximations, and some diseases are specifically deviations from the approximations. It's probably not quite as bad as linguistics, but it's really bad. Once datasets with millions of people followed and tested regularly throughout their lives, for the specific purpose of generating the dataset, are available for training, it will be quite good. Shame we won't live to see it.


These systems are primarily designed to support clinical care. Research is an afterthought. Health care systems will have to decide it's a top-level priority.

There has been modest progress through wider efforts:

- Standard vocabularies, eg LOINC codes for different kinds of lab tests

- Mappings between vocabularies, eg OMOP

- Semantically rich vocabularies, eg OBO's OBI


Where deep learning will prevail in medicine is in predictive medicine. When it finally becomes common to have personal genomic data available during checkups, machine learning will be able to guess/order tests for individuals based on their age, the specific genes they are carrying, the known ages at which diseases onset, etc., and the diseases prevalent in that area. It will also be able to look across the population and detect statistical patterns in the geographic incidence of diseases with environmental causes, which occur outside of the normal expected distribution or in hotspots.

The important thing to remember about AI is you need reliable data to train it as well as reliable data to test it. If you don’t the FDA will not allow you to employ it medically.


That seems doubtful. Outside of a few limited, specific cases like the BRCA genes, that personalized medicine data has mostly turned out to be a disappointment and not actionable. Like if I find out that due to genetics my lifetime risk of rotator cuff injury is 25% compared to 21% for the general population then what am I supposed to do with that information?


It's not a disappointment, as we have not been able to fully try it out yet. Personalized medicine is in its embryonic infancy. Naysayers can say what they will, but the medicine/therapy of the future will likely be completely personalized AI-generated RNA retroviral/dendrasome delivered cocktails that are based on an AI's comprehension of your entire genome and epigenome. I'm talking the next 100 years. One major problem: although beneficial at times, the FDA approval process inhibits the development and shipment of new solutions to medical disorders.


Setting aside the sensationalist headline, the entire premise of the article is flawed. It's a case of not even being wrong. Of course you're going to get spurious results using poor data.

The author's attempt to use structured EMR data is the root cause. We have found that structured data is at best 35% accurate. Sure, it's better than claims, but it does not reach the level of quality necessary to inform clinical decision-making. The reason is that almost everything clinically relevant is captured in freeform text fields--clinical notes. To build proper models from information in EMRs, you have to start by processing the narrative data, which is a hard problem.

Training models to interpret clinical notes requires clinical expertise. Clinicians record facts differently in different locations, there are many ways to say the same thing, and sometimes they skip underlying facts because some other fact implies them. Different specialties record things differently too. You really cannot just throw some data into a notebook and hope it works. Even with clinician input, we still find that high-quality results require ensemble models with multiple techniques; plain NLP doesn't work either.

Take, for example, non-alcoholic steatohepatitis (NASH), the leading cause of liver failure requiring liver transplant. NASH is a complication of non-alcoholic fatty liver disease (NAFLD), in which your liver has unusually large deposits of fat. NAFLD is not coded in structured data. To identify it from unstructured data, you have to extract concepts related to liver cancer, pre-diabetes, alcohol use, liver fibrosis, cirrhosis, jaundice, fatigue, and loss of appetite. To make a long story short, you cannot do these things using structured data or naive NLP approaches. F1 is zero.
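To make the "naive NLP doesn't work" point concrete, here is roughly what the naive version looks like: keyword spotting over note text. The term lists and the note below are invented for illustration.

    import re

    # A deliberately naive concept spotter for NAFLD/NASH-adjacent concepts.
    CONCEPT_TERMS = {
        "hepatic_steatosis": [r"fatty liver", r"hepatic steatosis", r"\bnafld\b"],
        "cirrhosis":         [r"cirrhosis"],
        "alcohol_use":       [r"alcohol", r"\betoh\b"],
        "jaundice":          [r"jaundice", r"icterus"],
    }

    def spot_concepts(note):
        note = note.lower()
        return sorted(
            concept
            for concept, patterns in CONCEPT_TERMS.items()
            if any(re.search(p, note) for p in patterns)
        )

    note = ("55M with hepatic steatosis on ultrasound. Denies EtOH use. "
            "No jaundice. Mother had cirrhosis.")
    print(spot_concepts(note))
    # ['alcohol_use', 'cirrhosis', 'hepatic_steatosis', 'jaundice']
    # All four concepts fire, even though alcohol use is denied, jaundice is
    # absent, and the cirrhosis belongs to the patient's mother. Negation,
    # family history, and assertion status are where the clinical expertise
    # (and the ensemble models) come in.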

So maybe his point, "Data encodes clinical expertise" is worthwhile, but the rest of the article...not so much.

Source: My company, Verantos https://verantos.com , specializes in the generation of high-validity evidence from data we abstract from EHRs using machine techniques.


"This problem is at least as hard as solving NLP" strikes me as supporting the author's claim, not refuting it.


Is part of the solution going to be having ML figure out what data might be missing to make a better conclusion, then explicitly asking the patient (or the person gathering data from the patient) for that missing data or a clarification?


The author is spot on - the current approach of applying ML to flawed and inconsistent data is doomed. However, on the bright side, I think they also highlight a possible path for data scientists to bring value to healthcare. All of the examples of relationships and correlations were spurious, but if you keep digging you will eventually unearth interesting relationships. A minority of these relationships will be useful for improving hospital administration. Examining the relationships might not necessarily improve health outcomes but there is a real chance to make hospitals more efficient using data science.


If there is no meaningful statistical information in medical records, then should we stop keeping them? By hypothesis, a doctor who opens a medical record can't gain any meaningful information from it, for the same reasons listed in the article. I think this is sufficient to demonstrate the article is incorrect: there is significant information in medical records, and therefore it will be possible to train a model that can reproduce missing information to some degree.


Because there is, and the person who wrote the original article has no domain experience. Fixing data sucks and it requires judgement. Public health and medical researchers derive an enormous amount of research benefit from anonymized health records. Medicare publishes a large dataset.


> The answer turns out to be rather mundane. Pap smears are not recommended for women older than 65, and heart failure onset is typically around age 65. The pap smear just turns out to be a really good age bucketing signal to the model.

Pap smears are also a really good sex bucketing signal, and there are lots of diseases that are more prevalent in one sex, so I would expect Pap smears to be correlated and anti-correlated with a lot of other diseases as well.
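This is easy to reproduce on synthetic data: generate records where pap smears stop being charted after 65 and heart failure risk jumps after 65, and "had a pap smear" comes out looking strongly protective even though it carries no causal signal. Everything below is made up for illustration.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n = 50000
    age = rng.integers(30, 90, n)
    female = rng.random(n) < 0.5

    # Guideline-driven charting artifact: pap smears only recorded for women <= 65.
    pap_smear = (female & (age <= 65) & (rng.random(n) < 0.8)).astype(int)

    # Heart failure risk depends only on age -- no real link to pap smears.
    heart_failure = (rng.random(n) < np.where(age > 65, 0.15, 0.02)).astype(int)

    unadjusted = LogisticRegression().fit(pap_smear.reshape(-1, 1), heart_failure)
    print("pap smear coefficient, unadjusted:", unadjusted.coef_[0][0])  # strongly negative

    # Adjust for the age threshold the guideline encodes -- which is exactly
    # the kind of domain knowledge the model can't invent on its own.
    X = np.column_stack([pap_smear, (age > 65).astype(int)])
    adjusted = LogisticRegression().fit(X, heart_failure)
    print("pap smear coefficient, age-adjusted:", adjusted.coef_[0][0])  # ~0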


I'm running into a similar situation in risk management. There is so much domain and institutional knowledge encoded in rules that it's nearly impossible to reason about them in a generic way. Add in operational data that is also rife with quality and coverage issues and it becomes quite difficult.

I think we have a term for this in both areas: garbage in, garbage out.


"Doomed to fail" is too strong IMO.

All the problems the author brings up, while legitimate, are being worked on.

For example, on the interoperability front, TEFCA is making big strides on government-mandated nationwide interoperability: https://www.healthit.gov/topic/interoperability/trusted-exch...

> In January 2022, ONC and the RCE announced the publication of the Trusted Exchange Framework and the Common Agreement (TEFCA). Entities will soon be able to apply and be designated as Qualified Health Information Networks.

Google also has made significant strides on deep learning on EHRs: https://ai.googleblog.com/2018/05/deep-learning-for-electron...


The challenge with ML and DL systems is that it's difficult to know a priori what will and won't work. The math would indicate that there is nothing a suitable DL system cannot learn; in practice, however, certain neural architectures can only learn certain inputs. The cycle time to develop a new system is long, and data is unfortunately scarce. Developing a DL system to solve a problem involves guesswork as to the impact of innovating on any one of dozens of components.

Which is to say, it is and will be difficult to create a business based on applying a novel DL method to a particular problem space. We're seeing a consistent trend that focusing on the other aspects of the problem such as data, tooling, or end to end services tends to be much more successful.


I don't really want to comment on whether or not DL is doomed to fail on the EMR, but coming from an EMR background, I can say he lays out very accurate points. I particularly like how he explains #3 concisely, and it's a point I use to criticize the private healthcare system.

The continual war between hospitals having to opportunistically charge for their services vs. the insurance industry having to take a default stance of deflection creates the massive, meaty layer of coding and billing waste. Thousands upon thousands of jobs exist just for this purpose, and I think any inefficiency in a single-payer system is more than offset by getting rid of that layer and everyone benefits.


I agree with all points with respect to static EMRs including NLP efforts.

The cadence, uniformity, and alignment of event stores to underlying pathways does pose an opportunity (but there's so little research, and I suspect Brian didn't have access to this space). An EMR is a projection/point-in-time snapshot of these events (basically API calls). There are also inherent natural labels, because inferences are evaluated against an actual pathway.

Specifically, I'm speaking about intra-episodic inferences -- longitudinal predictions (over several episodes, or a patient's lifetime) become wildly inaccurate. There's a lot of research demonstrating this no matter which models are used.


We need either A) competent, highly technologically sophisticated government (which seems very far away, obviously) or B) something really similar to take its place.

So much of what government does (aside from the bombings etc.) is really about providing and enforcing a framework for people to work together. And in this era that needs to be a high tech framework.

Actually, it needs to not only be very high tech, but also very cutting edge, decentralized, sufficiently holistic but also flexible enough to evolve.

Which is incredibly hard, and we probably will not get it due to greed, stupidity, and politics, and that may be the actual reason that human civilization is superseded by AI civilization.


There is something I've been developing over the years which I call "Konerding's Empirical Observation #7": every attempt to improve health care with technology in the US will only ever increase the cost while also decreasing the quality of service (on average).

This goes hand in hand with Konerding's 3rd empirical observation, which is that there is a never ending supply of "machine learning geniuses" who are naive about how health care works in the US, and spend some 20 years learning how hard it is to actually do anything actionable with health care data.


I'm an American physician. I've practiced in a couple of specialties and know the insurance billing drill (or should I say game) pretty well.

Medical record systems, as other comments point out, have been constructed for the benefit of administrators; recording clinical data is mainly there to support administrative needs.

In reality manifestations of a given illness vary continuously across a wide spectrum. It means patients with the same diagnosis have differing sets of symptoms and course of illness. As I like to say it, "no two patients have exactly the same disease."

Official disease classification schemes embody the "splitter" model which attempts to fit continuous data into discrete categories. Of course sharp-edged distinctions suit purposes like billing and other management operations.

However, diagnosis is often ambiguous, mixed, or multiple, but no matter what, categorical diagnoses must be assigned. Unsurprisingly, doctors are biased to select the choices with the greatest reimbursement, and "stretching" criteria to cover the patient's condition (or vice versa) is not at all uncommon. (There's also the practice of not assigning diagnoses where doing so might negatively impact payment.)

EHR clinical data follows the discrete assumptions of the system design. This can create an impedance mismatch between clinician observations and data input. This may be troublesome, for example, in specialties (behavioral health) where data is complex with many overlapping subtle but meaningful variations.

I've often received records after a patient has been hospitalized. More often than not it's difficult to understand the course of treatment, as there's no coherent summary or narrative description provided. The "pile of data" literally transcribed from the EHR isn't very useful to human readers. To be sure, information like lab reports is good to have, but the marginalized human-to-human element is troublesome.

EHR clinical data failing to serve ML purposes could only point to problematic EHR system design. Though I've been aware of EHR limitations, I wouldn't have guessed about ML issues. The article taught me something about the problem that exists. Now it remains to be seen what can or will be done about it.


It strikes me that the decades of academic literature that establish "in order to get a result correlating X and Y, we needed to control for A, B, C, ..." would be a critical input into any system attempting to work with medical data. In a way, the plaintext of historical medical journals has encoded much of this expert knowledge, albeit with retractions and errors the further back you go. But that might change the conclusion of the OP into more of an "incredibly hard problem" rather than "doomed to fail."


You see these same pitfalls in much of the medical device industry. Often hospitals and their weird internal political workings drive the reasons for things, not quality or efficiency of care, care for their workers, etc. You see this present itself as far up as choosing a technology development path for an entire industry based on these non-real internal hospital dynamics and entrenched ways of doing things, and on billing for them, rather than working to improve quality of care or outcomes.


Spurious correlations are common in medical models, not idiosyncratic one-offs as in other fields.

Which makes it hard to use models in clinics.

Doctors are smart about this. Every provider we spoke to about survival models based on biomarker data did not want supervised learning models. The explainability, trust, and authorization just aren't there, and the risk of misuse is too high.

For now, research is a better use case, but there's less commercial funding.

It is not hopeless, but your basic LSTM is probably not going to revolutionize medicine.


I've always taken part of my job as a data scientist to be articulating when data is not sufficient to meet a particular goal (and outlining what data would be necessary). For some goals, doing special data sampling and building a model on small (complete) data is better than using the fragmented overall database.

I work mostly on claims data now, and I assure folks that building models to identify problematic claims is feasible.


What if I do need to go to a hospital, but don't want any information about me to be shared? Is that impossible? What are the ethical considerations for me, when I disagree entirely to any data collection? Must I be coerced against my will into providing to Google or whoever regardless?

The point I am raising is that there is an embedded assumption that this info should even be available for deep learning to work over.


For transmissible/infectious diseases, this is certainly impossible, since states ingest case reports and lab data surrounding this and send digests to CDC. The term of interest would be "NEDSS reporting".

https://www.cdc.gov/nndss/about/nedss.html


Pay cash maybe and even then it might be in the terms of service of the clinic or hospital that they can use or share your data as needed. It may be deidentified as well. An alternative is to check in under an assumed name which many famous people do.

There are concierge doctors as well that probably have stricter guidelines.


You may not realize that EMRs owe their existence to

    1. billing
    2. government mandates
    3. billing
    4. helping doctors keep track of their patients’ records
just like how people think ADP is in the business of payroll. They are actually in the business of mitigating regulation, taxes, and liability. Getting your wage to you is at best second priority to all of those.


A rule encompassing "pap smear" and "over 65" data fields is encoded right here in the OP comment. Aren't these kinds of rules and relationships what "deep learning" is supposed to suss out, automatically and without intervention?

If not, I wouldn't call it "deep".


If anyone is interested in working with emr data, my lab is looking for phd students, postdocs. We have quality anonymized emr and dicom data and research protocols to work with along with domain expertise and deep learning infrastructure.


This seems like a lot of very "USA" problems.

A lot of countries don't have the same drivers.


We need to work on the hard problem of building causal models of the world, then we can build causal models of medicine on top, then we can do learning on medical records.


But not before a lot of companies make tons of money by promising deep learning will cut healthcare costs by percentage points!


This is very specific to the US medical system.

Perhaps things are better in some of the many other health care systems around the world.


Maybe. Almost everyone follows the same coding standard (some version of ICD, https://www.cdc.gov/nchs/icd/icd10.htm). If it's much better outside the US I'd be surprised, though. The standard is a mess of strange, super-specific things (e.g. ICD10 code W56.52XA is "struck by other fish, initial encounter") that aren't useful for any purpose, and in practice it's used mainly for billing (insurance or otherwise).
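One common (and admittedly crude) workaround when modeling on top of ICD is to collapse codes to their 3-character category before doing anything statistical. A rough sketch; the truncation rule here is a simplification, not a standard:

    # Collapse hyper-specific ICD-10 codes to their 3-character category,
    # e.g. "W56.52XA" (struck by other fish, initial encounter) -> "W56".
    def icd10_category(code):
        return code.replace(".", "")[:3].upper()

    codes = ["W56.52XA", "I50.9", "E11.65"]
    print({c: icd10_category(c) for c in codes})
    # {'W56.52XA': 'W56', 'I50.9': 'I50', 'E11.65': 'E11'}
    # I50 = heart failure, E11 = type 2 diabetes -- usable features; the
    # "initial/subsequent encounter" suffixes and most of the extra
    # specificity are billing detail.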


I'm fond of W59.22XA, "struck by turtle - initial encounter" aka the Aeschylus ICD code


Y385X3A Terrorism involving nuclear weapons, terrorist injured, initial encounter


You know, I have never really given EMRs in other countries much thought, and I'm now super curious how implementation has gone, what some good examples are, etc.


Epic expanded into the Nordics and their providers found it disastrous on the whole, mostly because it's designed for the US healthcare system and therefore cares almost exclusively about billing, which is completely irrelevant for them. See https://www.politico.com/story/2019/06/06/epic-denmark-healt...


These sort of headlines are just cheap click bait.


It's kind of funny that the author mentioned data fragmentation in the very first section and then at the end offered heart rate/ECG/AFib as one of the successful case studies for deep learning on EMRs.

ECG is the classic case of the data fragmentation he's talking about, and if you want to export raw data from a popular ECG machine like Mortara, good luck with that. Heck, even Apple currently does not support exporting out of the box the raw ECG data that's crucial for deep learning, apart from a PDF image of the ECG waveform [1]. There are literally more than a few dozen open formats for ECG [2],[3], and the obligatory XKCD comic comes to mind, except there are 39 standards instead of 15! [4]. Even if the ECG machine manufacturer is using one of these formats, there are still serious interoperability issues down the road.

Rambling aside, there's a yearly global cardiology challenge organized by the Computing in Cardiology conference (CinC), and recently machine learning and deep learning techniques have been proposed for multi-lead ECG diagnostics, but the results are not that great [5]. Hopefully it will provide an impetus for deep learning on EMRs similar to the ImageNet Challenge, which gave rise to the deep learning algorithms that drove the community past the AI winter.
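For anyone who wants to poke at that challenge data, it is at least distributed in the open WFDB format; roughly something like this (using the wfdb Python package; the record shown is the classic MIT-BIH example rather than challenge data) gets you raw waveforms -- the very thing most clinical ECG systems won't export:

    # Sketch: read a raw ECG waveform from PhysioNet in WFDB format.
    import wfdb

    record = wfdb.rdrecord("100", pn_dir="mitdb", sampto=3600)  # ~10 s at 360 Hz
    print(record.sig_name)        # channel names, e.g. ['MLII', 'V5']
    print(record.fs)              # sampling frequency in Hz
    print(record.p_signal.shape)  # (3600, 2) array of physical signal values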

[1] Accessing the ECG Data of the Apple Watch and Accomplishing Interoperability Through FHIR:

https://pubmed.ncbi.nlm.nih.gov/34042901/

[2] A review of ECG storage formats:

https://pubmed.ncbi.nlm.nih.gov/21775198/

[3] A Review on Digital ECG Formats and the Relationships Between Them:

http://diec.unizar.es/~imr/personal/docs/paper12IEEETITB1.pd...

[4] How standards proliferate:

https://xkcd.com/927/

[5]The PhysioNet/CinC Challenge:

https://cinc.org/physionet-cinc-challenge-awards/



