Keep in mind ML models are really great at cheating to get answers: Maybe it's detecting that women have to tilt their head up more to reach the retinal photography machine, because they're shorter on average. Maybe some of the images come from an optometrist who specializes in women's glasses frames, and his retinal photography machine has a slightly dimmer bulb. Maybe men are more likely to get a retinograph only when they already have more severe disease, so the retinas look different for that reason.
Their use of an external validation dataset eliminates many, if not all, of those concerns.
Regarding external validation set:
> This dataset differed from the UK Biobank development set with respect to both fundus camera used, and in sourcing from a pathology-rich population at a tertiary ophthalmic referral center.
Regarding UK Biobank set (training set):
> UK Biobank dataset, which is an observational study in the United Kingdom that began in 2006 and has recruited over 500,000 participants—85,262 of which received eye imaging [38]. Eye imaging was obtained at 6 centers in the UK and comprises over 10 terabytes of data [39]. Participants volunteered to provide data including other medical imaging, laboratory results, and detailed subjective questionnaires.
That same spirit is also what's behind shorts in the market, which keep overly optimistic players in check, or crypto-ransom operations against corporations, which keep their system security on alert. Adversarial action (within reasonable boundaries) is the most important balancing mechanism.
The problem is keeping the adversaries balanced. Otherwise you have one side overpowering everything (and typically not through the "correctness" of their argument, but scale and resources) and the adversarial system becomes detrimental to progress.
Yup. I learned this the hard way on the last model I trained at scale. It was evaluating fit between two heterogeneous classes. I sampled the training & test split from a large time window and got to work. It performed extremely well on both training and test. Too good.
I pulled a third sample from a completely different time window and it performed terribly.
It turned out that in both datasets, most class A instances were always a great fit or always a poor fit, so the ML model simply learned to memorize the class A instances.
This problem went away when I subselected down to only instances of class A that had examples of both good fit and poor fit.
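To make the fix concrete, here's a rough sketch of the difference between the in-window split that fooled me and the out-of-window check that exposed it. This is not my actual pipeline; the file and column names ("pairs.csv", "timestamp", "class_a_id", "label") are made up.

    # In-window vs. out-of-window evaluation, as described above.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score

    df = pd.read_csv("pairs.csv", parse_dates=["timestamp"])  # hypothetical export
    features = [c for c in df.columns if c not in ("timestamp", "class_a_id", "label")]

    # Easy-to-fool evaluation: train and test drawn from the same time window.
    in_window = df[df.timestamp < "2021-01-01"]
    train = in_window.sample(frac=0.8, random_state=0)
    test = in_window.drop(train.index)

    # Honest evaluation: a later window, restricted to class A entities the model never saw.
    later = df[df.timestamp >= "2021-01-01"]
    later = later[~later.class_a_id.isin(train.class_a_id)]

    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(train[features], train.label)

    print("in-window AUC:", roc_auc_score(test.label, model.predict_proba(test[features])[:, 1]))
    print("out-of-window AUC:", roc_auc_score(later.label, model.predict_proba(later[features])[:, 1]))

If the second number collapses toward 0.5 while the first looks great, the model has memorized the entities rather than learned the fit you care about.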
My problem with modern scientific publications is that they focus more on the "discovery" than on the logical rigor of why their "discovery" could be true or false.
What kind of publications are you referring to? Yeah, many pop-sci articles often overstate the implications of a given discovery, but actual research papers (such as the one linked) typically have a discussion section in which they describe the limitations of the discovery and what kind of future research could help expound on the pitfalls of the current research.
The paper in this post itself has its own Limitations section.
> The paper in this post itself has its own Limitations section.
I haven't actually read the paper. I was reacting to the title.
> what kind of future research could help expound on the pitfalls of the current research.
I think a better way of documenting research to people is by describing what scientific boxes were checked.
In the given case, for example, one of the boxes may be:
"Is there an anatomical distinction between retinas of sexes?" — If that is true then we can see if machine learning can detect such differences.
Take computer science for example.
A publication with the title "Professors create a machine that can think" would maybe instead be published as "Professors create machine that passes Turing test [0]"
Another example can be in medicine.
The research in question may be whether a microbe A causes a flu.
Instead of a publication being "Microbe A causes disease B".
A better publication IMO should revolve around "Microbe A has passed Koch's postulates [1] for disease B"
Once upon a time—I’ve seen this story in several versions and several places, sometimes cited as fact, but I’ve never tracked down an original source—once upon a time, I say, the US Army wanted to use neural networks to automatically detect camouflaged enemy tanks.
The researchers trained a neural net on 50 photos of camouflaged tanks amid trees, and 50 photos of trees without tanks. Using standard techniques for supervised learning, the researchers trained the neural network to a weighting that correctly loaded the training set—output “yes” for the 50 photos of camouflaged tanks, and output “no” for the 50 photos of forest.
Now this did not prove, or even imply, that new examples would be classified correctly. The neural network might have “learned” 100 special cases that wouldn’t generalize to new problems. Not, “camouflaged tanks versus forest”, but just, “photo-1 positive, photo-2 negative, photo-3 negative, photo-4 positive…” But wisely, the researchers had originally taken 200 photos, 100 photos of tanks and 100 photos of trees, and had used only half in the training set. The researchers ran the neural network on the remaining 100 photos, and without further training the neural network classified all remaining photos correctly. Success confirmed!
The researchers handed the finished work to the Pentagon, which soon handed it back, complaining that in their own tests the neural network did no better than chance at discriminating photos. It turned out that in the researchers’ data set, photos of camouflaged tanks had been taken on cloudy days, while photos of plain forest had been taken on sunny days. The neural network had learned to distinguish cloudy days from sunny days, instead of distinguishing camouflaged tanks from empty forest.
I certainly did not expect the downvotes. I've designed a number of psychology experiments, and my colleagues and I were always careful to balance experimental stimuli and conditions on a number of factors, ranging from luminosity to semantic association, and especially order.
When I was tasked with creating stimuli or taking measurements on a series of items, it was always important to try to eliminate systematic differences in factors of non-interest, most typically through randomization of the order of creation or measurement.
I would point out that the study which appears to be the closest thing to a tiny seed of truth to the tank story (Kanal & Randall 1964) actually did balance its photographs by construction, by subcropping large aerial photographs to have tank/no-tank sections and training on that. That necessarily controls for time-of-day, weather, luminosity etc.
> For instance, in our work, we noted that the algorithm appeared more likely to interpret images with rulers as malignant. Why? In our dataset, images with rulers were more likely to be malignant; thus the algorithm inadvertently “learned” that rulers are malignant.
Yeah, when I read the headline I thought Cornea/Iris, not Retina, and there are a couple of ways an ML could "cheat" there.
But still, there are ways of taking the resulting NN and applying some techniques to figure out which variables/patterns are responsible for most of the output.
Reproducibility is one thing I like about AI research. If they provide the model, I can run it on my own computer, test it against whatever I want, and judge it.
Most things in science are almost impossible to reproduce because of cost or specialized equipment.
I think the problem with that line of thinking is that even after fully transitioning, trans people still tend to have strong biological markers of the sex they've transitioned from.
For a ML model to correctly guess gender identity, it'd need other cues that indicate a person would prefer to be referred to by a certain pronoun, such as clothing or facial hair.
It really depends on the bio marker. Most bio markers are affected: fat distribution, hair texture (curly vs. straight), and even immune system function. I'd think that learning whether this model works on people who've taken hormones for gender transition could give some insight into what it's detecting. Perhaps there are retinal features whose function or form are impacted by the presence of testosterone, for instance, or maybe it's a difference that forms before birth.
>Clinicians are currently unaware of distinct retinal feature variations between males and females, highlighting the importance of model explainability for this task.
If I'm reading this correctly what they're saying is that since we don't currently know the difference between male and female retinas, being able to explain what the ML black box is doing is important. But from what I can see in the paper they basically don't know what the black box is doing, they really don't understand what features their tool has isolated. I might be misunderstanding though?
There is a cursory discussion about how this is "inconceivable to those who spent their careers looking at retinas". However, if it's not clinically useful (as the next sentence says), those experts probably haven't spent much time--if any--training themselves to try.
Humans can learn to detect surprisingly subtle features. For example, the right training regime can make you much better at reporting the tilt of a line, but it requires practice and feedback, just like the network got.
The umbrella term for this is "perceptual learning" and it turns out that training can tweak the visual system: baseball players learn to read where an incoming pitch will go, radiologists can find subtle clues that indicate a tumor, and--as someone pointed out above--chicken sexers can tell the sex of a baby chicken somehow. These are fairly "high-level" phenomena. Surprisingly, training also works on low-level visual phenomena, which you might think are limited by the eye itself or 'hard-wired' neural circuits. It's mostly just practice.
One of the classic experiments looks at vernier (hyper-)acuity. You're shown pairs of lines that are slightly offset, like:
      |   or    |
     |           |
You report whether the top is shifted rightwards or leftwards. The computer gives you feedback, and shows you another pair (varying the distance to make it easier or harder). If you keep doing this, your threshold (i.e., the smallest offset you can reliably report) will decrease about 5-fold. The same thing happens for many other phenomena too--reporting the direction that some dots are moving, the tilt of a line, etc. In some cases, the improvements continue for days or even weeks of training.
The wild thing is that they're often very specific to the training. For example, if you trained with the vertical stimuli above, the improvement does not transfer over to stimuli like:
       _   or      _
      _           _
There are "tricks" to designing a curriculum that generalizes and is relatively efficient, but there's still a lot to learn.
They don't give an explanation because they don't know how to give an explanation - many if not most ML models lack an easy explanation at present, they just spit out answers.
They are saying "someone should do this because it's important even though we don't (presently) know how to do this".
Have you ever used them? They're pretty, but not things I put much trust in. The other awkward issue is that there have been a variety of papers analyzing many interpretability algorithms and finding some very weak properties. One example: train a model M, run something like SHAP, delete the features marked important from the original dataset, train a new model M2, and see minimal performance difference. At best that tells us many models have a large number of ways to get their answer, and the interpretability algorithm is giving you one of them. If your goal was understanding what distinguishes the data, and not just a specific model, you've failed. At worst it tells you that the local way most interpretability algorithms work just isn't revealing. The local nature (most are heavily gradient based), when working with data like images where individual pixels mean little, also makes me skeptical. A decent number of interpretability algorithms look like they're just running edge detection, which makes sense given the gradient part but is poor for interpretability.
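For concreteness, this is roughly the kind of check those papers run, sketched here with an off-the-shelf scikit-learn dataset standing in for the real data (this is not the retinal study's data or model):

    # Train M, mark "important" features with SHAP, delete them, retrain M2, compare.
    import numpy as np
    import shap
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    m1 = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

    sv = shap.TreeExplainer(m1).shap_values(X_tr)
    sv = sv[1] if isinstance(sv, list) else sv[..., 1]   # handle both shap return formats
    importance = np.abs(sv).mean(axis=0)
    keep = np.argsort(importance)[:-5]                   # drop the 5 "most important" features

    m2 = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr[:, keep], y_tr)

    print("M1 accuracy:", m1.score(X_te, y_te))
    print("M2 accuracy (top SHAP features removed):", m2.score(X_te[:, keep], y_te))

If M2 barely drops, the "important" features clearly weren't the whole story.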
ML explainability is a wide field with a lot of success. For example you can discover what features are activating which detections. Your comment, taken literally, suggests that research in this field is impossible. That is not the case.
when the activated feature in an image recognition net looks like a lovecraftian horror, that doesn’t explain how the net came up with “turtle”.
explainability is going to have a rough time for the same reason ai alignment is going to have a rough time. people think they can explain decisions (technical and moral) far more effectively than they actually can.
I judge by our general inability to explain most of our intuitive decisions (intuition is basically trained pattern recognition neural network without reflection, much like artificial ML).
A great sports player often makes a lousy coach. They often can't articulate how they play, how they move, and how they think, except in the broadest strokes.
All our formal/reflective models are evolved entirely independently of our intuition. They're supported by our intuition (release ball, ball drops, gravity) as a heuristic, but not explained by it. They're independent in terms of logical frame.
I suspect a similar thing will occur in ML. We'll have our black box ML that produces great results which we can't explain. And we'll have other NN models that arrive at reflective models, which however are far less insightful on their own (at least per watt consumed, let's say).
The trouble is that we expect to sit down and debug a series of equations in a NN and come up with a "factual nugget" or a few that explain what happened. Correlation often doesn't actually work this way. You just correlate dozens of mundane factors that vary by one degree between outcomes, and you happen to be able to produce a solid result from that.
Expecting specific models in a NN is a bit like inspecting a dog photo on your phone under a magnifying glass, expecting to learn more about the dog. Instead you cease seeing a dog, and start seeing arbitrary colored pixels.
The problem is with your assumption that the decision function can be reduced to a nice, clean, closed form that a human's brain can consciously conceptualize. It might as well be a tangled mess of a thousand parameters you can't reduce any further without losing predictive power. There is no particular reason why anything should be simple.
We're not at the point of explaining complicated models with a straight face. Based on the saliency maps, it looks like the model has learned something around the bright circles (or is it the blind spot? Not an ophthalmologist). Makes me think the network can reverse engineer distortions in the light to get curvature of the lens which might be indicative of gender differences.
Your question is confusing, it might be that you are using “feature” in an ML sense and the quote refers to human describable distinctions we know about? But I still don’t know how to parse your question.
The model can predict male vs female retinas but they don’t understand why. What exactly are you asking?
ML models have layers, and neurons in a given layer detect “features” in the image or in the previous layer. So yes I believe the person meant which structures in the image are activating the network. Which is a well studied area so it is surprising the authors didn’t explore that.
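As a sketch of what that exploration could look like, here's a minimal gradient-saliency example in PyTorch. The model and image are stand-ins (a randomly initialized ResNet and a random tensor), since the paper's AutoML model isn't something you can load and probe like this.

    # Which pixels most influence the class-1 ("female") logit of a stand-in classifier.
    import torch
    import torchvision.models as models

    model = models.resnet18(num_classes=2)            # placeholder for the real model
    model.eval()

    image = torch.rand(1, 3, 224, 224, requires_grad=True)   # placeholder fundus photo

    logits = model(image)
    logits[0, 1].backward()                           # gradient of the class-1 score w.r.t. pixels

    saliency = image.grad.abs().max(dim=1).values.squeeze()  # 224x224 per-pixel importance map
    print(saliency.shape)                             # overlay this on the image to see hot spots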
Thanks to hormones, women may experience vision changes throughout their adult lives. The hormones estrogen and progesterone have a lot to do with this. Their changing levels can affect the eye’s oil glands, which can lead to dryness. Estrogen can also make the cornea less stiff with more elasticity, which can affect how light travels into the eye. The dryness and the change in refraction can cause blurry vision and can also make wearing contact lenses difficult.
I remember a researcher doing some early research on compressing diagnostic imaging who was happy about all the hard disk space saved. They did some research to find out what level of compression they could go with that wouldn't result in different clinicians reaching different conclusions from the same images.
It really upset me. We probably threw away decades of training data that a computer could have used for early detection.
Fine for broken arms or whatever, but for cancer diagnostics, ugh. The computer might have been able to see the tumour before a clinician.
That data compression meant getting imagery to experts in less than an hour digitally instead of via sneakernet physically in a multi-day timeframe, thus massively speeding up and improving access to high quality diagnosis.
Could do both, but still depends on storage costs. But I guess re-scanning wouldn't get justified once there's already a "good enough" compressed version.
Real question is if they dialed down the compression as storage got cheaper. Doubtful.
I would be curious whether this is actually classifying something that typically corresponds with sex, like hormone levels. In that case people with hormonal disorders would potentially be mis-classified, and someone photographed pre/during puberty might also be mis-classified. Since the paper mentions both neural and vascular tissue being represented in retinal photos, it seems like the levels of various hormones in the individual's blood could potentially also generate a mis-classification if they (for example) cause blood vessels in the eye to expand or contract. The mention that foveal pathology causes the model to mispredict suggests it would probably have issues in these cases too, I think.
I wonder what actual values they were trying to predict with this analysis? Based on the paper, I get the impression they were trying to do something more interesting and they got the best data for sex.
They were specifically trying to classify sex because it is something that experts cannot already do:
> While our deep learning model was specifically designed for the task of sex prediction, we emphasize that this task has no inherent clinical utility. Instead, we aimed to demonstrate that AutoML could classify these images independent of salient retinal features being known to domain experts, that is, retina specialists cannot readily perform this task.
I wonder this too, and they could get a pretty solid answer if they incorporated transgender folks’ data into the set since they’re actively keeping their hormones in the desired ranges of their gender identity.
Generally you'd start with the common, easier problem before you delve into the abnormal cases.
There are probably better ways to do that, like looking at bloodwork.
That's fine for training, but for testing the model delving into abnormal cases seems required. If there's noticeable difference on abnormal cases that gives a lot of insight into what your model is actually testing.
All observations about the use of the term “abnormal” aside, I agree the idea of specifically isolating the hormone variable by using groups of people who naturally fit that range and groups of people that use medication to mirror that range would at least indicate whether this classifier is picking up on sexual dimorphism or vascular effects from hormonal differences (which also would potentially impact not only transgender people, but intersex people — who make up around 3% of the population)
The lab I was affiliated with published a similar paper on gender dimorphism for pediatric hand and wrist radiographs in 2018. The experience of working on that paper made me realize saliency/class-activation maps are only really useful if you want to know where an object is in the image, but not that helpful if you are legitimately trying to figure out what the AI is using to classify.
I spent days looking at male and female hand radiographs + class activation maps trying to figure out what the system was using to tell the genders apart. I never figured it out.
Just throwing out a guess here about one factor that might partition photos by sex: height (I'm assuming males are generally taller).
Looking at what a retinal photography machine looks like, I'd guess the height at which the photo is taken might slightly affect the POV angle, which in turn might be just enough to get caught by the ML model.
They sit down, as is my understanding, yet from photos I found on google search, it seems like people bend their neck forward to adjust their head's height depending on the height of the camera. Your head height from the ground, even when seated, depends on your torso's length (and the chair's height, which I assume people don't adjust to their preferences as patients in a clinic).
People with Ph.D.'s often make up facts and believe in nonsense just like everybody else.
Edit: This includes those with STEM degrees as well. You really shouldn't trust someone more just because they have a research degree. I knew a professor who claimed to have solved some famous problems but that the peer reviewers just didn't want to accept that he solved it and therefore rejected his papers.
What do you think they made up? You need to be more specific than "something", because I don't see anything in that post that looks like they just made it up.
In particular, "People with Ph.D's often makes up facts and believes in nonsense just like everybody else." is definitely not something they just made up.
And "just like everybody else" is not putting a particular group into a bag.
To clarify, I trust research papers since they are a network of citations each reviewed by peers, not because they were written by researchers.
So when someone cites papers, they tend to be trustworthy and you can have a discussion. When they don't, you can't trust them; you can still have casual conversations, but you can't trust any facts. Whether they are a researcher or not doesn't matter here.
That's an unexpected (to me at least) claim to make - https://onlinelibrary.wiley.com/doi/abs/10.1002/ajpa.1330530... - I always thought of sexual dimorphism (men taller than women on average) as a given, and this paper backs up my claim specifically in the context of human societies (where it finds an average height advantage of about 10cm for men over women, comparing 216 different societies). There might be some way of counting in which the converse is true, but not in any way I know. I understand you don't want to dig deeper, but thought I'd flag it in case anyone else unwittingly digests this (possibly wrong) knowledge.
Do we even need clinicians be able to distinguish male and female retinas?
The task is meaningless. Yes, there might be some interesting facts in discovery how male and female retinas are different. It could even lead to differentiated treatments. But ML hasn't provided any clues regarding this and therefore it is not that deep.
It's not about differentiating male and female retinas (that's just a PoC to publish the paper), it's about using ML to find something useful in data (e.g. signs of a disease) which might be hard to see for humans otherwise.
>Clinicians are currently unaware of distinct retinal feature variations between males and females, highlighting the importance of model explainability for this task.
I remember a debate a couple of years ago where some woke scientists were arguing men's and women's brains are indistinguishable. Others were saying there are distinguishing features. This kind of analysis may put an end to that debate.
That's a bit of an exaggeration. Mascara in the eye happens very often, to me anyway. What's really painful is the hairs off the mascara brush detaching and insinuating themselves under the eyelid. That. hurts.
> Mascara on your retina would most likely end up in a hospital visit.
... And likely blindness. The retina is the inside of the back of your eye, where the optic nerve attaches. If you have mascara on your retina, it means you have punctured your eye.
Having something between your cornea and eyelid, while extremely irritating and painful - especially if a scratch results - is far, far less painful than having something puncture your cornea. Trust me, I've had surgery behind the cornea and had the anesthesia wear off.
In figure 1 it looks like we have 50% precision at 100% recall. I don't do a lot of binary classification, but should it be concerning for the model if it is only right half the time (random chance) when we demand it predicts for all data?
For a binary classifier, recall is defined as the true positive rate: Of all the samples that are actually positive, how often did our model output "positive".
To get this to 100%, the model would simply say "positive" all the time. This means that the precision (the fraction of samples where the classifier said "positive" that were actually positive) goes to 50% (for a balanced dataset).
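A quick toy check of that arithmetic, using scikit-learn's definitions:

    from sklearn.metrics import precision_score, recall_score

    y_true = [1, 0, 1, 0, 1, 0]    # balanced toy labels
    y_pred = [1, 1, 1, 1, 1, 1]    # a classifier that always says "positive"

    print(recall_score(y_true, y_pred))     # 1.0  -> 100% recall
    print(precision_score(y_true, y_pred))  # 0.5  -> 50% precision on balanced data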
No. Only if you had a perfect classifier. The only way to have 100% recall is to have absolutely no false negatives. In practice, the only way to have no false negatives is to have no negatives at all.
I've been hearing this since at least 2016. Saliency maps tend to put the interesting areas around the fovea or optic nerve; I think it must've been the optic nerve, because that's really where the vessels emerge. My best guess is that it has something to do with blood pressure. Women have slightly higher blood pressure than men, and the largest-caliber vessels would be the best place to observe that.
"EK is a consultant for Google Health. PAK has received speaker fees from Heidelberg Engineering, Topcon, Haag-Streit, Allergan, Novartis and Bayer. PAK has served on advisory boards for Novartis and Bayer, and is a consultant for DeepMind, Roche, Novartis and Apellis. KB has received research grants from Novartis, Bayer. Heidelberg and Roche. KB has received speaker fees from Novartis, Bayer, TopCon, Heidelberg, Allergan, Alimera. KB is a consultant for Novartis, Bayer and Roche. AK is a consultant to Aerie, Allergan, Novartis, Google Health, Reichert and Santen. All other co-authors have no competing interests to declare."
This isn't necessarily surprising. Maybe clinicians can be trained to get good at it, but, maybe they can't.
Back before computers beat Go, the conventional wisdom was that it was a hard target because humans can leverage their pattern-finding systems to prune the state space much more efficiently than computers can.
Which got me thinking: what about a game that's the opposite? Like the game's state space is embedded in the timbre of a complex tone, and you make moves by twiddling three or four knobs. It doesn't have to be a very complex game for computers to do a lot better at this, extracting meaning out of white-ish noise is not something we're good at.
Assuming this isn't a data artifact, it might be one of those cases. Some pattern in the branching of the veins or the placement of the rod and cone cells, which just looks like noise to us, but which an ML algorithm can find and use to separate male and female retinas.
There's no reason to think that humans would ever get very good at that, although I never count humans out given examples like chicken sexing[0], a notoriously opaque skill which people can nonetheless acquire, despite having little facility in explaining to others how they do it.
I am going to drop a thought here to see what happens.
If there is a difference between male/female retinas. Could this affect our perception of reality?
A woman complained that her very tall boyfriend had hung a mirror in the bathroom. She took a picture of it. It was her reflection holding the camera level with the top of her head, and little else.
One that happened to me personally. I asked my then gf (5'1") what it was like being a small person, do you feel like a normal sized person in a land of giants? "Yes" was the instant response.
One very tall guy once remarked: "The tops of your fridges are fucking disgusting."
> A woman complained that her very tall boyfriend had hung a mirror in the bathroom. She took a picture of it. It was her reflection holding the camera level with the top of her head, and little else.
As a tall guy, there is a surprising number of bathroom mirrors where my reflection doesn't include my head.
You people need some taller mirrors. The bottom of my bathroom mirror is several inches below my waist. The top is about a foot and a half above the top of my head.
> If there is a difference between male/female retinas
Is this even an "if"? It's well established that men are more likely to be colorblind, and it's likely many women are tetrachromats (most people are mere trichromats). The genes for the extra cone pigments are in the X chromosome, and are seemingly expressed more often when somebody has two X chromosomes. Similarly, people with two X chromosomes are less likely to be colorblind because most forms of colorblindness are caused by defects in genes in X chromosomes.
Perhaps most men and women, men and women with normal trichromatic vision, have identical retinas. But with genes so important to eyeballs residing in the X chromosome, who knows. But I'm left wondering why experts are particularly surprised by this result.
I don't like philosophical questions like this. Let's say male blue is female red and vice versa. Our perception of the world is different yet it doesn't change anything as to how we understand and interact with "reality".
It is well backed up by science that women can see, or at least perceive/brainlog, a wider variety of colors than men, and this is expressed in women having a stronger, beefier vocabulary when it comes to naming and identifying colors.
So yeah, that's a thing
Also, what you are describing is called qualia, that is, the intangible qualities of how the brain processes data, such as the "yellowness of a lemon", or the "foot pain of stepping on an unexpected rock shoeless".
Qualia can't be verbalized or compared between people because they are an inherent "brainfeel"; you just need to expect others to have "at least similar-ish" qualia.
Right, and children hear much higher frequencies than the rest of us. Just because you see more doesn't fundamentally change how we perceive reality. Like if someone says there is a color between eggshell white and snow white, I believe them because there is obviously a gradient there. I don't need to see their reality to agree on the state of it.
If someone is colorblind, and another isn't, does that entail a change in perception? Sure. It means the colorblind guy can't discern things that people with normal vision can.
A person born blind can't see anything and never has. They don't even imagine visual images (only images informed by the remaining senses). Their perception is unimaginable to me and mine to them.
So if women can discern more colors than men, it follows that they experience more colors which seems like a matter of perception. Have you never argued with a woman about the color of a sweater?
Red, Brown, Blue, Gray, Blue, Green, Yellow, green, Gray, Blue. Those linear color spectrums never made sense to me as a color blind person.
Technically I only see blue and green, but since people call different shades of green so many different things I start to call them stuff like red and brown. So dark green is brown, then as you go brighter it becomes red, orange, green and brightest green is yellow. White is blue + green, and since red is green pink is just white with less intensity, so gray.
Edit: Anyway, most men are one chromosome color better than me. Most women are two chromosomes better at distinguishing color than me. Makes sense then that women are way better at telling them apart.
Yeah - you are right. I thought about it and I guess since we can interact, our perception can't be too far off - otherwise we wouldn't be able to procreate.
It would be something out of The Hitchhiker's Guide to the Galaxy: a species that, because of retinal differences between the sexes, is unable to mate.
I would argue that it does, and that we conform behaviors to a standard, but there are a lot of assumptions we make that lead us to not understand each other at all.
There is a shared experience isolated to one sex that the other cannot perceive
Fascinating. I had no idea retinas were sexually dimorphic. I wonder if the difference serves a purpose or is just a consequence of some other adaptation.
Neither did doctors apparently. If everything is kosher, they've proved some level of sexual dimorphism and now they can investigate and perhaps find out what it is.
This is an interesting use of machine learning. We (or at least I) normally think of these models as replacing or complementing humans. But using them as a driver for research is cool.
There's also the risk of severe overfitting to some latent variable. I haven't quite dug into the work itself yet, but it does bring back memories of some case of perfect diagnosis due to hospital documentation process though.
“Our deep learning model was trained using code-free deep learning (CFDL) with the Google Cloud AutoML platform ... the CFDL platform provides the option of image upload via shell-scripting utilizing a .csv spreadsheet containing labels ... Automated machine learning was then employed, which entails neural architecture search and hyperparameter tuning.”
Earlier, in the Limitations section:
“The design of the CFDL model was inherently opaque due to the framework’s automated nature with respect to model architecture and hyperparameters. While this opacity is not unique to CFDL, there is potential to further reduce ML explainability due to lack of insight of model architectures and parameters employed.”
Maybe there’s a whitepaper somewhere on how Google’s AutoML works?
My understanding is that it uses heuristics to try a range of different base models and techniques based on the input data, along with grid searches to find hyperparameters. It is fairly pricey but works.
It is probably not super exotic, but if they spent enough money optimizing it, it probably has good hyperparameters.
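As a very rough (and much simpler) analogue of what that search looks like with plain scikit-learn; Google's AutoML additionally does neural architecture search, which this sketch doesn't capture:

    from sklearn.datasets import load_digits
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV

    X, y = load_digits(return_X_y=True)

    # Exhaustive grid search over a small hyperparameter space, scored by cross-validation.
    search = GridSearchCV(
        RandomForestClassifier(random_state=0),
        param_grid={"n_estimators": [100, 300], "max_depth": [None, 10, 20]},
        cv=3,
    )
    search.fit(X, y)
    print(search.best_params_, search.best_score_)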
Clinicians are not aware of the distinct retina features because they don't need to be. Everybody can tell the sex at first sight, and when in doubt, they can simply ask.
I'm certain ML will reveal other hidden, obscure features like this in the future. But that does not mean machines can do something people cannot, as the title might suggest. If people set their minds to it, they will do it too, maybe at a much slower pace, but they will.
As I look at Figure 2 (region-based saliency maps), I notice the high-salience regions are the brightest areas on the retinal image. I am not sure whether those are just bright regions of the retina or the reflection of the light illuminating the retina from the scanner. In any case, it is interesting to me that these seem to be the regions that help discriminability the most (if I am understanding correctly), which is surprising to me.
that's going to be one of the biggest shocks to society going forward when it comes to the changes that AI brings. there are mountains of data everywhere that are completely overlooked simply because the cost of processing the data is too high. too high to discover patterns/correlations and too high to process in any case.
human beings filter out most of what goes on around them. they don't see the world as it is and their minds don't keep track of physical primitives. their minds abstract the world into larger conceptual parts and track those parts. it's not just a question of processing power, it's a question of intuitive access. and nobody realizes this yet because the only sentient beings who are around to demonstrate any of this have those filters in place. when the AI comes with all that horsepower and with no filters, it will see things all around us that we are blind to. it will seem as though it can make impossible predictions. it will seem god-like, even before it graduates to doing something other than simply observing the world.
Sexy title, but it is unclear that clinicians can't classify sex from the retina; it's just that they haven't bothered to. And the classification is not that great (<80% PPV on independent data). Clinicians will certainly get much higher sensitivity, specificity, and PPV just by looking at the subject ;)
Don't all women have 6 types but usually they all have very similar or even identical frequency response? Only when they have a colorblind gene are they noticeably different.
"Tetrachromacy is the condition of possessing four independent channels for conveying color information, or possessing four types of cone cell in the eye."
Yes, it says it is from carrying a colorblind gene, usually red-green. But they have six copies of rhodopsin-encoding DNA; usually 3 of them are duplicates. In red-green colorblindness only two are duplicates, but there are other types of colorblindness. Theoretically they could be hexachromatic.
One study suggested that 15% of the world's women might have the type of fourth cone whose sensitivity peak is between the standard red and green cones, giving, theoretically, a significant increase in color differentiation.[23] Another study suggests that as many as 50% of women and 8% of men may have four photopigments and corresponding increased chromatic discrimination compared to trichromats.[24]
Nor is color blindness exclusively a male phenomenon; in Northern Europeans, 8% of males are colorblind while 0.5% of females are.
Suppose a few more sexual dimorphic traits like this exist in the eyes; perhaps differences that have no practical effect on human vision and have consequently gone unnoticed by clinicians. If the ML model is picking up a few of these dimorphic traits, it could perhaps classify sex with more accuracy than anybody looking at a single trait could. This is pretty standard Bayesian stuff; it's the way basic "Plan for Spam" style Bayesian spam filters work.
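Here's a toy illustration of that "many weak traits add up" idea with a naive Bayes classifier; the three traits are synthetic stand-ins, each only weakly predictive on its own, not anything from the paper.

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB

    rng = np.random.default_rng(0)
    n = 5000
    sex = rng.integers(0, 2, n)            # 0 or 1, labels only
    # Each trait shifts slightly with sex but overlaps heavily, so it is weak alone.
    traits = np.column_stack([rng.normal(loc=0.3 * sex, scale=1.0, size=n) for _ in range(3)])

    print(cross_val_score(GaussianNB(), traits, sex, cv=5).mean())              # all traits combined
    print(cross_val_score(GaussianNB(), traits[:, [0]], sex, cv=5).mean())      # a single trait, noticeably worse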
This is a really impressive result and an interesting result to apply ML to. Thank you for sharing, OP. I'm just wondering if there are any real world applications of why you'd want to tell the sex of a person by a retinal photograph? It seems like a bit of a useless skill to have?
I think this is more an example of how black-box models are basically useless for clinical research.
The authors aren't aware of any distinguishing retinal features between male and female eyes and the model itself has no explanatory power.
Could be a Clever Hans situation where the model exploits meta information of some kind in the absence of actual features. It could just as well mean that there are indeed distinguishing features that are compromised in the presence of foveal pathology.
The authors note that another study using manually selected features identified three features that are indicative of genetic sex. These features yielded an AUROC of about 0.78, compared to the presented model's AUROC of 0.93, which is about 19% higher. That additional accuracy may point to a combination of the already identified features or to one or more additional features.
I personally find this paper rather pointless. It stops at the point where actual progress could be made and things would get interesting - why didn't the authors evaluate the previously known features on the model's matches to measure their significance?
This could have told them whether their black-box was relying on the same set of features as the ones identified by previous work, for example.
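One hedged way to do that last check, assuming you could export the model's per-image predictions alongside the previously identified features (the file and column names below are hypothetical), is to regress the black-box's predictions on the known features and see how much of its behaviour they account for:

    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("predictions_with_known_features.csv")   # hypothetical export
    # known_feature_1..3 = the manually identified retinal features from the earlier study
    X = sm.add_constant(df[["known_feature_1", "known_feature_2", "known_feature_3"]])
    fit = sm.Logit(df["model_predicted_female"], X).fit()      # binary black-box prediction as outcome

    print(fit.summary())   # the pseudo R^2 hints at how much the known features explain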
> I think this is more an example of how black-box models are basically useless for clinical research.
A result different from the null hypothesis is useless.
Let us say the machine could not succeed at greater than chance; it would be a case of cosmic bad luck, in that case, that all social and biological factors cancel each other out.
Thus, in the case of the null hypothesis being confirmed by this, one may conclude that in all likelihood retinal patterns have no sexual divergence.
But, machines very rarely find such a null hypothesis in such cases, and that might be in no small part because of all the extra factors that such models latch on to.
I for one would be more interested in an obvious nonce test to see if the machine can find something: see if the a.i. can find a retinal difference between, say, the poor and the rich. If it can tell with high accuracy which retinae are poor and which are rich, we might have a somewhat interesting situation.
> External validation was performed on the Moorfields dataset. This dataset differed from the UK Biobank development set with respect to both fundus camera used, and in sourcing from a pathology-rich population at a tertiary ophthalmic referral center. The resulting sensitivity, specificity, PPV and ACC were 83.9%, 72.2%, 78.2%, and 78.6% respectively
The paper is about 4 pages long - it takes about as long for you to write that comment as it does to skim through it and learn that what you mentioned is exactly why they did the study:
> While our deep learning model was specifically designed for the task of sex prediction, we emphasize that this task has no inherent clinical utility. Instead, we aimed to demonstrate that AutoML could classify these images independent of salient retinal features being known to domain experts, that is, retina specialists cannot readily perform this task.
It always amazes me how people spend 5 seconds reading a headline but think they know more than someone who has spent days and months on the same topic.
Sorry I misinterpreted then. I thought you were dismissing it out of negativity but actually it's worse - you actually made a judgement that you knew more than the authors of the study.
The only judgement I made was to not read the whole paper. I read up until the paper stated that classifying sex based on retinal pictures was unlikely to be clinically useful. At which point I lost interest.
Why weren't the ML model and clinicians classifying something that actually is clinically useful?
If it has no clinical significance, what's the relevance of the classification of the clinicians?
How is it any more spectacular than beating a random classifier?
Had these points been addressed at this point I might have continued reading
Because I had already spent time reading and maybe someone could enlighten me as to why it in fact is interesting. That and I was also hoping to get insulted
I'd agree that if clinicians haven't been trained on this for their line work, then the comparison is not fair, but I wouldn't go so far as to say it's "useless".
No, you're right. But since there's a whole field on the subject I figured they could have chosen something with clinical utility and I don't really understand why they didn't
Also, before they tested on the other smaller dataset from a different source, aiui, they also trained only on the earlier subset of the first source, and used the later portion from the first source (with no overlap in patients) for the testing.
(also, I'm not sure that 252 is really all that small?)
Why do you say that the model is overfitted? You have no way of knowing that. Plus, 84,743 is a very reasonable size for a vision dataset with a binary prediction.
> Gender refers to the socially constructed roles, behaviours, expressions and identities of girls, women, boys, men, and gender diverse people
First it's not some hamfisted mixup of sex and gender:
> Therefore, this field may contain a mixture of NHS recorded gender and self-reported gender. Genetic sex in the UK Biobank was determined
And yet:
> Predicting gender from fundus photos, previously inconceivable to those who spent their careers looking at retinas, also withstood external validation on an independent dataset of patients with different baseline demographics. Although not likely to be clinically useful, this finding hints at the future potential of deep learning for the discovery of novel associations through unbiased modelling of high-dimensional data.
If we had a way to detect trans children, for sure that would be clinically useful!
Edit: as always, thanks for the downvotes, but please also educate me where I am wrong.
They're claiming that they can predict sex, not gender.
The study does comment on a trans case in their validation-set, which of 1,287 images, had 1 image for someone whose genetic-sex and reported-sex didn't match. For that 1 image, the algorithm's prediction corresponded to the genetic-sex rather than the reported-sex.
> Genetic sex was discordant from reported sex in one validation set image, and this image was incorrectly predicted by the model; that is the model predicted sex consistent with genetic sex in this case (Table S1).
Language belongs to everyone. Some specialists in their field make a distinction between "sex" and "gender"; others, for instance, do between "speed" and "velocity", or "weight" and "mass". In the vernacular, and in many fields, they are respectively synonyms.
Most technical terms start as lay terms in a language that are then given a more technical meaning, often pulling two synonyms apart in the process.
On the note of “children”; I tried scanning the result for whether the machine can distinguish before puberty, which would be even more spectacular, but I couldn't find it in the article. — there is significant debate as to what extent non-genital sex characteristics exist before puberty, as the difference is often so small that they could easily be attributed purely to environmental or social factors.
The vernacular, in this case, is obsolete. It needs to catch up. Sex and gender, while they often happen to be the same, are not at all the same thing, and we need to speak up against confusing the two for the sake of our trans friends.
The same can be said for “speed” and “velocity”, “mass” and “weight”, “g.p.u.”, and “graphics card”, “working memory” and “r.a.m.”, and so forth.
But most people will live and die without such a distinction ever being relevant in their lives.
Methinks that this distinction plays an important role in your life, but you must realize that it does not in that of most.
The other difference is that of all the other things I mentioned, the distinction is of a very technical and exact nature, whereas “sex” as is common in biology is bereft of a technical definition, and “gender” as is common in psychology even more so. — I initially used the phrase “technical term”, but I am honestly loathe to do so for concepts so poorly defined as either “sex” or “gender” whereof specialists very frequently disagree on wherein to place objects discussed.
> But most people will live and die without such a distinction ever being relevant in their lives.
Lives is the keyword. speed vs velocity is of concern to physicists but mixing up gender and sex has been weaponized as a tool against trans people and because of that, we need to push back.
> Methinks that this distinction plays an important role in your life, but you must realize that it does not in that of most.
Yes, most people don't give a damn about other people, I know; after all, Atlas Shrugged is popular in the United States. This is why the United States is heading toward being a failed state, if it isn't already there. But those who actually care about others recognize how much of a difference they can make by deliberately using two such simple words in their right meaning, and so they do.
> Lives is the keyword. speed vs velocity is of concern to physicists but mixing up gender and sex has been weaponized as a tool against trans people and because of that, we need to push back.
Such can be said about many such words. — various terms that enjoy more præcise nuance in linguistics have very much been used to weaponize against allowing people to speak in their native registers, similar things can be said about religious nuances. So do you also go about correcting people who use the word “Flemish” as most use it, and rather insist that in technical terms, it only refers to a specific group of Dutch dialects rather than standard Dutch as spoken in Belgium?
> Yes, most people don't give a damn about other people, I know, after all, Atlas Shrugged is popular in the United States. This is why the United States is heading to, if not already there, to be a failed state. But those who actually care about others, recognize how much a difference they can make by deliberately using two such simple words in their right meaning and so they do.
Yes, and I would submit that neither do you, but that you simply insist that a special exception be made for the one thing that seems to be important to you, for if you evenly applied this standard throughout your life, and demanded the same corrections elsewhere, you could not generally allow someone to finish a single sentence without demanding several alterations to the words.
In about every sentence vernacularly spoken, words are used that have a more præcise nuance in technical vocabulary, and in many such cases the lack of distinction has been weaponized, for vagueness is indeed the ultimate tool in politics.
No, that is past, we need to do better. Tolerance has its places, but sex and gender not being the same is not controversial any more. It's not like mass vs weight, which is mostly of interest to physicists. We are talking about real people. Truly, there's no place for this any more. We need to do better. Tolerance is necessary to accept people not like us, but erasing their very existence by using certain words is not tolerance.
> the gender-reveal phenomenon pulls off a rousing counter-progressive two-for-one: weapons-grade reinforcement of oppressive gender norms (sorry, feminists!) and blunt-force refusal of the idea that sex assigned at birth does not necessarily equate with gender identity (sorry, trans-rights movement!).
I disagree. We need to do better with tolerance and acceptance.
Preaching these things when it benefits a favored point of view and then rejecting them when it comes to others is actually the definition of weaponizing, and it does more harm and causes everyone to be intolerant.
What exactly do you want me to accept? Transphobia? Make no mistake: when you confound sex and gender, that's what you are fueling. One half of the USA is busy enacting transphobic laws; we must not fuel that agenda. We can not just accept this behavior while singing "tolerance". That's not how this works. Tolerance means we respect every human being equally, and it does not mean that any opinion denying that needs to be tolerated.
You can not just go around and redefine words especially when this tolerance hides transphobia. Again, it just doesn't work this way. These words have a meaning, a very specific meaning at that and I don't think transphobic dogwhistling is something worthy of tolerance. You know, sex, gender, same thing, we can mix it up, wink, wink? Even if you didn't intend it that way, even if you think it's still just ignorance/lack of knowledge, time is up for that. This is why I said the vernacular is now obsolete.
What made it obsolete? Obviously, the GOP did by declaring open season on trans people, especially by denying treatment to trans children. This is sheer evil that will lead to suicidal children -- they might as well just shoot them, same result, just less acceptable (although how much is an open question given how American society has normalized school shootings as well, but now I really digress). These laws fit the very definition of genocide as defined by the convention.
We can not stand for this. We need to fight back. Every day. Every sentence. Every word if need be. We need to show that aside from a far-right fringe, society stands with trans people. It's civil solidarity.
And yes, I asked on /r/trumpsupporters what they think is the difference between society and just living next to each other, and of course the reply is that there's no difference. But there is. We do live in a society. We do care for the next person, and if politicians want to erase trans people, we will not let that happen. We can not let that happen.
Do you have a workplace policy where people put in pronouns right in their chat name? We do that because we want to normalize that when you introduce yourself, your pronouns are a normal, everyday part of that.
Do I need to quote Niemöller ?
First they came for the socialists, and I did not speak out—
Because I was not a socialist.
Then they came for the trade unionists, and I did not speak out—
Because I was not a trade unionist.
Then they came for the Jews, and I did not speak out—
Because I was not a Jew.
Then they came for me—and there was no one left to speak for me.
So we do speak out, because we learned from history what happens when we do not. Especially when all it takes is such a tiny thing as getting the words sex and gender right and using the pronouns a person prefers.
If someone uses the n-slur on a black person, are you just going to preach tolerance of their words or are you going to step up, use your white privilege (most of HN readers are white cis het males) and say "that's not okay"?
They elected an openly nazi person to the highest office of the United States and when he rightly got the boot, they attempted a coup to keep him in power. And one of their chief weapons indeed were -- as they called it -- alternative facts. Yeah, there is a certain part of society which does not want to be part of society and does use words and facts rather arbitrarily. Is this what you want me to accept? Because I will not.
I have not started this, I have not wanted this but if history calls us to task, answer we will. And again, right now, in here, we just need to make sure our speech is precise. Not more. It's a small act but it has serious weight.
One certainly can be tolerant of different people with different ideas, and one can practice what they preach, too.
Having a different personal definition of a word, or a different opinion or thought, does not automatically make someone -phobic, or mean they are "coming for" others. Try to be inclusive and tolerant rather than hostile and divisive and uninclusive.
Please don't perpetuate flamewars on HN, and especially not on classic flamewar topics. It does no good for anyone, and it destroys the curious conversation that this site is supposed to exist for.