"Given a single facial image, a classifier could correctly distinguish between gay and heterosexual men in 81% of cases, and in 71% of cases for women"
I can guess if someone is gay or straight with 98% accuracy - just always guess they are straight.
The ratio of gay to straight in the test data set is perhaps the most important part in determining how well the algorithm actually performs.
Are we supposed to assume the computer was shown a 50/50 split of gay and hetero people? It appears that way. But please, tell us.
If the test data does not have a 50/50 split (or something around that ratio), the headline is straight up lies.
Skewed test data is the most common problem with research. It reminds me of this great video by Veritasium (Is most published research wrong?): https://www.youtube.com/watch?v=42QuXLucH3Q
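To put rough numbers on the base-rate point above, here is a minimal Python sketch. The 2% figure is only what the "98% accuracy" remark implies, and the assumption that the classifier is right 81% of the time on both groups is hypothetical, made up for illustration and not taken from the paper. The function names are mine.

```python
def always_straight_accuracy(gay_fraction):
    """Accuracy of a 'classifier' that labels everyone as straight."""
    return 1.0 - gay_fraction


def overall_accuracy(acc_on_gay, acc_on_straight, gay_fraction):
    """Overall accuracy as a weighted average of the per-group accuracies."""
    return acc_on_gay * gay_fraction + acc_on_straight * (1.0 - gay_fraction)


# Hypothetical test-set compositions: 2% (implied by the 98% remark) and 50/50.
for gay_fraction in (0.02, 0.5):
    baseline = always_straight_accuracy(gay_fraction)
    # Assumption: the classifier is equally accurate (81%) on both groups.
    model = overall_accuracy(0.81, 0.81, gay_fraction)
    print(f"gay fraction {gay_fraction:.0%}: "
          f"always-straight {baseline:.0%}, classifier {model:.0%}")
```

Under those assumptions, 81% beats the trivial guess only when the test set is close to balanced, which is why the split matters so much for reading the headline.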
From the author's notes: "When presented with a pair of participants, one gay and one straight, the algorithm could correctly distinguish between them 91% of the time for men and 83% of the time for women."
So yes, 50-50.