>> Rebalancing an imbalanced dataset is common in industry and academicia. You u...

YeGoblynQueenne · on Oct 26, 2019

As to the work being widely discredited, the following is an article that summarises and links to criticisms:

https://greggormattson.com/2017/09/12/tracking-wang-and-kosi...

Some of the criticisms are technical, some are from the point of view of ethics. It would be a grave mistake to discount the ethical concerns, but if you prefer technical explanations there is quite a bit of meat there.

ipsa · on Oct 26, 2019

Thanks! That article has a lot of critique and I also like that the author collected the responses from one of the authors.

But, to me, most of the critiques seem uninformed (not made by ML practicioners) and focus on the ethics (where I agree with the authors: we need solid research into weaponized algorithms and show what is currently possible by ML practicioners, who may use such technology adversarialy, and can look at reclassifying profile pictures to the same degree as we do information about sexuality, religion, or political preference). By my estimation, most of the critiques are by people who find this research to be threatening to them, their friends, and their sexual identity. That may very well be the case, but it also leads people to conclude the scientific study was flawed and that an automated gaydar can't possibly work. Two replications by scientists who took issue with the paper, and lack incentive to fudge the data or metric to dress up their paper, also demonstrated a better than random automated gaydar. These systems work! (And that poses a problem we can now tackle, where before we did not even know this was possible, and the majority in this thread still thinks it is all bunkum).

ipsa · on Oct 26, 2019

Many statistical assumptions are regurarly broken, for pragmatic reasons (it just works better), or because the world is not static (and so the IID assumption is broken). There is an entire subfield of learning on imbalanced datasets, which includes resampling, subsampling, oversampling, and algorithms like SMOTE. It is common to use these techniques to get a better performance, including on unseen out-of-distribution data. Fraud - and CTR - and medical diagnosis models are regurarly rebalanced for other purposes than trying to break assumptions or cheat oneself into a seemingly higher accuracy. Plus, the signal does not dissapear when training only on originally balanced data. These systems do not work by the grace of a rebalancing trick alone, but they may work better (as usually the case with neural nets, which do not even give convergence guarantees: something only a statistician would worry about).

You can switch negative with positive class and my point remains: if the authors wanted the fraudulenty hack the accuracy score, this is way easier with imbalanced data. AUC metric robust to class imbalance anyway: ranking won't change for unseen data out of distribution, you can just adjust the threshold to match it.

I'd say an academic source is necessary in this case, because you implicitly accuse these scientists of doing shoddy hyped up work, with fudging tricks to appear more accurate. I need more than popular media sources or previous HN discussions to admit this paper was "widely discredited".

Your Yuri Geller example is a red herring: one is a stage magician, the other is peer-reviewed science. But to oblige: https://scholar.google.com/scholar?q="yuri+geller"

YeGoblynQueenne · on Oct 26, 2019

Yes, of course many theoretical assumptions are broken- but that is because people who break them either ignore them completely, or deliberately voilate them in order to produce better-looking results. That is more common in industry where it's easier to pull the wool over the eys of senior colleagues, but it's not unheard of in academia, quite the contrary. Anyway, just because people do shoddy work and then report impressive results doesn't mean that we should accept poor methodology as if it was good.

In particular about the gaydar paper, the authors cook up their data to get good results and then use those results to claim that they have found evidence for an actual natural phenomenon (hormones influencing haircuts etc). That's just ...pseudoscience.

Is your google scholar link humour?

ipsa · on Oct 26, 2019

You seem to be under the assumption that rebalancing is always bad or ignorant. That techniques, such as SMOTE, are only used to produce better-looking results and pull the wool over someones eyes. This is simply not true. Rebalancing is not shoddy, but accepted practice. It is certainly fair to question it, but not to draw the conclusion of fraud or shoddy science (without making you look pretty silly).

Again, I do not think rebalancing data justifies the conclusion that the authors were cooking up their data to report better results. Take a step back and assume good faith: could there be any other reasons to resample data, other than wanting to commit fraud?

The Google scholar links includes 10+ cited and peer-reviewed papers on the Yuri Geller drama.

I don't know enough about hormone theory to say anything against or for their conclusion, just focusing on showing that working automated gaydars that perform better than average/random guessing exist and have been scientifically demonstrated. I can agree with you on that the connection is spurious, without dropping my point that this controversial technology actually works (rebalanced or no).