Bias in data leads to biased probabilities (a toy sketch of this is below). Bias in data reflects biases in the systems the data describes, and in the people describing it.
One could say (and some have said) that the tendency of some facial recognition software not to recognize black faces is not racist - but simply a matter of physics. What's racist is the culture of software engineering which only tests that software on white faces, and doesn't consider it a problem.
Likewise, the 'most likely image' of a nurse being a non-white female, and of a CEO or lawyer being a white male, is still a problem even if it accurately reflects racial and gender biases in occupational roles. These systems will eventually reinforce and maintain power structures and media narratives that are themselves already built upon racial and gender prejudice.
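To make the opening claim concrete, here's a minimal, purely synthetic sketch (the 90/10 split, the feature model, and the numbers are all made up for illustration): a classifier trained on data that underrepresents one group tends to fit the dominant group's decision boundary, so its error concentrates on the underrepresented group.

```python
# Toy illustration (synthetic data): when one group dominates the training
# set, the model's error concentrates on the underrepresented group.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, shift):
    """Two-class data; `shift` moves this group's true class boundary."""
    X = rng.normal(size=(n, 2)) + shift
    y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=n) > 2 * shift).astype(int)
    return X, y

# Group A: 9000 training samples; group B: 1000 (a 90/10 skew).
Xa, ya = make_group(9000, shift=0.0)
Xb, yb = make_group(1000, shift=1.5)

clf = LogisticRegression().fit(np.vstack([Xa, Xb]), np.concatenate([ya, yb]))

# Evaluate on fresh samples from each group: accuracy on group B
# falls well below accuracy on group A.
Xa_t, ya_t = make_group(2000, shift=0.0)
Xb_t, yb_t = make_group(2000, shift=1.5)
print("accuracy on group A:", clf.score(Xa_t, ya_t))
print("accuracy on group B:", clf.score(Xb_t, yb_t))
```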
> What's racist is the culture of software engineering which only tests that software on white faces, and doesn't consider it a problem.
That's a pretty cynical take. Every project I've ever worked on, regardless of the problem domain, operated with the 80/20 rule in mind. Solve the easy problems first, get something out the door and then work on the harder problems. If dark skin is a harder problem in image recognition, it doesn't mean a developer is racist for solving the easier parts of the problem first.
The racism comes in with the normalization of whiteness as the default: the attitude that treats facial recognition software which only works on white faces as complete enough to ship, rather than as fundamentally broken.
Racism doesn't always come from overt bigotry or hatred. It can be expressed by simply accepting the status quo of systemic bias, because it's less work, or more cost effective, than doing otherwise.
> still a problem even if it accurately reflects racial and gender class role biases in society.
Agreed, but: since the 'culture of software engineering' likely reflects society at large (maybe a step up), how do you construct training models that reflect a more refined and caring reality than the one we've got?
If what they're showing us accurately mirrors what they're seeing, that's a service. If we find it painful, well, we made it that way.
The obvious solution is to make such results part of the short-term goals and to treat failure to meet the criteria as an unacceptable engineering failure.
As simple as no bonuses, no stock options, no promotions, and, most importantly, no deployment if the model produces undesirable results.
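For what it's worth, one hypothetical shape a "no deployment" rule could take is a CI gate that computes accuracy per demographic group on an evaluation set and fails the build when the gap between the best and worst group exceeds a tolerance. Everything below (the metric, the 2% threshold, the function and group names) is illustrative, not any team's actual process.

```python
# Hypothetical release gate: block deployment if per-group accuracy
# diverges beyond a tolerance. Names and threshold are illustrative.
import sys
from typing import Dict, List, Tuple

def per_group_accuracy(results: List[Tuple[str, int, int]]) -> Dict[str, float]:
    """results: one (group, true_label, predicted_label) tuple per evaluation sample."""
    correct: Dict[str, int] = {}
    total: Dict[str, int] = {}
    for group, y_true, y_pred in results:
        correct[group] = correct.get(group, 0) + int(y_true == y_pred)
        total[group] = total.get(group, 0) + 1
    return {g: correct[g] / total[g] for g in total}

def deployment_gate(results: List[Tuple[str, int, int]], max_gap: float = 0.02) -> bool:
    """Pass only if the best and worst group accuracies are within max_gap."""
    acc = per_group_accuracy(results)
    gap = max(acc.values()) - min(acc.values())
    print(f"per-group accuracy: {acc}, gap: {gap:.3f}")
    return gap <= max_gap

if __name__ == "__main__":
    # In CI this would load the model's real evaluation results; here it's a stub.
    eval_results = [("group_a", 1, 1), ("group_a", 0, 0),
                    ("group_b", 1, 0), ("group_b", 0, 0)]
    sys.exit(0 if deployment_gate(eval_results) else 1)
```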
The models reflect what AI ideology directs Amazon Mechanical Turk workers to find. They reflect what is considered accurate, and they do not reflect classifications that might be perceived to put Amazon Mechanical Turk’s contracts at risk.
“We” didn’t make it that way because I know I am not using Amazon Mechanical Turk to classify images.
I mean, even if the exploitative wage structure went away, the name itself is consistent with an AI ideology in which racism, religious intolerance, and nationalism are OK.