Phrenology was bad because it was scientifically invalid. I don't see any problem with predicting traits from appearances, at least not intrinsically, as long as the data is accurate.
> Phrenology was bad because it was scientifically invalid.
Well, that's not the full story. It was scientifically invalid because it modeled mental traits based on the shape of the skull, and that's the important part. If you trained a model to predict traits from accurate data about skull shape and traits, I'm confident the algorithm would happily do that. Phrenology would still be bad regardless of what technical methods are used to do it.
What is your argument that it's still bad? That was kind of a non sequitur. If we have a model that takes in skull measurements and accurately predicts whether you are an introvert or an extrovert...what is the problem with that?
There will be no accurate data for this because traits are subjective. And unlike other subjective predictive modeling, like sentiment analysis, I can't think of a plausible use case for this that is not deeply disturbing.
I really don't want my field to regress towards 19th-century pseudoscience when we can empower other scientific advancements that will yield immediate benefits.
> There will be no accurate data for this because traits are subjective. And unlike other subjective predictive modeling, like sentiment analysis, I can't think of a plausible use case for this that is not deeply disturbing.
This is a strange point. Are personality types subjective? Is IQ? Maybe they are, but we still sometimes find them useful to quantify. If it's ok to predict these things from your answers to questions...why isn't it ok to predict them from pictures of your face?
Ignoring the (many) arguments that IQ is not really a great marker for intelligence, there's basically no value in trying to predict IQ from facial structure. Unless it is quite literally 100% accurate (which it is not), you will end up mispredicting people's "intelligence". Even if those errors are randomly distributed (and they probably won't be), you'll be treating people unfairly for no reason.
So yes, I think I can state pretty concretely that there's no value in trying to predict IQ from face shape.
> Ignoring the (many) arguments that IQ is not really a great marker for intelligence, there's basically no value in trying to predict IQ from facial structure. Unless it is quite literally 100% accurate (which it is not), you will end up mispredicting people's "intelligence". Even if those errors are randomly distributed (and they probably won't be), you'll be treating people unfairly for no reason.
The presence of error in a prediction makes that prediction valueless?
When doing anything prediction-y, you need to take the relative value of false positives and negatives into account.
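To make that concrete, here's a toy expected-cost calculation. Every number in it is invented purely for illustration, but it shows how the same error rates can be acceptable or unacceptable depending on the relative cost of each kind of mistake:

    # Toy expected-cost comparison. All rates and costs are made up for illustration.
    def expected_cost(fp_rate, fn_rate, cost_fp, cost_fn):
        """Expected per-decision cost for a classifier with the given error rates."""
        return fp_rate * cost_fp + fn_rate * cost_fn

    # Scenario A: a false positive just means an extra manual review.
    print(expected_cost(fp_rate=0.10, fn_rate=0.05, cost_fp=1, cost_fn=20))    # 1.1
    # Scenario B: a false positive means someone is wrongly denied something.
    print(expected_cost(fp_rate=0.10, fn_rate=0.05, cost_fp=100, cost_fn=20))  # 11.0

Same model, same error rates; whether it's worth deploying depends entirely on those costs.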
Can you describe a situation where the harm caused to someone by mis-classifying their IQ is outweighed by the increased efficiency from...being able to more quickly identify someone's IQ? Like, can you even describe a situation where this will be used ethically at all? What use is there for predicting someone's IQ with dubious accuracy, assuming you already have their permission?
Sure. But you also need to account for the actual baseline you're comparing to. Whatever process it might be that's taking someone's IQ as an input is right now being handled in some subjective manner by other humans. Do you think their error rate is zero? These things don't need to be perfect to be good. They just need to be better than what we're doing now. And as per all the recent unrest, it turns out that what we're doing now isn't all that great.
> Whatever process it might be that's taking someone's IQ as an input is right now being handled in some subjective manner by other humans.
Yes.
> Do you think their error rate is zero?
Well, it depends. Do I think IQ is a flawed predictor of underlying intelligence? Yes.
Do I think IQ tests are a flawed measure of IQ itself? Still, often, yes.
> These things don't need to be perfect to be good.
To my point above, I'm not a fan of IQ in general. Psychology researchers keep finding new and interesting ways that IQ tests are socially, culturally, and environmentally influenced, and that the exams may not be fair.
If your claim is that you can somehow train an algorithm to look at someone's face and evaluate their underlying intelligence better than the tools we have today, honestly "delusional" is the first word that comes to mind. That's just not something that currently available tools can do. And thought experiments about a "perfect" classifier aren't interesting when we're discussing applied ethics.
And that comes with the huge, enormous, gigantic asterisk that you can even measure "intelligence". Like, it comes with the caveat that you can even reasonably define intelligence. We generally measure intelligence as correlated with success at some metric: math problems, visual puzzles, chess, reading comprehension, whatever. If you're contending that a hypothetical model could predict underlying intelligence better than the measures we have today, how would we even know? When training a model, you have an evaluation function.
If the evaluation function is flawed (which is essentially your contention, and which I absolutely agree with), the trained model will exhibit biases that reflect the flaws of the evaluation function. An ML model isn't suddenly going to solve the cultural issues we have with measuring IQ. It will encode the same biases that society does, because the model will try as hard as it can to do exactly what the human proctors would have done.
These are exactly the kinds of broader ethical questions that Timnit points out we need to worry about.
IQ isn't the point, though. I'm not a huge fan of IQ specifically either. Your original objection was that detecting things from faces is intrinsically bad; these objections are about the quality of the measures. I'm not going to defend IQ as a useful measure, because it has a ton of problems and that's not my point here.
> If the evaluation function is flawed (which is essentially your contention, and which I absolutely agree with), the trained model will exhibit biases that reflect the flaws of the evaluation function. An ML model isn't suddenly going to solve the cultural issues we have with measuring IQ. It will encode the same biases that society does, because the model will try as hard as it can to do exactly what the human proctors would have done.
Of course. But so what? There are people using IQ now for things. An ML model isn't going to magically make those biases worse, either. What it's going to do is bring them to the surface, so that they are quantifiable and we can actually do something about them.
ML is the solution to the bias problem. Right now these evaluations are being made by other humans. Humans who we cannot statistically debias. Humans whose biases we can't even effectively interrogate. The reason people are making all these memes about biased AI is not that it's more biased than humans; it's that the bias is more measurable.
> An ML model isn't going to magically make those biases worse, either.
Well, actually, research shows that unless great care is taken it absolutely can. If you include, for example, race as a factor in a model, it can learn non-causal correlations between race and whatever the objective is. This can have a compounding effect in some cases. [0]
> What it's going to do is bring them to the surface, so that they are quantifiable and we can actually do something about them.
I don't follow this. If we have a biased objective function, the model won't surface any biases we weren't already cognizant of in the objective function. And they were already quantifiable: we had a function that we were using to evaluate the model. We could use that same function on whatever non-model evaluation we were doing.
> ML is the solution to the bias problem.
This is basically directly in contradiction to what leading experts on the subject say. ML cannot fix bias in human systems, unless we presuppose that those systems are biased, in which case we can often address the bias in the human systems directly without ML.
> Humans who we cannot statistically debias. Humans whose biases we can't even effectively interrogate.
You can still have decisions be made by objective expert systems without complex ML. If you want to learn someone's IQ, the best way is to debias the IQ test, not to try and infer it from their face bones.
> it's that the bias is more measurable
If we can measure the bias in the output of an ML model, we can equivalently measure the bias in the output of a human system. You're presupposing the existence of some unbiased objective function which we don't have, and that's at the core of the issue.
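To spell that out: the audit is the same code either way. Here's a rough sketch (the groups and decision records below are hypothetical placeholders) of a demographic-parity check that runs identically on a model's outputs and on logged human decisions:

    # Rough sketch: the same bias audit applies whether the decisions came from
    # a model or from humans. Groups and decision records are hypothetical.
    from collections import defaultdict

    def selection_rates(decisions):
        """decisions: iterable of (group, accepted) pairs -> per-group acceptance rate."""
        totals, accepted = defaultdict(int), defaultdict(int)
        for group, ok in decisions:
            totals[group] += 1
            accepted[group] += int(ok)
        return {g: accepted[g] / totals[g] for g in totals}

    def demographic_parity_gap(decisions):
        rates = selection_rates(decisions)
        return max(rates.values()) - min(rates.values())

    model_decisions = [("A", True), ("A", False), ("B", False), ("B", False)]  # from a model
    human_decisions = [("A", True), ("A", True), ("B", True), ("B", False)]    # from people
    print(demographic_parity_gap(model_decisions))  # 0.5
    print(demographic_parity_gap(human_decisions))  # 0.5

Nothing about that measurement requires the decision-maker to be a model.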
[0]: https://www.wired.com/story/ideas-joi-ito-insurance-algorith... has a few good examples here, like how naive bail and sentencing models encode racial bias that isn't present in humans. And to be clear, the response here shouldn't be "well, let's just build better models" but "why do we think a model will improve the situation here at all?" Removing agency from judges has historically been bad for the average person convicted of a crime. This doesn't mean that individual judges can't make terrible rulings, but that the alternatives are usually worse on the whole.
> I don't follow this. If we have a biased objective function, the model won't surface any biases we weren't already cognizant of in the objective function. And they were already quantifiable: we had a function that we were using to evaluate the model. We could use that same function on whatever non-model evaluation we were doing.
We can actually follow the logic of the model. For instance, you can theoretically de-bias a dataset by building a racial classifier from it. What you need is an objective test for the presence of racial information, and that's easy to obtain: Build a classifier to explicitly predict race from your feature set. Train an adversarial model to reconstruct your dataset with maximum fidelity, subject to the constraint that race can no longer be predicted from it.
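Concretely, the kind of thing I mean looks something like this. It's only a sketch: X and g below are random stand-ins for a feature matrix and group labels, and I'm using a gradient-reversal layer as one way of implementing the adversarial constraint (alternating the two objectives would work too):

    # Sketch: reconstruct the dataset with high fidelity while an adversary tries
    # (and, if training succeeds, fails) to predict the group from the
    # reconstruction. X and g are hypothetical stand-ins.
    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x):
            return x.clone()
        @staticmethod
        def backward(ctx, grad):
            return -grad  # flip gradients so the reconstructor fights the adversary

    n, d, n_groups = 512, 16, 2
    X = torch.randn(n, d)                 # hypothetical feature matrix
    g = torch.randint(0, n_groups, (n,))  # hypothetical group labels

    recon = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, d))
    adversary = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, n_groups))
    opt = torch.optim.Adam(list(recon.parameters()) + list(adversary.parameters()), lr=1e-3)
    mse, xent = nn.MSELoss(), nn.CrossEntropyLoss()

    for step in range(2000):
        X_hat = recon(X)
        # The adversary predicts the group from the reconstruction; gradient
        # reversal makes the reconstructor actively remove whatever it uses.
        group_logits = adversary(GradReverse.apply(X_hat))
        loss = mse(X_hat, X) + xent(group_logits, g)
        opt.zero_grad()
        loss.backward()
        opt.step()

If it converges the way you want, X_hat is a maximum-fidelity copy of X from which the group can no longer be predicted. Whether that holds up against an independent, differently-architected classifier is an empirical question.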
> This is basically directly in contradiction to what leading experts on the subject say. ML cannot fix bias in human systems, unless we presuppose that those systems are biased, in which case we can often address the bias in the human systems directly without ML.
These experts are just wrong, then. Naive ML won't fix bias in human systems, but that doesn't mean we can't use ML to fix it, if we do so thoughtfully.
> You can still have decisions be made by objective expert systems without complex ML. If you want to learn someone's IQ, the best way is to debias the IQ test, not to try and infer it from their face bones.
Sure, but there are a lot of things that we don't do in the best possible way because it's too expensive. There are lots of use cases for cheap, scalable, low precision models.
> If we can measure the bias in the output of an ML model, we can equivalently measure the bias in the output of a human system. You're presupposing the existence of some unbiased objective function which we don't have, and that's at the core of the issue.
Right, but we cannot fix the bias in a human. And humans are heterogeneous and inconsistent. The same person may be more or less biased on different days. The ML model is consistent, and we can incrementally improve its bias in tangible and testable ways. The same is not true of humans.
And what does this get you? Let's look at a face recognition dataset. What happens when you debias it? Is it still useful? No. Because the faces no longer resemble real faces.
> These experts are just wrong, then
Perhaps, but you aren't making a strong case for that.
> There are lots of use cases for cheap, scalable, low precision models.
That involve facial recognition?
> Right, but we cannot fix the bias in a human
We don't need to. We just need to fix the bias in the system. And we absolutely can incrementally reduce bias in systems that involve humans.
> And what does this get you? Let's look at a face recognition dataset. What happens when you debias it? Is it still useful? No. Because the faces no longer resemble real faces.
Not to you. But you can remove the racial information without destroying all the information that a model can detect.
But when racial information is correlated with the output, decorrelating from race destroys the input. This is most obvious with a face dataset, but it's true of anything race-correlated: credit scores, where you live, etc. If you're willing to destroy the training data so it no longer resembles real-world information, you might as well just not use it in the first place.
That's what the ethicists say: don't use facial recognition models. Don't work on them. Don't research them. They cannot be both unbiased and useful. And in general, there are few to no uses that are ethical, period.
Well, the ethicists just don't understand the models, then. For instance, there are a bunch of measurements you could take of faces to identify people if you were doing it manually. Things like pupillary distance, canthal tilt, nose width, etc.
Some of these correlate with race. But only part of the information correlates with race, not all of it. It is, in principle, possible to remove the information that identifies race without destroying the information that identifies the individual. It is true that part of an individual's essential characteristics are their racial characteristics, but it is not true that the only way to identify an individual is their racial characteristics. For instance, there is no way that I'm aware of to infer race from fingerprints, but you can absolutely identify a person by their fingerprints.

So, the question is: can we extract a facial fingerprint that identifies a person, but not their race? I think the answer is almost certainly yes, and it is going to come down to a clever model design. Essentially it would look like a GAN where the adversarial component is constantly trying to predict race, while the generative component is trying to trick the race classifier without tricking the person-identifier.
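To sketch what that could look like (everything here is a placeholder: random "faces", made-up identity and race labels, and a gradient-reversal layer standing in for the adversarial game):

    # Sketch of a "facial fingerprint" encoder: the embedding should identify the
    # person while an adversary fails to recover race from it. All data below is
    # random placeholder data.
    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x):
            return x.clone()
        @staticmethod
        def backward(ctx, grad):
            return -grad  # encoder sees reversed gradients from the race head

    n, d, n_ids, n_races, emb = 1024, 128, 200, 4, 64
    faces = torch.randn(n, d)                 # stand-in for face measurements
    y_id = torch.randint(0, n_ids, (n,))      # hypothetical identity labels
    y_race = torch.randint(0, n_races, (n,))  # hypothetical race labels

    encoder = nn.Sequential(nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, emb))
    id_head = nn.Linear(emb, n_ids)           # this head should succeed
    race_head = nn.Linear(emb, n_races)       # this head should end up at chance

    params = list(encoder.parameters()) + list(id_head.parameters()) + list(race_head.parameters())
    opt = torch.optim.Adam(params, lr=1e-3)
    xent = nn.CrossEntropyLoss()

    for step in range(2000):
        z = encoder(faces)
        id_loss = xent(id_head(z), y_id)                           # keep identity in z
        race_loss = xent(race_head(GradReverse.apply(z)), y_race)  # push race out of z
        loss = id_loss + race_loss
        opt.zero_grad()
        loss.backward()
        opt.step()

If this works, z is the "fingerprint": enough to tell people apart, while a race classifier trained on it does no better than chance.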
> Well, the ethicists just don't understand the models, then. For instance, there are a bunch of measurements you could take of faces to identify people if you were doing it manually. Things like pupillary distance, canthal tilt, nose width, etc.
Or perhaps they understand that this won't work in practice.
I know. My point is not about IQ. My point is that people find IQ to be a useful measure of things in certain situations. We can use something else, like the Big Five personality inventory, if you like. The point is just that subjective characteristics of a person are things that we quantify and sometimes care about measuring.
And the times when we find ourselves wanting to quantify and measure them in order to make decisions that impact people's lives are precisely when we need to be extremely careful, because they are subjective traits.
Of course we should be careful. But we should also keep in mind what the baseline is. The baseline is subjective evaluations by other humans. If we don't build models, it's not like we're going to live in a world where nobody's subjective personality traits are evaluated. They still will be. They'll just be evaluated by other humans in an opaque, un-monitorable way. Using models at least makes it transparent, and something that we can iterate on to remove bias. There are no effective ways I'm aware of to debias humans.
You have it exactly backwards, though. It's humans that obstruct accountability, because they are opaque and inconsistent. ML is consistent and inspectable. It cannot lie to you about its motivations. It is infinitely more accountable than any human system ever will be.
There are lots of things about appearance that are deliberate signaling. It's all subjective, but it's not meaningless or pseudoscience to read those signals.