A lot of what Sedghi is saying about definitions of fairness echoes my own research regarding how different academics define algorithmic bias. There is a large class of academics who would say that algorithmic bias is simply a data problem, but I believe that if we ignore the societal element, we fail to properly account for many examples of algorithmic bias.
From the article: "What this means is that, we calibrate classifiers parameters such that it has the same acceptance ratio for all subgroups of sensitive features, e.g. race, sex, etc."
But statistics such as crime rates and default rates are NOT the same across race and sex. Is she saying that models should be fiddled with to make the same average prediction by race, sex, and other demographic variables?
No, she's saying that if the input data is itself skewed, then one method of fixing the model you generate is to apply corrections to the calibrating parameters.
Imagine we have a 50/50 population of purple and orange people. We know that purple offenders are caught twice as often for a certain crime, and that 10 purple offenders and 5 orange offenders are caught. If we train on this data and our model predicts that purple people offend at a substantially higher rate, then we know our model is wrong - it isn't taking into account the skew due to differential enforcement.
How do you fix this? She gives 3 methods. 1) Change the params to normalize, 2) resample and get better data, 3) use causal reasoning and other data to plug in the holes.
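A minimal sketch of method (1), adjusting for a known enforcement skew. The catch rates below are my own assumptions, chosen to match the purple/orange numbers above; they are not from the article.

```python
# Sketch of correcting observed arrest counts for a known differential-enforcement ratio.
observed_arrests = {"purple": 10, "orange": 5}
catch_rate = {"purple": 0.2, "orange": 0.1}  # assumed: purple offenders caught twice as often

# Estimated true offender counts = observed arrests / probability of being caught
estimated_offenders = {g: observed_arrests[g] / catch_rate[g] for g in observed_arrests}
print(estimated_offenders)  # {'purple': 50.0, 'orange': 50.0} -> equal underlying offending
```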
The data[0] shows that the race distribution of the assailant in crime reports is about the same as the race distribution of arrestees. There's no support for the differential enforcement hypothesis.
> The data[0] shows that the race distribution of the assailant in crime reports is about the same as the race distribution of arrestees. There's no support for the differential enforcement hypothesis.
Unless the reports are also influenced by race, such as a police officer not stopping a white criminal because he/she is white while over-inspecting non-white people.
Exactly. Let's say cops stop 10% of blacks for "looking suspicious" (aka, being black), but only 1% of whites.
Now say that 10% of both whites and blacks are criminals, that they always get identified if stopped and are eventually arrested and sentenced, and that the population is evenly split between blacks and whites.
(1) Criminal prevalence: 50% B, 50% W
(2) Police stops: 91% B, 9% W
(3) Criminals caught (reports): 91% B, 9% W
(4) Incarceration: 91% B, 9% W
Now remember that only (3) and (4) are available to the public, so one might look and say that the problem isn't that blacks are incarcerated differently, but that blacks commit more crimes, when they don't.
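A quick sketch of the arithmetic behind those percentages, assuming equal populations, a 10% criminal rate in both groups, stop rates of 10% vs 1%, and stops independent of actual guilt:

```python
# Reproducing the 91%/9% figures from the example above.
pop = {"B": 100_000, "W": 100_000}
stop_rate = {"B": 0.10, "W": 0.01}
criminal_rate = 0.10

stops = {g: pop[g] * stop_rate[g] for g in pop}      # B: 10,000   W: 1,000
caught = {g: stops[g] * criminal_rate for g in pop}  # B:  1,000   W:   100

total_stops, total_caught = sum(stops.values()), sum(caught.values())
print({g: round(100 * stops[g] / total_stops) for g in pop})    # {'B': 91, 'W': 9}
print({g: round(100 * caught[g] / total_caught) for g in pop})  # {'B': 91, 'W': 9}
```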
But in reality not all feature dimensions are orthogonal. You can't make the same 'adjustment' for every potential trait bias. Correcting for the purple/orange thing might skew the data on square and round heads, and correcting for that might skew it on pointy and rounded ears, and so on and on.
The intention is laudable, but many a road to hell is paved with good intentions.
So they have to balance the data on all sub-combinations of features ("orange round head with pointy ears" vs "purple square head with rounded ears", etc.), which is exponential in the number of features. The curse of dimensionality.
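A toy illustration of that blow-up, with made-up traits: the number of subgroup cells you would need to balance multiplies with each extra feature.

```python
# Number of subgroup cells grows as 2**n for n binary traits.
from itertools import product

traits = {
    "color": ["purple", "orange"],
    "head": ["square", "round"],
    "ears": ["pointy", "rounded"],
}
cells = list(product(*traits.values()))
print(len(cells))  # 8 cells for 3 binary traits; each needs enough balanced data
```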
How do you quantify "skew due to differential enforcement"? The proposal is to posit some value for it, then massage the results to match that posited value, which itself has no empirical basis.
Isn't that the logical consequence of having laws that forbid disparate impact on protected classes?
One could also argue that a person's attributes that are not the result of choice should not be used to assess them, because they reflect historical and environmental influences rather than personal ones.
It may not be “fair” in some cosmic sense, but traits that you don’t have control over are important for lots of things you want to assess. For example, should modeling agencies not take into account physical appearance?
Additionally, this extends to the usage of latent or inferred variables that behave as a proxy for a protected class, so neural networks and other discriminative approaches outside of a distinctly diagnosable decision tree could be in a rather grey legal area if used for credit scores, etc.
It's possible that the upshot of a legal challenge could be a crippling cap on the accuracy of models as they explicitly avoid any information derived from any protected factors.
A person is not a statistic, and that's exactly the crux of this issue. It is a human who ultimately decides what features the classifier has access to. If you decide to include race, nationality and sex, the classifier will use them and may predict that a person of race X has probability Y of committing a crime. So now when you go into the field and run your classifier on all people of race X, it will say ALL of them are very likely criminals, even if a given person is a professor at Harvard, has won a Nobel Peace Prize, is a charismatic leader, a philanthropist, or just a small restaurant owner.
The first issue is that humans are about the most complex objects we know of in the universe, and trying to make predictions about their behaviour and long-term fate would require complex and very detailed data on each of them, if it is possible at all. Trying to predict behaviour based just on age, race and sex is mind-bogglingly dumb, unjust and irresponsible. Trying to predict criminal behaviour is itself unethical, because it may mark a large number of people as future culprits without any evidence, even if they are perfectly normal people trying to care for their families and live their lives peacefully.
In my view, any classifier built for predicting human behaviour should not use attributes of a person that they have no control over whatsoever, such as race, age and sex. Doing so is usually very tempting because your classifier might start giving better accuracy on the test set, but one always needs to think about the cost of false positives. Each false positive ruins an innocent, hopeful life without any wrongdoing.
False positives in classification or prediction are in practice unavoidable with any probabilistic technique for any non-trivial task. Including features where they are forbidden by regulation is obviously not done. The problem is that your techniques can potentially derive those features anyway from correlated proxies.
Metaphysically, you are using a probabilistic inference technique to derive a given known-unknown predictor from the unknown-knowns in your data. You can try to normalize for known-knowns, blurring the biases you don't want to account for, but you can't control for unknowns.
In some cases it is easy to detect bias in the model, namely when there is discrete data for the feature you are trying to control. E.g. if your data has an explicit 'gender' field, then you can test how strongly the prediction outcome is correlated with the value of that field. If the data does not have such a feature but derives it implicitly through e.g. food purchase preferences, the bias is much harder to detect. You can try to control for it if for whatever reason you deem the model not in line with your bias preferences, but due to the implicitness it will be very hard to isolate the impacts.
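A rough sketch of the explicit-field check described above; the column names and toy data are purely illustrative.

```python
# Compare the positive-prediction rate across values of a protected column.
import pandas as pd

def acceptance_rates(df: pd.DataFrame, pred_col: str, group_col: str) -> pd.Series:
    """Positive-prediction rate per group; large gaps suggest the predictions
    are correlated with the protected field."""
    return df.groupby(group_col)[pred_col].mean()

toy = pd.DataFrame({
    "gender": ["f", "m", "f", "m", "f", "m"],
    "prediction": [1, 1, 0, 1, 0, 1],
})
print(acceptance_rates(toy, "prediction", "gender"))
```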
I feel you will never get a model satisfactory to all, as in the end what is and what isn't accepted as 'neutral-data' is not a scientific but a political statement. There is nothing wrong with that. All our ethical choices are political. Just don't expect universal acceptance.
If the classifier is being used for medical diagnosis, race, age and sex may be important factors. Ignoring them would amount to systematically ruining innocent hopeful life without any wrongdoing.
The problem usually isn't injustice caused by model errors, because humans make unjust errors all the time; using a model that's better than humans is still a win even if the model isn't perfect. Likewise for medical diagnosis, if a model is better than the average doctor, it should replace them.
The actual problem is that humans mistake the model prediction for something it is not. E.g. if you train a model on the data of people accused of a crime to predict whether they will be convicted, you're modeling P(convicted | accused). Even if you train this model to perfection (zero test errors), you would at best reproduce the existing legal system; which is a win if the model is cheaper. But people tend to mistake the model output for P(should be convicted | accused), which assumes that the legal system is perfect. Or worse, they think of it as P(should be convicted), without accusation.
The solution to your "go in the field and run your classifier on everyone" thought experiment is not to hide race, sex and age from the model. If the model had no access to any features at all, it would just predict the baseline conviction rate, much higher than the baseline crime rate. Using the model output as evidence of a crime would be just as silly, because that's not what the model was trained to do.
The solution is to not even try to predict something for which you don't have any data. Modeling can never make a process more accurate, it can only make an accurate process faster and cheaper (and slightly less accurate). Sometimes the accurate process is to wait for a few years and see what happens, e.g. for cancer mortality or recidivism rates. Sometimes the accurate process is to ask a lot of experts.
Once you have a model of the accurate process that is better than the cheaper process you'd otherwise use, you can replace that cheaper process with the model. Instead of a single human judge or a jury, have a model trained on the decisions of multiple judges and juries deliberating much longer than usual. Instead of a single doctor, have a model that has seen the medical histories of thousands of cancer patients and knows exactly which treatments worked.
Attempting to censor nodes in a classifier’s Bayesian network will strictly result in some combination of A) the classifier routing around the damage and B) the classifier producing worse results. Either way, we’ve accomplished nothing. The linked paper, on the other hand, still allows the classifier to be accurate without giving incorrect results across protected categories.
The paper was beyond my level of stats-comprehension, so I couldn't figure out the answer to this question while looking it over. How does their methodology get around the 'redundant encodings' problem?
I am going to assume that by 'redundant encoding' you mean a model that takes some non-racial feature, like living in an urban area, and uses it to predict something that differs greatly across races, say whether or not your loan is approved.
"Definition 2.1 (Equalized odds). We say that a predictor Yhat satisfies equalized odds with respect
to protected attribute A and outcome Y , if Yhat and A are independent conditional on Y ."
This is from page 3. Yhat is the model trained using A (the protected class) and Y (the target outcome).
Do you see how if this definition holds there can be absolutely no redundant encoding?
Yes that helped quite a bit. Looking over that section, I thought this summarized it quite well:
"For the outcome y = 1, the constraint requires that Yhat has equal true positive rates across the two demographics A = 0 and A = 1. For y = 0, the constraint equalizes false positive rates."
I thought more about your question (at least what I thought it was) and it wouldn't necessarily prevent redundant encoding but it would sort of restrict how 'damaging' such an encoding could be (if that makes sense).
This whole field is very new but very exciting and very troubling.
Like, what is fairness really? It's an intersection of philosophy/ethics and very un-intuitive mathematics... there are many open questions.
I feel like the implications of what she was saying are a little vague, so I'm not sure what she meant here.
I think the main point is that this is still early days for defining a mathematical measure for 'fairness'.
Wouldn't it be better to have a model that could tell whether an individual will be more susceptible to defaulting or crime based on their context, and not strictly because of their gender or race? I think that is the point here, because otherwise it would be a biased prediction.
I don't understand what you are proposing here. The way things are done right now is that we define some specific features, which can include things like age, height, eye color, skin color, default history, average time a person wakes up, whether they prefer pancakes or waffles, income, and pretty much anything else you come up with. Then we obtain a training set containing instances of specific values of those features paired with the output variable we want to predict. Then we pick a model and train it using this data. After training, we can test the model on some new data and measure its accuracy: how often it is wrong, what the direction of the error is (i.e. bias), and, when it's wrong, by how much (variance), etc.
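A hedged sketch of that workflow using scikit-learn; the synthetic features and outcome are made up purely to show the shape of the pipeline.

```python
# Synthetic features -> train/test split -> fit a model -> measure held-out accuracy.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))  # stand-ins for age, income, default history, ...
y = (X[:, 1] + rng.normal(size=1000) > 0).astype(int)  # the output variable to predict

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)
print(accuracy_score(y_test, model.predict(X_test)))  # how often the model is right
```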
Some of the variables might turn out to be more predictive, and some variables probably quite a bit less. Some variables might turn out to be predictive even though one would guess they shouldn't. For example, people with brown eyes are more likely to be victims of crime than people with blue eyes. It might seem like a mystery, until you realize that black people overwhelmingly have brown eyes, and they are more likely to be a victim of crime than white people. Here, removing race as a variable will only make you and your model more confused.
It is important to realize that removing a predictive variable will not necessarily make your model less "biased"; in fact, it is likely to make your model wrong more often than before, and therefore less useful. The difficulty here stems from the fact that laws often require you to make "non-discriminatory" decisions, and if you attempt to do that, for example by introducing a bias to counteract the bias you get from reality, your model will be less predictive and so less useful.
Fairness in general is a very complicated issue. Suppose our model tells us that blue agents are much more likely to default on a loan, and this is in fact the underlying reality. The creditor, to maximize his profit, should therefore charge blue agents higher interest than red agents. If you are a blue agent, this is obviously unfair to you: it's not your fault that some other agents who share a similar characteristic default more often. However, if we have laws forbidding discrimination based on an agent's color, the creditor must then charge agents of both colors equally, and since one group is more likely to default, this interest rate will be higher than what red agents would pay if the anti-discrimination laws weren't in place. Isn't it unfair towards red agents then? After all, it's not their fault that some other agents who share some characteristic default more often...
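A toy version of that arithmetic, with made-up default rates and break-even pricing (interest roughly equal to expected loss, ignoring profit margins and recovery):

```python
# Separate vs pooled pricing for two groups with different default rates.
default_rate = {"blue": 0.10, "red": 0.02}
share = {"blue": 0.5, "red": 0.5}

separate = dict(default_rate)  # each group priced on its own default risk
pooled = sum(share[g] * default_rate[g] for g in default_rate)

print(separate)  # blue pays ~10%, red pays ~2%
print(pooled)    # both pay ~6%: blue pays less than its risk, red pays more
```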
>>Suppose our model tells us that blue agents are much more likely to default on a loan, and this is in fact the underlying reality.
That's a cute thought experiment. Unfortunately, it is assuming human behaviors are caused by skin color. I do not believe that is 'the underlying reality'.
The experiment doesn't mention humans, nor does it talk about anything being "caused" by skin color -- agents don't have skin, for that matter; they have steel armor. Talking about humans here is counterproductive, because people tend to get quite emotional and irrational when talking about these topics, so it is good to focus on abstract examples.
You might ask why the blue agents have a higher default rate, and this might provoke an interesting discussion, but it concerns the creditor only insofar as it affects the interest rate they need to charge to make money. The creditor does not care whether there is some "fair" or "unfair" reason why blue agents default more often; they just want to make money by focusing on what they know best, which is giving loans and collecting interest.
Yet on average, it might be the case that if you picked a blue agent with a loan, that loan would be more likely to default than a random loan from the other agents.
I know correlation != causation, but we're using imprecise language in which these mathematical distinctions get mired and muddled. People don't think like statisticians; they think like humans who label things and form biases based on these averages without looking at the individuals. If we build these intricate, statistically minded models for prediction, are we improving our predictive ability or are we just regurgitating the same flawed biases that we ourselves hold? (When I say flawed, I mean unfair.)
What if blue agents were the victims of institutional racism for hundreds of years? Obviously, their lack of opportunities will have hampered their ability to earn high credit scores and obtain legitimate forms of income. What then? It's as if we're building models that measure agents who exist in two separate contexts and predict which one belongs to the 'socially acceptable' context, i.e. which one belongs to the class of agents who were the perpetrators of social injustice for centuries. Is that really what we want to predict? No, we're trying to predict how trustworthy an individual is. This might mean our standard techniques need reassessment if we're going to build models that affect the lives of future generations.
I don't think banks are trying to assess trustworthiness. I think they are trying to assess how likely they are to make a profit.
What happens if more people of a group more likely to default take out loans? More of them default, further harming that group.
We do need to address institutional racism, I just don't think asking banks to behave irrationally will be effective in addressing it (or good for either party).
If the techniques in use put emphasis on correlation over causation then the results will be unfair.
But once you put causation as the primary driver, is machine learning still powerful? My understanding of a causative model is that you start with a model that includes information sourced from real-life observation, then fit parameters from the data. That doesn't sound like machine learning to me.
Isn't the point of the various machine learning methods to find obscure correlations? They will be really effective at picking up proxies for race and gender if race and gender are correlated with a statistic. The 'machine learning' part is irrelevant; the question here is an age-old one about using data to make decisions.
I think you are overstating the purpose of machine learning quite a bit.
What machine learning does is function fitting, no more, no less. Whether this is causal, correlative, obvious, or obscure is irrelevant to the algorithm.
All it does is try to find the parameters within a model function that provide the best predictive power.
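A minimal toy example of that "function fitting" view: choose a model function (here a line) and find the parameters that best predict the observations.

```python
# Least-squares fit of a line to noisy data.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 3.0 * x + 1.0 + rng.normal(scale=0.1, size=50)  # noisy observations

slope, intercept = np.polyfit(x, y, deg=1)  # fit the model's parameters
print(slope, intercept)  # close to the true values 3.0 and 1.0
```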
That's not entirely true. Causal inference is an enormous field and is just beginning to become more of a part of mainstream ML. ML is largely "minimize predictive error," but it's not limited to that.
There is an argument to be made though that we should be looking for causal connections rather than correlative when we think about what fairness looks like. Unfortunately, even with the many recent advances in causal network detection, we're not quite at a point where I would trust causal modeling for this.
Causal factors don't seem to be a way to avoid discrimination.
If we're looking at causal factors for classic examples of potentially discriminatory classifiers, e.g. loan default risk and reoffending risk, then no matter how you slice it the important causal factors aren't only objective measurements and things under your control, but also influences, upbringing and cultural values. Those aren't the whole cause, and likely not the majority of the cause, but they're certainly a non-zero causal factor.
Having "bad" friends is not only a correlation, but a causal factor that affects these things - we're social animals, and our norms are affected by those around us. Would we consider fair to discriminate people in these ratings because of the friends they have? Do we want to ostracize e.g. ex-convicts by penalizing people who associate with them (so motivating them to choose not to associate), even if there's a true causal connection of that association increasing some risk?
Abuse of alcohol and certain drugs during pregnancy is not only a correlation but a causal factor for these things (the mechanism, IIRC, was a decrease in risk avoidance and intelligence). Would we consider it fair to discriminate against people in these ratings because of what their mothers did?
Etc, etc; I have a bunch more in mind. And on top of that, many of these things will (in the USA) be highly correlated with race for various historical and socioeconomic reasons, so taking them into account would still harm some races more than others. It just might be in everyone's interest to avoid that huge can of worms.
The only way to look for causal connections is to brute-force the probability space of correlations with the best heuristics we can find. You present a false dichotomy; the parent instead used a syllogism.
It's pretty insane that we live in a time when we are on the brink of a technology that could surpass electricity, and we're still selectively acknowledging reality only when it conforms to our own biases.
Step 1? Maybe if you don't want your model predicting crime based on race, sex, or other sensitive features, you shouldn't train your model on race, sex, or other sensitive features.
This approach doesn't work. If sensitive features are actually predictive (and they very often are), the model will just learn to predict these features from the non-sensitive ones. For example, if you know that someone is 30 years old, lives in the Bay Area, and earns less than $30k/year, and you bet that this person is neither white nor Asian, you are more likely to win that bet.
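A small synthetic sketch of that effect: even with the sensitive column dropped, a model can recover it from correlated non-sensitive features. All data here is fabricated for illustration.

```python
# Recovering a hidden attribute from correlated proxies.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
sensitive = rng.integers(0, 2, size=2000)                 # hidden attribute
zip_proxy = sensitive + rng.normal(scale=0.5, size=2000)  # correlated proxy
income = 10 * sensitive + rng.normal(scale=5, size=2000)  # another proxy

X = np.column_stack([zip_proxy, income])
clf = LogisticRegression().fit(X, sensitive)
print(clf.score(X, sensitive))  # well above 0.5: the attribute is recoverable
```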
The key issue is that it's not sufficient. A model that explicitly uses race gives quite similar results to a model that "just" uses e.g. zip-codes and all kinds of other information; many of these things are highly correlated.
I'm sure that facebook could give (if they wanted) a very, very good guess about the shade of your skin simply based on your history of likes, pages seen, etc.
Tangent: on the other side of the spectrum is r/deepfakes. Unbelievable how video clips can be altered with ML, destroying the last bit of credibility for any video.
I write a bit more about how to define algorithmic bias here: http://aaronlspringer.com/algobias-overview/