There was another article recently which argued that because there is a strong correlation between race and default rates, if you apply machine learning to a dataset the algorithm will find a way to extract what is basically a proxy for race from the data.
So basically any sort of ML applied to credit data will run afoul of the Equal Credit Opportunity Act.
The article also made the point that basically all ML credit scoring startups are illegal because of this, but they get away with it just because they are small and not on the radar.
I make credit scoring models for a living and I can tell you that in most countries you won't find "proxies" or any other strong variables just by applying "machine learning".
Usually when you try neural networks in this segment you end up with exactly the same variables and outcome as you would with a normal logistic regression, with 10x the complications and a much less stable model.
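For what it's worth, here is a minimal sketch of the kind of comparison I mean, using scikit-learn on synthetic data (the dataset and all settings below are made up for illustration, not from any real portfolio):

    # Sketch: compare a plain logistic regression against a small neural net
    # on the same tabular credit features. Data is synthetic.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score

    # Synthetic stand-in for income, debt-to-income, age of credit file, etc.
    X, y = make_classification(n_samples=20000, n_features=12,
                               n_informative=5, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              random_state=0)

    logit = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    mlp = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500,
                        random_state=0).fit(X_tr, y_tr)

    print("logit AUC:", roc_auc_score(y_te, logit.predict_proba(X_te)[:, 1]))
    print("mlp   AUC:", roc_auc_score(y_te, mlp.predict_proba(X_te)[:, 1]))
    # In my experience the two AUCs end up very close, and the logit is far
    # easier to explain, monitor and keep stable over time.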
There simply are not enough input parameters that are significant to the outcome.
But it might be different in the US, where the field is less regulated and you are allowed to collect all kinds of information on a person, so you could proxy something like ethnicity, although I have not found ethnicity significant in any of my datasets. We do get some variables even if we are not allowed to use them. Again, this might be different in the US.
> You can load in tons and tons of demographic data, and it’s disturbing when you see percent black in a zip code and percent Hispanic in a zip code be more important than borrower debt-to-income ratio when you run a credit model.
If you account for most other significant things like income, education, social status, job, etc., you will find that ethnicity is not significant.
The fact that you can sometimes use ethnicity as a proxy for social status and other things just shows the discrimination that happens in some places. But when you set all other factors equal, someone from Asia, Africa, or Northern Europe will have the same default rate. At least in the (European) countries I've run models in.
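If you do get the ethnicity variable for testing purposes, the check is roughly the following. This is only a sketch with statsmodels; the file and column names (applications.csv, income, education_years, ethnicity, defaulted, ...) are hypothetical:

    # Sketch: does an ethnicity dummy stay significant once income,
    # education, etc. are controlled for? File and column names are made up.
    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("applications.csv")  # hypothetical file
    controls = ["income", "education_years", "job_tenure", "debt_to_income"]
    X = pd.get_dummies(df[controls + ["ethnicity"]], drop_first=True)
    X = sm.add_constant(X.astype(float))

    model = sm.Logit(df["defaulted"], X).fit(disp=0)
    print(model.summary())
    # Look at the p-values on the ethnicity_* dummies: in the portfolios
    # I've worked on they stop being significant once the controls are in.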
People with the exact same FICO score have different default rates if you manage to bin them by race: Asians > Caucasians > Latinos > Blacks.
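If you have the race label for research purposes, checking this is just a group-by over FICO bands. A sketch (the file and column names are hypothetical):

    # Sketch: default rates by race within the same FICO band.
    # File and column names are hypothetical.
    import pandas as pd

    df = pd.read_csv("loans_with_race.csv")
    df["fico_bin"] = pd.cut(df["fico"], bins=range(300, 901, 20))
    print(df.groupby(["fico_bin", "race"])["defaulted"].mean().unstack())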
> We do get some variables even if we are not allowed to use them.
Cool, so you had access to an ethnicity variable to measure its proxy power and significance? I feel this is important and very rare outside of Europe.
> any sort of ML applied to credit data will run afoul of the Equal Credit Opportunity Act.
That can't be right, because the existing and very widely used FICO score is already a rudimentary form of "machine learning" applied to credit data.[1] (FICO's secret formula correlates payment histories, credit-card balances, income, etc. to calculate probabilities of loan default.) Clearly, automated machine analysis of _credit data_ is legal even though minorities have lower FICO credit scores than whites and subsequently get fewer approvals for loans.
The paper is talking about something else: the application of ML to non-credit data. Examples of datapoints such as:
MacOS vs Windows
iOS vs Android
GMail vs Hotmail
lowercase vs Proper Case when writing
etc.
Those non-credit datapoints are collectively referred to by the paper as the "digital footprint". E.g. the authors conclude that analyzing data revealed by web browser user-agent strings to calculate a "credit risk" can predict default about as well as the traditional FICO score.
The issue you're talking about for Equal Credit Opportunity is whether those non-financial variables are surreptitiously used to determine "whiteness" or "blackness" (proxies for race) -- or -- whether the data was innocently analyzed for debt-default patterns but nevertheless inadvertently correlates with "white"/"black" and therefore punishes minority groups.
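One way to make the proxy question concrete, assuming you have the protected attribute available for testing only, is to see how well the non-credit features alone can reconstruct it. A rough sketch (the file, feature names, and the one-vs-rest choice are my own assumptions, not from the paper):

    # Sketch: how much "race signal" do the non-credit features carry?
    # If a classifier can recover race from them well above chance, they
    # are acting as a proxy. All data/column names are hypothetical.
    import pandas as pd
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import cross_val_score

    df = pd.read_csv("applications_with_race.csv")  # hypothetical file
    footprint = ["device_type", "os", "email_provider", "lower_case_dummy"]
    X = pd.get_dummies(df[footprint])
    y = (df["race"] == "black").astype(int)  # one-vs-rest, for illustration

    auc = cross_val_score(GradientBoostingClassifier(), X, y,
                          cv=5, scoring="roc_auc").mean()
    print("race recoverable from digital footprint, AUC:", auc)
    # AUC near 0.5 -> little proxy power; well above 0.5 -> the footprint
    # is effectively a race proxy and the ECOA risk is real.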
Downvoters: please point out what is inaccurate about my comment. If I made a mistake in reading the paper, I'd like to learn what I misinterpreted.
It's also possible that if you have good data on a person's financial situation and behaviour (and, more questionably, that of their social circle), race stops being a relevant signal in the algorithm, as it was only ever a proxy for a person's finances and job security.
If there were substantial remaining bias, you could probably measure the impact of race while training the model, but remove it (or consider everyone the same race) when scoring customers.
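As a sketch of what I mean (the files and column names are hypothetical, and this is only one simple way to do it): fit the model with race included so the other coefficients don't have to absorb it, then hold race constant when scoring.

    # Sketch: include race while fitting so other coefficients aren't forced
    # to absorb it, then neutralise it when scoring. All names are made up.
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    df = pd.read_csv("training_with_race.csv")     # hypothetical file
    new_applicants = pd.read_csv("to_score.csv")   # hypothetical file
    features = ["income", "debt_to_income", "job_tenure"]

    X_train = pd.get_dummies(df[features + ["race"]], drop_first=True)
    model = LogisticRegression(max_iter=1000).fit(X_train, df["defaulted"])

    # At scoring time, set every race dummy to the same value (0), i.e.
    # treat every applicant as the reference group.
    X_score = (pd.get_dummies(new_applicants[features + ["race"]],
                              drop_first=True)
                 .reindex(columns=X_train.columns, fill_value=0))
    race_cols = [c for c in X_score.columns if c.startswith("race_")]
    X_score[race_cols] = 0
    print(model.predict_proba(X_score)[:, 1])  # default probabilities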
(Disclaimer: I do ML for a consumer credit company, not in the USA).
Is it race or culture? Do you know if recent immigrants show the same "race" signal or not?
I wonder how much race signal remains after accounting for g. I know in a lot of other areas once you correct for g there is almost no race signal left.
During the days of officially/semi-officially-sanctioned racial discrimination in the US, it didn't really matter whether you were a dark-skinned person who was born in the US, or a dark-skinned person born in the Caribbean who immigrated here, or a dark-skinned person born in Africa who immigrated here. The only thing that system cared about was your perceived race, defined primarily by your skin color, and so every dark-skinned person got subjected to discrimination.
With recent immigrants today you can see some differences, but it's not due to immigrants having different "culture"; it's due to the historical baggage of the long, long period of discrimination suffered by folks who were already here. The immigrant probably has the benefit, for example, of an extended family that worked hard to save up and send someone to the US and provide advice and support and broker connections; the US-born dark-skinned person has had their extended family deliberately broken up, and subjected to policies that prevent intergenerational accumulation of wealth or other resources. And that's not the only sort of head start the immigrant gets, which means it should be totally unsurprising if we now see better outcomes on average for recent immigrants.
And now all the assumptions and stereotypes based on perceived race are being used as training data for "objective" ML/AI systems whose creators promise they're free of prejudice...
Are we not trying to determine whether these systems are picking up a signal due to race or one due to culture? The only way to do this is to look at people who share the same race but have a different culture.
The historical baggage you describe is culture. We are all shaped by our history and our families' histories.
The book Weapons of Math Destruction talks all about this. I've come to believe that pure risk shouldn't be the only factor in a person's interest rate.
That obviously makes a ton of sense from a business standpoint. You want to contain losses for risky borrowers but compete with other lenders for low risk borrowers.
But socially, this is perverse. People tend to be risky because they are already poor. So now money costs more for those who have the least of it. This is one of the feedback loops that makes poverty (and affluence, for that matter) so sticky.
I had this realization in my personal experience when I was able to refinance almost $100k in student loans at a crazy low interest rate. My household's finances are in great shape as my wife and I enter our prime earning years. But for us, such an opportunity is a gift, on top of an already sweet situation. The savings could be a game changer for a family whose finances are more marginal.
That's the irony of credit: plenty of it when you don't need it. It seems that the more money you have, the better the investments people approach you with, too.
That's why pretty much the lowest common denominator for fixing the poverty problem manifests as education. Societies that make education one of their main investments will continue to reduce crime rates, increase social safety nets and second chances, and I hypothesize these countries will have the most stable positive growth rates in their economies.
How do you operationalize that? If you are in charge of investments for a teachers’ pension fund, do you invest in less profitable banks because they have looser underwriting standards?
You can overdo fairness and cause trouble for the poor. Very concretely: giving someone a loan despite their credit score being marginal will severely mess up their credit score forever if they can't repay you.
If you are poor, then don't create more debt! It should be hard to rack up such a debt, not easy and accessible. No amount of credit is going to increase your social status, because you have to pay it back with your own current/near-future money.
If we want social justice for the poor through access to more money, then capitalism is not a good way to go. The state should become a credit provider.
How would someone’s credit score be messed up forever? Debts are cleared completely after 7 years and have less weight after about halfway through that time. Bankruptcy is along the same lines.
You still owe any debts after 7 years (up to 15 years in some states). Zombie debt is not "completely cleared". But OK, read "mess up your credit score for 7 years" instead. The point remains: marginal high-risk credit underwriting is not only dangerous to the institution, but also to the receivers of the loans (and the economy in general). It is socially perverse to hook lower-income people on consumer credit. To have middle-income people lose their house.
The problem with utilizing datapoints like digital footprints is that it will run afoul of the Equal Credit Opportunity Act. ECOA was designed to stop banks from redlining neighbourhoods, which usually punished minorities. With digital footprints, they'll in theory be redlining people's digital footprints, including sites visited, products purchased, etc.
The device type (for example, tablet or mobile)
The operating system (for example, iOS or Android)
The channel through which a customer comes to the website (for example, search engine or price comparison site)
A do-not-track dummy equal to one if a customer uses settings that do not allow tracking of device, operating system and channel information
The time of day of the purchase (for example, morning, afternoon, evening, or night)
The email service provider (for example, gmail or yahoo)
Two pieces of information about the email address chosen by the user (includes first and/or last name, and includes a number)
A lowercase dummy if a user consistently uses lower case when writing
A dummy for a typing error when entering the email address
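For what it's worth, a hedged sketch of how variables like these typically get turned into model inputs (the values and column names below are invented for illustration, not taken from the paper's data):

    # Sketch: turning digital-footprint variables into model inputs.
    # Values and column names are made up for illustration.
    import pandas as pd

    raw = pd.DataFrame({
        "device_type":    ["mobile", "tablet", "desktop"],
        "os":             ["ios", "android", "windows"],
        "email_provider": ["gmail", "hotmail", "yahoo"],
        "hour_of_day":    [21, 4, 11],
        "all_lower_case": [0, 1, 0],
        "email_typo":     [0, 1, 0],
    })
    # Bucket the timestamp into the paper-style time-of-day categories.
    raw["time_of_day"] = pd.cut(raw["hour_of_day"],
                                bins=[0, 6, 12, 18, 24],
                                labels=["night", "morning",
                                        "afternoon", "evening"],
                                right=False)
    X = pd.get_dummies(raw.drop(columns="hour_of_day"))
    print(X.head())  # ready to feed into whatever scoring model you use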
Let's play a game. You are in charge of a large pile of cash and want to make it grow by giving loans. Each day, two people apply, and you can give out one loan (you will have to rank the applicants). When people defraud you, you lose all of the loan. When people don't or can't pay you back, you lose all of the loan. When people pay back the loan, you make a little money.
Day1: User Agent: iPhone latest vs. Windows XP
Day2: Referral: Facebook friend vs. search "cheapest loans"
Day3: Time of interaction: 21:30 vs. 04:30
Day4: Email: ari.johnson@cs.mit.edu vs. hpqwoovz11721@hotmail.com
Day5: Funnel: Someone who spent 10 seconds vs. someone who spent 10 minutes, made a mistake in the name, entered an email address, then deleted it, and entered another email address at a different provider.
Now if your gut feeling does not point you to the first applicant every day, you look at the data for guidance. You find that the number of fraudsters and non-payers is statistically significantly higher for people with the second set of characteristics.
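Concretely, the "look at the data" step is just a bad-rate comparison plus a significance test, something like this (the counts are invented purely for illustration):

    # Sketch: is the bad rate for group B really higher than for group A?
    # Counts below are invented.
    from scipy.stats import chi2_contingency

    #           good  bad (fraud or non-payment)
    group_a = [  940,   60]   # e.g. iPhone / referred-by-a-friend applicants
    group_b = [  860,  140]   # e.g. Windows XP / "cheapest loans" applicants

    chi2, p_value, dof, expected = chi2_contingency([group_a, group_b])
    print("bad rate A:", 60 / 1000, "bad rate B:", 140 / 1000, "p =", p_value)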
The alternative is to use third-party data providers. That's another can of worms. Or flip a coin and start gambling proper.
That game is common in credit scoring classes for fresh analysts. The class is split into small teams. Each team is given ten anonymized, but real, credit applications and the corresponding credit bureau pulls: five from customers who subsequently defaulted, and the other five from customers who paid the loan back. Each team tries to guess which are which. The team that makes the most correct guesses wins.
Then the results are compared with FICO score, and usually FICO is clearly better. Even with people with banking experience on the teams it's very rare to see humans beat the model, partly because humans tend to base their decisions on irrelevant details and their own biases.
Let's change the game. You can allocate an investment to either of two banks. When building credit scoring models, one bank has access to just FICO scores; the other bank has access to FICO scores plus behavioral and digital signature data. Which bank do you allocate your cash to?
Now change the game so FICO is unavailable: For instance, when micro-lending to third-world country entrepreneurs. Do you still feel these digital signatures are irrelevant to making better credit risk decisions?
All banks already use behavioral scores in credit card line management. Mortgages are a different story because there's little they can do after the underwriting. That said, the independent variables that go into behavior risk scores are not like the ones from the article.
In any case, that game is a pure gedankenexperiment, at least in the US. In reality banks have to comply with ECOA and a host of other rules and regulations that limit the types of data they can use in credit decisions.
There may be more freedom outside the US, but even there social media probably carries much stronger signal.
Let's say you add these digital signature variables to your credit risk scoring model anyway. The model then falls prey to confusing correlation with causation. What happens to the performance of the model?
I have no idea, as merely adding them may have no effect at all.
However, depending on them exclusively (or in substantial/majority part), which I believe is the main premise, the eventual performance will depend entirely on whether the actual causal relationship which created the correlation holds true. If it doesn't, the model would no longer be predictive.
> Let's play a game. You are in charge of a large pile of cash and want to make it grow by giving loans. Each day, two people apply, and you can give out one loan (you will have to rank the applicants). When people defraud you, you lose all of the loan. When people don't or can't pay you back, you lose all of the loan. When people pay back the loan, you make a little money.
Actually, many people, including me, played a similar game on Prosper.com (as it was many years ago, not the current version). I can assure you that the reality is unlike your false dichotomy of losing everything [1] versus making a little money.
Even non-performing loans (that were not fraudulent from the get-go or discharged in bankruptcy) aren't totally worthless [2], as there is a thriving business in junk debt, and, presumably, at least some payments were already made.
Interestingly, I still have pennies trickling in from there occasionally, presumably from people who couldn't pay before but now can and are doing so because they know the lenders were individuals and not a faceless corporation.
[1] Prosper had a fraud guarantee, but it was a bone of contention among early users that they did not sufficiently honor it and that they subsequently changed the platform to increase borrower anonymity, thereby increasing the ease/potential for fraud.
[2] How Prosper handled these was another bone of contention.
A lot of potential for injustice via false negative judgements; after all, you aren't doing anything wrong by using a particular OS at a particular time of day, etc.
"We analyze the information content of the digital footprint – information that people leave online simply by accessing or registering on a website – for predicting consumer default."
Wonderful.
Gameable and dystopian all at the same time.
The only thing stopping it in the US is the Equal Credit Opportunity Act. Most of these “novel” credit scoring solutions are just attempts to work around the race and other prohibitions in credit scoring. The good news is that these things get shut down quickly with enough complaints. This study points out that the digital tracking is likely a violation.
If that is true ("the Equal Credit Opportunity Act is the only thing stopping it in the US"), then race is the strongest factor[1] for credit scoring, so I'm not sure how China does it (the Chinese population is racially homogeneous)?
China does recognize races within its borders[1]; additionally, it may be of interest to the scorer to know who is Hui, Tibetan, or Uyghur.[2] I don't support this; I'm just noting that China is not racially homogeneous.
The baseline (FICO) has an AUC of 68.3%, which looks low. This may be because the analysis is performed not on the entire through-the-door population, but only on the customers that passed the creditworthiness check (which uses FICO).
In such situations it is customary to do some kind of reject inference or testing below the cutoff, as well as swap-in and swap-out analysis. It does not look like they did any of that.
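Here is a toy illustration of why evaluating only above the cutoff can understate the AUC; everything below is synthetic and only meant to show the range-restriction effect, not to reproduce the paper's numbers:

    # Sketch: why evaluating a score only on approved customers understates AUC.
    # Everything here is synthetic.
    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    n = 100_000
    score = rng.normal(size=n)                  # stand-in for FICO
    p_default = 1 / (1 + np.exp(2 * score))     # higher score -> fewer defaults
    defaulted = rng.random(n) < p_default

    print("AUC, full through-the-door population:",
          roc_auc_score(defaulted, -score))
    approved = score > 0                        # creditworthiness cutoff
    print("AUC, approved-only population:       ",
          roc_auc_score(defaulted[approved], -score[approved]))
    # The second AUC is noticeably lower: the easy separations below the
    # cutoff have been thrown away, which is why reject inference and
    # swap-in/swap-out analysis matter.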
I lead the Data Science team at Oakam, a London-based fintech company founded in 2006.
If you find the article interesting, you may also be interested in Oakam's work using alternative data to predict credit default, which was covered recently in The Economist:
If you're a Data Scientist looking to work in this area, or just looking for a new challenge, please contact me (personal email in my profile) so we can have a chat!
We are also hiring software engineers (stack is React Native for iOS/Android, and mostly C# for everything else).