
Let me try to clear up your confusion: an input may seem innocent (e.g. zip code), but in some regions a zip code correlates strongly with ethnicity and race. So even if the inputs seem legal, a sufficiently sophisticated ML model can derive illegal results that discriminate against certain populations.
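
To make that concrete, here is a toy sketch (synthetic numbers and scikit-learn, purely my own illustration, not any real lender's data or model) of how a model that never sees race can still split approvals along racial lines when one of its inputs correlates with race:

    # Toy sketch: the model never sees "group", only "zip_code", but zip_code
    # correlates with group, and the historical outcome is skewed by group.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 10_000
    group = rng.integers(0, 2, n)                              # protected attribute, held out
    zip_code = (group + rng.random(n) > 0.8).astype(int)       # zip correlates with group
    repaid = (0.3 * group + rng.random(n) > 0.7).astype(int)   # outcome also skewed by group

    model = LogisticRegression().fit(zip_code.reshape(-1, 1), repaid)
    approve = model.predict(zip_code.reshape(-1, 1))

    for g in (0, 1):
        print(f"group {g}: approval rate {approve[group == g].mean():.2f}")
    # prints roughly 0.20 for group 0 and 1.00 for group 1, from a "race-blind" model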


This gets at the heart of one of the issues with these discrimination laws.

What if, all else being equal, people from zipcode A are statistically much more likely to default than those from zipcode B? Do financial firms have to pretend like they don't know that fact?

How removed from race does information have to be in order to be considered by a financial firm?


> How removed from race does information have to be in order to be considered by a financial firm?

In general, the standard is "disparate impact"--if you accepted 80% of all white applicants but only 20% of black applicants, then you're probably liable for racial discrimination even if you were completely race-blind.
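
For concreteness, the usual first-pass screen in the employment context is the EEOC's "four-fifths rule": flag adverse impact when a group's selection rate falls below 80% of the highest group's rate. (Illustrative numbers below; lending regulators apply their own statistical tests, this is just the rule of thumb.)

    # Four-fifths rule applied to the acceptance rates from the example above.
    rates = {"white": 0.80, "black": 0.20}

    highest = max(rates.values())
    for group, rate in rates.items():
        ratio = rate / highest
        print(f"{group}: selection rate {rate:.0%}, ratio {ratio:.2f}, "
              f"adverse impact flag: {ratio < 0.8}")
    # white: ratio 1.00, no flag; black: ratio 0.25, flagged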


> In general, the standard is "disparate impact"

That standard is unworkable. What happens when an idealistic charity goes into a homeless shelter in a black neighborhood with a high rate of drug use and helps them all fill out mortgage applications, so that 80% of the lender's white applicants are married with stable middle class jobs and 80% of the black applicants are single, homeless, unemployed and suffering from drug addiction? The institution evaluating the applications may not even be aware of why their applicant population skews that way.


There is a legal defense, where you can argue that the disparate impact is caused by a practical business need rather than by implicit bias. An example (using gender instead of race): women have less upper body strength than men, on average, so they are far less likely to meet a requirement to carry 100 pounds of equipment on their backs.

Even then, however, you still have to demonstrate in your defense that the requirements are actually justified and not a backdoor proxy. In the example I gave earlier, you have to demonstrate that employees actually need to carry 100 pounds of equipment on their backs, and furthermore that there is no workable alternative to that requirement. If I were a legal compliance officer, machine learning for applicant screening would scare me, because I would have a hard time arguing in court that the results constituted a legitimate business need and not racism-by-proxy, especially if the plaintiff found that some of the metrics highly correlated to race.


Which is why that standard is unworkable.

If you're giving loans and justifying a factor just means showing that it correlates with repayment rate, then a justification is easy to produce. But it will also basically always come with a "disparate impact", because repayment rate itself correlates with race, so any reasonably accurate predictor of repayment will too.

A "disparate impact" standard is useless because it's routinely met even when discrimination is not actually occurring. In fact, finding no disparate impact would be highly indicative that someone was impermissibly taking race into consideration in order to fudge the numbers.

So either you have a disparate impact because that's what naturally happens with accurate predictions, or you do the expressly prohibited thing and take race directly into consideration in order to make it go away.

Imagine a human intervening to remove a factor from consideration because considering it benefits black applicants relative to other applicants, even though considering it improves accuracy overall. Would they not rightfully get their butts sued off? But what happens if it's the same thing, only this time it benefits hispanic or korean or greek applicants relative to black applicants? The only remaining option is to consider all the information you have available, which is what people are inclined to do to begin with.


I'm not sure what you're trying to argue here. Are you saying that disparate impact liability is fraught? Everyone agrees that it is. Cases based on it hinge on whether the impact results from "legitimate business need" with no reasonable alternatives, a standard that expressly allows e.g. credit scores with strong racial or gender correlations.

So if that's what you're concerned about, the response is simple: it's not enough to simply show a correlation; in enforcing disparate impact claims (under ECOA or FHA or Title VII), regulators have to show not just the correlation, but also the illegitimacy of the (facially neutral) action, or at least that some other (facially neutral) business practice would accomplish the same goals without producing the impact.

Since this is HN, I can't rule out that you're instead arguing that disparate impact isn't in fact the standard in US law, in which case: no, a simple Google search for "disparate impact" and any of the laws I cited in that last paragraph will quickly disabuse you of that.


> it's not enough to simply show a correlation; in enforcing disparate impact claims (under ECOA or FHA or Title VII), regulators have to show not just the correlation, but also the illegitimacy of the (facially neutral) action, or at least that some other (facially neutral) business practice would accomplish the same goals without producing the impact.

But that's the problem.

Suppose you're evaluating whether to consider zip code in loan approvals, and that doing so improves the overall prediction rate, helps hispanic applicants, but hurts black applicants.

If you choose to stop considering zip code, you've got a disparate impact against hispanics with no business justification for doing it that way. Meanwhile, they can point to an alternative (i.e. taking zip code into account) that serves your business goals better and reduces the disparate impact against hispanics, so not considering it may get you into trouble.

But we also have people suggesting that you shouldn't consider it because it increases the disparate impact against black applicants, and you might "have a hard time" demonstrating that business justification in court. Which is perhaps not ridiculous, because the business justification is there, but it's also in complicated algorithms that are not easy to explain to a layman.

So you have two basically reasonable alternative courses of action, either of which could arguably result in liability. This is the hallmark of unworkable legislation.


> Suppose you're evaluating whether to consider zip code in loan approvals, and that doing so improves the overall prediction rate, helps hispanic applicants, but hurts black applicants.

That's not disparate impact. The legal code does not follow exact predicate logic, where if you meet conditions A, B, and C, you violate the law. It tends to follow rules of fuzzy logic instead--that's why you'll see legal opinions that use words like "tends to", "probably", and "factors" a lot. Particularly where a strict interpretation would lead to an apparent contradiction, the court system instead tries to find a reasonable course of action. Indeed, often merely showing that you are making a good-faith effort to comply with all applicable laws and regulations is sufficient to absolve you of penalties for failure to comply.

Ultimately, the arbiter of reasonableness isn't a blackbox oracle. It's a panel of 12 members of the general public, or perhaps a panel of 3-9 judges.


Isn't that a bit like asking how many pedestrians you can hit and still keep your license? The right answer is to try for zero.

The dilemma you're describing isn't due to the law itself, but rather because the difficulty of writing law results in only the absolute worst abuses being criminalized. The abstract safe answer is to not engage in group-based discrimination at all, regardless of it seeming quite lucrative to hire a compliance department to analyze just how far you can get away with stretching it.


> Isn't that a bit like asking how many pedestrians you can hit and still keep your license?

This is nothing like that. Hitting or not hitting a pedestrian is binary, and there is a clear way to determine fault. We often call car collisions "accidents", but they aren't. Someone did something wrong to cause a collision.

The issue here is that you can be discriminatory "by accident". You can be 100% race blind and have the best intentions in the world and work very hard to be equitable but still have the slightest bias due to the nature of the data.

In fact, when poverty correlates with race so strongly, I would argue it's impossible for a bank not to accidentally discriminate. A bank isn't going to lend to someone who is unlikely to pay the money back - there's nothing racist about that. But they will have had a disparate impact.

It's a very sticky situation.


> there is a clear way to determine fault

Yes, but criminal intent is less clear. Someone can be declared at fault for hitting a pedestrian, yet not have been criminally negligent. And even if this happens a few times from really bad luck, it's reasonable that they'd get to keep their license [0].

> You can be 100% race blind and have the best intentions in the world ... bias due to the nature of the data

The question is what data? Are you feeding things in that have a clear causal relationship with your desired result? Or are you inputting everything you can, hoping to discover correlations? The latter is essentially trying to suss out informal groups, and engaging in any group-based discrimination means you cannot claim to have "the best intentions" - regardless of whether the group can be named as a legally protected one or not. If, say, you're trying to base mortgage underwriting on credit card purchase data, it's wholly disingenuous to claim that the resulting fallout is "accidental".

[0] In a society where cars are de facto mandatory to get around. I will disclaim that this casual attitude is a large part of what keeps roads so hostile for everyone else, but it currently is what it is. Also historically we haven't had such a likely hidden factor as phone use.


> So even if the inputs seem legal, an ML model that’s sophisticated enough can derive illegal results that discriminate against certain populations.

That doesn't make any sense.

The old school discrimination (e.g. redlining) worked like this. They would find some factor that correlates strongly with black people (e.g. black neighborhood zip codes), then assign a weight for that factor based on how well it correlated with what they wanted to discriminate against (black people) rather than how well it correlated with what they were supposed to be measuring (creditworthiness).

You can certainly do that on purpose with ML, but the way you do it is to give the algorithm the data on which factors are associated with black people and then ask it to assign blackness scores rather than credit scores and use the blackness scores for making credit approvals. I am not aware of anybody stupid/racist enough to actually be doing that in 2019.

What you're supposed to do is to weight factors based on how well they correlate with the outcome you're trying to predict (e.g. loan repayment) and use those weights rather than the ones chosen purposely to discriminate on the basis of race.

That doesn't mean none of those factors will ever correlate with race. Everything correlates with everything to one degree or another. True independent variables are the exception rather than the rule. But weighting each factor based on how well it correlates with the outcome you're trying to predict rather than how well it correlates with race is maximally non-discriminatory -- doing something else would be purposely giving advantage to one race over another disproportionate to the best available information. And nobody who is not an overt racist has reason to fudge the numbers that way, because it would also make worse predictions and cause you to lose money.
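
As a sketch of what that looks like in practice (toy synthetic data, scikit-learn; not a real underwriting model), you fit the weights against repayment only and keep race aside purely to audit the result afterwards:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n = 5_000
    race = rng.integers(0, 2, n)                       # never used as a feature
    income = rng.normal(50, 15, n) - 5 * race          # an input that happens to correlate with race
    debt_ratio = rng.random(n)
    repaid = (0.04 * income - 2 * debt_ratio + rng.normal(0, 1, n) > 0).astype(int)

    X = np.column_stack([income, debt_ratio])          # race is not in here
    scores = LogisticRegression().fit(X, repaid).predict_proba(X)[:, 1]

    # The weights were chosen only for predictive accuracy, yet the scores still
    # differ by race, because one of the inputs correlates with it.
    print("mean score by race:", [round(float(scores[race == g].mean()), 3) for g in (0, 1)])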


Are there any real-world examples of an ML system causing protected-class discrimination based on non-protected criteria?


A recruiting tool used by Amazon developed a bias against women despite not being told candidates' genders. It penalized candidates who were graduates of all-women's colleges and also those who had the word "women" in their resume (e.g. “women’s chess club captain.”) It had been trained on resumes submitted to Amazon during the previous ten years, so the tool's bias was likely reflective of real human bias in Amazon's recruiting process.

1. https://www.reuters.com/article/us-amazon-com-jobs-automatio...


From my reading of that article, I think the recruiting tool was fed resumés along with a data point saying whether or not the corresponding candidate was hired. As a result, the tool not only developed a bias against women, but was effectively evidence that there was bias against women in the original hire / not-hire decisions.


I just missed the deadline to edit my post, so I am replying to myself.

Looking at the parent comment again, I seem to have just restated it without adding anything new of my own. I meant to add my reasoning for why Amazon pulled development of this tool: not just because of the tool's bias, but also because the existence of the tool and its associated training data could open up Amazon to litigation claiming that their hiring decisions were biased against women in the ten-year span referred to by the article.


It's interesting that nobody even bothered to check whether the bias was illicit. They found something that sounds bad and the immediate response is "OMG bad PR, pull emergency shutdown."

They just assume that "women's" is coding for female candidates and not something more specific, like gender-segregated activities that may legitimately produce lower-quality candidates than equivalent integrated activities that expose the student/candidate to a more diverse cohort. Probably also doesn't help that some of the biggest gender-segregated institutions are penal in nature, i.e. "reform school for troubled girls" or "women's correctional facility."

Did anybody even check whether it also penalizes words like "boys" and "gentlemen's"?


To be fair, it would probably take more time and money irrespective of litigation to be sure that there was illicit bias than to just delete everything and call the exercise a failure.


I think that’s a great example of the need to sanitize data, but being a member of a women’s club or attending a women’s college is itself telling you the candidate’s gender, so I don’t know that this is deriving protected information from non-protected information.


This is a game of whack-a-mole. Given enough data, the ML system will just find other characteristics or groups of characteristics that act as proxies. It might unfairly penalize candidates in ways that are impossible to detect by human evaluators. The promise of these systems is that if you give them a pile of raw data, they will detect subtle patterns that aid in assessing individuals (be they job applicants, ex-cons, whatever.) However, some of the patterns that they detect are our biases against classes of people. If the solution is to sanitize the data to such a degree that the ML system can no longer infer that someone is a woman or that someone is over 40, etc., then the training data is probably also useless for detecting the non-obvious patterns that we want the system to discover.
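
One rough way to see the whack-a-mole problem (my own toy sketch, not a description of any deployed system): after stripping the protected attribute, check whether the remaining "sanitized" features can still predict it. If accuracy is well above chance, a downstream model can reconstruct it as a proxy.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(2)
    n = 5_000
    gender = rng.integers(0, 2, n)                        # removed from the features
    womens_college = gender * (rng.random(n) > 0.5)       # correlates with gender
    womens_club = gender * (rng.random(n) > 0.6)          # another correlated field
    X = np.column_stack([womens_college, womens_club]).astype(float)

    leak = cross_val_score(LogisticRegression(), X, gender, cv=5).mean()
    print(f"protected attribute recoverable with ~{leak:.0%} accuracy (chance is 50%)")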


That it is a “game of whack-a-mole” isn’t clear to me, and you are asserting so without any backing, which is why I asked if anyone has real-world examples.

Citing an example where they don’t provide gender but then provide “went to an all-women college” is not an example of providing non-protected information which results in protected-class discrimination; it is an example of telling an ML system protected information in a roundabout way.


Age is a protected class, and it's extremely difficult to sanitise age from a resume.

Let's say the candidate got a BS in EECS in 1987, they've had 6 jobs since graduating, and their first job was cost-optimising floppy disk drives.

Do you feed the ML system this data, which is clearly correlated with age? Do you keep the number of jobs but delete the duration, responding to 6-jobs-in-3-years the same as 6-jobs-in-30-years? How do you sanitise the age-correlated data out of job descriptions naming old tech or defunct employers? Do you delete all but the last 5 years of their employment history, assigning no value to experience beyond that?
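
Even a naive sanitiser (a toy sketch of my own, not anything any employer actually runs) shows the problem: you can strip the years, but the age signal survives in everything around them.

    # Blank out four-digit years. The count of positions, the defunct tech, and
    # the degree itself still date the candidate.
    import re

    resume = ("BS in EECS, 1987. Six positions since graduating; first role "
              "(1987-1992): cost-optimising floppy disk drives.")
    print(re.sub(r"\b(19|20)\d{2}\b", "[year]", resume))
    # -> BS in EECS, [year]. Six positions since graduating; first role
    #    ([year]-[year]): cost-optimising floppy disk drives.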


Can you give another example?


Yes, for one example, a system used to recommend sentencing based on how likely a criminal was to re-offend was found to have been heavily biased based on race, despite not being given explicit information about race[1].

Edit: It’s unclear if this system is based on ML, but I think the point stands - if a system can be manually tuned to do this, the risk exists of this happening to a trained model as well, given other cases of ML models learning an incorrect behavior (such as the system intended to detect skin cancer that ended up being a fancy ruler detector [2]).

[1]: https://www.propublica.org/article/machine-bias-risk-assessm...

[2]: https://www.thedailybeast.com/why-doctors-arent-afraid-of-be...


That is a fine answer to mieseratte's question but I feel compelled to mention that the propublica article you link to is...pretty bad.

Here's a pretty good summary of why:

https://www.chrisstucchio.com/blog/2016/propublica_is_lying....


Using zip codes to implement discriminatory policies isn’t ML, but has almost certainly occurred. It’s referred to as redlining, and has been around for a while. The New Deal had some pretty indefensible redlining conditions. A lot of people will say the practice continues to this day, but the modern examples are a lot more open to interpretation than some of the historical ones are.


Was the New Deal redlining out-and-out discrimination, or a matter of accidentally creating conditions (i.e. based on bad intelligence) that resulted in discrimination? I always understood it to be the former.


The level of plausible deniability regarding The New Deal is mostly a matter of opinion, but personally I think it was pretty transparent in how it intentionally discriminated.

It’s the modern examples that I think are much more questionable. For example, there are more liquor stores in black neighbourhoods. Is this because:

a) A conspiracy to use alcohol to suppress the black population?

b) Those neighbourhoods have a greater demand for liquor stores?


I mentioned correlations... there’s no reason zip code or lat/long should be an input.


I believe they're referring to Fully Homomorphic Encryption [1]

[1] https://en.wikipedia.org/wiki/Homomorphic_encryption#Fully_h...

