Because there isn't a universal truth, or at least if there is, we as a species don't (and can't) know it, especially as it relates to how we all interact not only with each other but with the planet, etc. Your version of good is another's version of bad. If we can't have neutrality, we're just building another machine to amplify the values of whichever group builds it, and right or wrong just depends on where you stand.
To break it down, do we want to be neutral, or do we want SiliconValleyGPT? What happens when instead of that we get SaudiArabiaGPT? Or ChinaGPT? Or RussiaGPT? DeepSouthGPT? I just picked arbitrary places but you see my point, I hope.
These kinds of philosophy discussions are frustratingly restricted to bias against minorities.
Nobody here commented on the "AI should protect your privacy" tenet with "but how do we know privacy is good? What if my definition of privacy is different from yours? What happens when a criminal has privacy?" Nobody wanted a concrete definition of agency from first principles, nobody wanted to talk about the intersection of agency and telos.
"There's no universal truth" is basically an argument against "responsible" AI in the first place, since there would be no universal truth about what "responsibility" means. Mozilla's statement about responsible AI is inherently biased towards their opinion of what responsibility is. But again, the bias accusations only popped up on that last point. We're all fine with Mozilla having opinions about "good" and "bad" states of the world until it has opinions about treating minorities equitably, then it becomes pressingly important that we have a philosophy discussion.
On the contrary, I think saying "there is no universal truth" is a foundation for those discussions of first principles.
I wasn't arguing against "responsible ai". I was replying to someone who made their implicit assumptions clear. Even though I agree, for the most part, with the person I was responding to, I was trying to dig down to the granularity of their assertions, because it's easy to make sweeping statements about what's 'good' and 'bad', but who makes those distinctions, and in which context, is more important than just saying it's one or the other.
I didn't bring up anything to do with minorities at all. Following my logic, the question is, "which minorities and where?" It's in line with what you say about privacy: "whose privacy, and what's their definition of it?"
That's what I'm saying. The fact that you didn't see "responsible AI" as something to question to the same degree is exactly the point I was making.
That you don't realize that "neutrality" doesn't exist in a conversation about responsible AI is the point that sgift's comment was making. There is no purpose to going into a conversation about responsible AI trying to be fully neutral about what good and bad is. The word "responsible" implies that there is such a thing as a good and bad outcome, and that the good outcomes are preferable to the bad ones.
But to restate GP's point, any conversation about user agency, privacy, equity, responsibility, bias, accuracy, etc... all of it assumes an opinionated philosophy about what's good and bad -- and in fact, it is desirable to have those discussions working from the assumption that some things are good and some things are bad. It is impossible for any of that conversation to be neutral, nor should neutrality be a goal -- because a conversation about responsible AI is a conversation about how to influence and create those preferred world states, and how to bias an AI to create those world states.
There is no such thing as a "neutral" aligned AI. AI alignment is the process of biasing an AI towards particular responses.
----
And again, I just want to point out, these types of conversations never crop up when we're talking about stuff like user agency and privacy. When I say "people should have privacy", nobody asks me to define ethics from first principles.
And I mean, people can say "well, we just want to approach first principles in general, this isn't about minorities" -- but people can also read the thread on this page and they can see that all of this conversation spiraled out of someone complaining that Mozilla was calling out models that reinforced existing biases against minorities. That was the context.
There is no sibling thread here where someone complained about the dichotomy between safety and agency. And people can take from that what they want, but it's a pattern on HN.
> We're all fine with Mozilla having opinions about "good" and "bad" states of the world until it has opinions about treating minorities equitably, then it becomes pressingly important that we have a philosophy discussion.
It's because that was the only thing on the list that is openly discriminatory.
If the intent was truly to avoid unfair bias against people, the mention of marginalized communities would be unnecessary. By definition, avoiding bias should be a goal that does not require considering some people or groups differently than others.
The fact that one set of groups is called out as being the primary consideration for protection makes it clear that the overriding value here is not to avoid bias universally, but rather to consider bias against "marginalized communities" to be worse than bias against other people.
Since the launch of ChatGPT, plenty of conservatives have made bias complaints about it. The framework outlined by Mozilla gives the strong impression that they would consider such complaints to be not as important, or maybe not even a problem at all.
> By definition, avoiding bias should be a goal that does not require considering some people or groups differently than others.
This is something that sounds nice on paper, but if you've worked with polling data or AI in general, you should have figured out by now that weighting data and compensating for biases in training sets is an essential part of any training/measuring process.
LLMs are not trained on "neutral" data. We can have long, difficult conversations about how to fix that and who to focus on, but the idea that you can just throw data sources at an AI and expect the outcome to be fair is kind of silly.
Case in point, you bring up bias against Conservatives -- well if you train an LLM on Reddit, it will have a Liberal bias, period. And I suspect if a company tried to correct that bias or even just identify it, you wouldn't call that unfair. In fact, there would be no way to train an LLM on Reddit and to get rid of a Liberal bias in its answers without systematically trying to correct for that bias during the training process.
Neutrality where you just throw the data in and trust the results and deliberately refuse to weight the outputs only works when you're pulling data from neutral sources. And in the real world, that's practically never the case. It's certainly never the case with an LLM.
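For anyone who hasn't done the weighting step I'm describing, here is a minimal sketch of it. The group names, the sample, and the 50/50 target are all made up for illustration; this is not anyone's actual training pipeline, just the basic idea of reweighting a skewed sample toward a chosen target mix.

```python
from collections import Counter

def reweight(labels, target_shares):
    """Return one weight per sample so that each group's weighted
    share of the data matches its (chosen) target share."""
    counts = Counter(labels)
    n = len(labels)
    # A group that is over-represented gets weights below 1,
    # an under-represented group gets weights above 1.
    return [target_shares[g] * n / counts[g] for g in labels]

def weighted_share(labels, weights, group):
    """Fraction of the total weight that belongs to `group`."""
    total = sum(weights)
    return sum(w for g, w in zip(labels, weights) if g == group) / total

# Hypothetical skewed sample: 80% from group "A", 20% from group "B".
labels = ["A"] * 8 + ["B"] * 2
# The target mix is an assumption someone has to choose explicitly.
weights = reweight(labels, {"A": 0.5, "B": 0.5})
```

Note that `target_shares` has to come from somewhere. Picking it is exactly the "who to focus on" decision, which is why there's no way to do this step without naming the groups involved.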
> the idea that you can just throw data sources at an AI and expect the outcome to be fair is kind of silly.
You are arguing against a straw man. We are discussing a set of AI principles, not an implementation of those principles. I claim that any anti-bias principle should be applied equally to all, both in spirit and in practice. That is not the same as assuming the input is neutral, or assuming that de-biasing is unnecessary.
By your own account, the researchers who do the de-biasing have a great amount of discretion about "how to fix that and who to focus on." I don't think it's unfair to say that it is their job to put their thumbs on the scale, according to the beliefs and priorities of themselves and their organizations. So how do we know that the model authors aren't just introducing their own bias, either by over-correcting one set of biases from the input, or under-correcting others?
It seems likely that "de-biasing" can become "re-biasing." I would already suspect that this is a major risk with AI, but when an organization like Mozilla openly states in their guiding principles that certain groups are a special priority, re-biasing seems all but certain.
Of course everyone will bring their own biases to the table when performing a job, that is inevitable. But an AI provider that wants to be trustworthy to a wide group of people should be extremely vigilant about correcting for their own personal biases, and be clear in their messaging that this is a core commitment. OpenAI, to their credit, seem to be taking this seriously: https://openai.com/blog/how-should-ai-systems-behave#address...
> We are discussing a set of AI principles, not an implementation of those principles.
If the position is that any implementation of those principles is inherently problematic, then that is nonsense. Quite frankly, I do think we're talking about implementation.
De-biasing models is necessary. Anyone who claims otherwise... I just don't think that's a defensible position to take if you've ever worked with large datasets. So if you have a criticism of Mozilla's implementation of its de-biasing, then argue against that implementation. But arguing that there shouldn't be an implementation is ridiculous.
----
> So how do we know that the model authors aren't just introducing their own bias, either by over-correcting one set of biases from the input, or under-correcting others?
You heckin don't know.
But the alternative to that is not neutral models, it's still biased models -- because the data itself that you train an LLM on is inherently biased. Any large textual database you get off the Internet is going to have bias in it. Choosing to ignore that fact does not give you neutrality.
You're very worried about the risk that de-biasing can become re-biasing, and on one hand, you're right, that's a very real risk. Censorship and alignment risks are very real concerns in AI. But you don't seem to be worried about the fact that models that are not "de-biased" are subject to the exact same concerns.
Choosing not to de-bias is just saying that you'll have a biased model with all of the same concerns you raise above, except with no effort to correct that bias at all. You're scared of Mozilla pushing people to bias their models in a certain direction, but you don't seem to have internalized that the models are already biased in a certain direction. And all of the risks of model bias apply to models that have been uncritically fed raw data.
Again, disagree with specific implementations if you want, but it is extremely reasonable for AI researchers to say "Internet data biases against certain groups and we would like to correct for that bias when training models." And there's no "neutral" way to do that -- the way you correct for bias is you identify the areas where bias exists and you push back on those trends in the data.
Again, I'd bring up Reddit here. If you're trying to build an LLM on Reddit data and you want it to be "fair" to Conservatives/Christians, that means identifying that a platform like Reddit has a very clear Liberal/Atheist bias and pushing against that bias in the training data. And if someone comes up to you and says, "well, that's not neutral, you're privileging Christians, what about anti-atheist bias", then I just feel like that person doesn't really understand how correcting for data-skew works.
----
> but when an organization like Mozilla openly states in their guiding principles that certain groups are a special priority
It's OK to triage issues and attack them in a specific order. Again, if you have issue with which groups Mozilla chose, then fine -- but you seem to be arguing that when de-biasing a model, it's not OK to try and pick out the most impactful biases and triage them and focus on the most important issues first -- and that's just a really silly thing to argue. Of course any attempt to correct for data bias starts with identifying specific areas where you want to correct.
To be fair to your concerns, how do we deal with the problem of propaganda in AI? How do you correct for institutional bias in de-biasing results? How do you guard against censorship that can itself push minorities out of being able to use those models? Well, you correct for that by having diverse models trained by diverse people that aren't localized to a singular company or insulated within a single industry.
Which is... also kind of a big thing that Mozilla is arguing about here. User agency over models and training, privacy and local models, having a diverse set of companies and organizations building models rather than a single monopoly that controls access and training data, transparency and inspectability of AI that allows us to examine why it's producing certain output -- those are the actual ways to protect against de-biasing efforts turning into propaganda, and Mozilla's points above are decent ways to get closer to that goal.
Those are the important steps you should be focusing on rather than policing companies for acknowledging and trying to reduce harm against minorities.
> De-biasing models is necessary. Anyone who claims otherwise... I just don't think that's a defensible position to take if you've ever worked with large datasets.
I don't know how to be more clear: nowhere in this thread have I argued for using raw models without any de-biasing. Twice now you have ascribed this position to me. Are you reading what I have actually written?
> but it is extremely reasonable for AI researchers to say "Internet data biases against certain groups and we would like to correct for that bias when training models."
The idea that "Internet data biases against certain groups" seems itself a biased statement. I am pretty sure that Internet data is biased against all groups. I bet you can get a raw LLM to say offensive things about any group if you give it the right prompt.
> I don't know how to be more clear: nowhere in this thread have I argued for using raw models without any de-biasing.
Okay, great. But then we are arguing about an implementation, aren't we? We agree that de-biasing is necessary. It's just that you seem to think that calling out specific biases is the wrong way to approach that.
But I'm curious how you're supposed to de-bias data without drawing any attention to or acknowledging the groups that the data is biased against?
> The idea that "Internet data biases against certain groups" seems itself a biased statement.
If a data set is biased, it's biased in a direction. By definition, that's what bias is.
> I am pretty sure that Internet data is biased against all groups. I bet you can get a raw LLM to say offensive things about any group if you give it the right prompt.
That's not what bias is. Bias is not "can you get it to say something offensive" -- bias is a systematic predisposition in a direction. Bias is not stuff like, "it made a dirty joke", bias is stuff like, "this hiring model is more likely to hire you if your name sounds white."
And the goal here (believe it or not when I say this) is not to make models that never make offensive jokes. The goal is to avoid making models that reproduce and reinforce systemic issues in society. It's entirely appropriate when looking at the problem of "what social inequities are being reproduced by a model" to identify specific social inequities and inequalities that exist in society.
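To make "a systematic predisposition in a direction" concrete, here is a rough sketch of how you might measure it for something like the hiring example. The group names and outcomes are invented, and a difference in selection rates is just one simple measure I'm assuming for illustration, not the only way to quantify bias.

```python
def selection_rate(decisions):
    """Fraction of positive decisions (1 = selected, 0 = rejected)."""
    return sum(decisions) / len(decisions)

def rate_gap(decisions_by_group, a, b):
    """Signed gap in selection rates: positive means group `a`
    is selected more often than group `b`."""
    return selection_rate(decisions_by_group[a]) - selection_rate(decisions_by_group[b])

# Hypothetical model outputs on otherwise comparable resumes.
outcomes = {
    "group_x": [1, 1, 1, 0, 1, 0, 1, 1],  # 6 of 8 selected
    "group_y": [1, 0, 0, 0, 1, 0, 0, 0],  # 2 of 8 selected
}
gap = rate_gap(outcomes, "group_x", "group_y")
```

The gap has a sign, and that's the point: you can't compute it, let alone shrink it, without saying which two groups you're comparing. A single offensive output wouldn't show up in this number at all.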
----
Part of the reason why I've been assuming that you're against de-biasing in general is that you keep saying stuff like "the idea that the Internet biases against certain groups is itself a biased statement."
That is nonsensical. It seems like you're suggesting that companies should be correcting against bias without determining what the direction of that bias is or who it affects. How exactly should they do that? What does it mean to correct a systemic skew without identifying what the skew is?
Again, look at the Reddit example. Yes, you can find offensive stuff on Reddit for everyone: atheists, vegans, Conservatives, Liberals, whoever. But de-biasing is not about whether or not something is offensive, it's about identifying and correcting a predisposition or tendency towards certain attitudes and thoughts. And that necessarily requires calling out what those predispositions are -- so you're going to end up with statements like "Reddit is biased against Conservative/Libertarian philosophies overall, on average."
And if somebody jumps out and says, "Reddit is biased against everyone, it's itself biased for you to call out that specific bias" -- then there's no response to that other than that's just not how any of this works. If you go into a conversation about data bias saying that it's inappropriate for us to identify specific biases, then you are effectively arguing against reducing bias, regardless of what else you say.
And even more to the point, even if that wasn't true -- fixing any systemic problem starts with triaging and identifying specific areas you want to target first. It is so weird to hear someone say, "I've never claimed we shouldn't fix the problem, but pointing out specific groups that the problem applies to is wrong."
No, it's identifying problem areas that are high impact: a normal, good thing for researchers to do. To the extent that it's "biased", it's the kind of bias that we want in pretty much every field and on every subject -- we want researchers and debuggers to bias themselves towards focusing on high-impact areas first when they're fixing problems. And we want them to call out specific areas that they think deserve attention. That's what triaging is.
> : systematic error introduced into sampling or testing by selecting or encouraging one outcome or answer over others
I don't understand what you're trying to get at here. What I'm telling you is just the definition of bias that everybody uses. This is not a debatable thing, this is not some weird interpretation of bias that I made up :)
Bias is a directional skew. It just is, that is what everybody means by the word.
----
> From these tests I conclude that the systemic bias is against men, white people, and Republicans, and in favor of women, black people, and Democrats. A fair conclusion?
And we get to the heart of it, same as always. It turns out that when you prod these "philosophical" discussions what people actually mean is: "I disagree that those minority groups are oppressed, actually I'm oppressed."
It's never actually about the philosophy, you just disagree about which groups Mozilla is trying to help. It's not about the "bias", it's about which groups Mozilla thinks that LLMs demonstrate bias against. It's not about the process or the theory, it's about who the process and the theory are being applied to.
Which, whatever, you disagree with Mozilla's perspective on how the data is biased and you think that actually the bias is against you. You could save us all a lot of time by starting with that claim instead of dressing it up as some kind of nonsensical take about methodology in correcting data skew.
----
Anyway, to your nonsense gotcha questions:
1. Sex discrimination is illegal, it would be wildly inappropriate for a police department to rely on an AI that dismissed a suspect because they were a woman.
2. LLMs don't get used to choose basketball players? But if they were, yeah, it would be a problem if an LLM dismissed a resume (again, not really how basketball works) because someone had a white-sounding name.
3. I literally brought up the example of Reddit. That's not a gotcha, I brought up that if you built an LLM on Reddit data it would be biased towards calling Republicans racist. Now if you don't think that's an unfair bias and you think Republicans are actually more likely to be racist, then that's your words, not mine. My words were that if you trained an LLM on a primarily Liberal forum, it would be biased against Conservatives and there would likely be alignment training you'd need to do on that LLM.
----
Now, are any of those larger issues than systemic racism? I would argue no. I would argue that Mozilla is still absolutely completely correct in triaging these issues and pointing out the most harmful effects of AI today. We don't have a lot of examples of LLMs systematically harming specifically Republicans.
And I'm going to go out on a limb and say that's really the biggest thing you have issue with here -- you have issue with Mozilla's prioritization of harms and prioritization of biases to focus on, because you don't think the biases Mozilla has pointed out are actually a big deal.
You brought up those gotcha questions to try and say, "look, bias against white male Conservatives is where the most harm actually occurs". And that's actually the position that we disagree on. All of the "philosophy" about bias in models is just distraction from that disagreement.
There's an assumption there that a neutral AI can exist, but I think a lot of people would challenge that central assumption. No set of training data can be truly neutral or unbiased. It may be well balanced between certain groups, but the choice of which groups to balance, how much to balance them, and the choice to add this balance in the first place are all ideological decisions that stem from certain beliefs and ideas.
The AIs that get built will reflect the ideologies and values of the people who build them. It is therefore better for us ethically to be conscious of the values we are injecting.