>I do still think that the men/women bias and the Democrat/Republican bias both make more sense as originating in moderators favoring one group over the other, since none of these are typically used as an insult by themselves.
They may not be used as insults themselves, but they are used more in hate speech.
Republicans are generally more opposed to the idea of "hate speech" as a category of speech and are therefore less likely to identify any speech as hate speech. Democrats have embraced it more as a concept, are more likely to label something as hate speech, and are more likely to think that type of speech is bad. It therefore seems likely that Democrats would use what a neutral observer would categorize as hate speech less than Republicans due to self-censorship. That would result in the word "Republicans" appearing in hate speech less often than "Democrats" because the hate-speech-infused insults will be targeting the opposite party.
How am I blaming the victim? I am simply pointing out that language can indicate bias without necessarily being biased itself.
I am guessing that if we applied a similar model to Russian and English, the model would indicate there is an inherent bias against the West in Russian and a bias against Russia in English. That is all we're seeing here. It isn't actually indicating anything about the language. It is telling us about who uses the language and how they use it.
Words, phrases, and linguistic approaches that are generally coded as conservative will be more likely to denigrate liberals and vice versa. Conservative speech will be more likely to be flagged as hate speech because conservatives by and large care less about being PC. It is important to reiterate that this does not mean conservatives are necessarily any more racist. Their speech just correlates more with racist speech because less effort is put into avoiding that correlation.
> I am guessing that if we applied a similar model to Russian and English, the model would indicate there is an inherent bias against the West in Russian and a bias against Russia in English. That is all we're seeing here. It isn't actually indicating anything about the language. It is telling us about who uses the language and how they use it.
So if your argument is that the model has been trained on a collection of information about what is hate speech assembled by liberals, then I can see it might be possible.
But if your argument is that (speculatively) Republicans engage more in hate speech, and that bad things said about Republicans are therefore not detected as hate speech by the model, the jump is rather far.
>So if your argument is that the model has been trained on a collection of information about what is hate speech assembled by liberals, then I can see it might be possible.
It doesn't specifically need to be "assembled by liberals" to have a liberal bias. Liberal people are more likely than conservatives to categorize anything as hate speech. Liberals think being PC is important. Conservatives are generally dismissive of being PC. Even if there is no inherent bias in the makeup of this hypothetical review panel, the panel will produce rulings that are more in line with liberal thought because conservatives are less likely to take an active lead in labeling hate speech.
>But if your argument is that (speculatively) Republicans engage more in hate speech, and that bad things said about Republicans are therefore not detected as hate speech by the model, the jump is rather far.
My argument is that these systems can't actually identify hate speech. The question isn't whether Republicans engage in hate speech more frequently. They likely engage in speech that resembles hate speech more frequently because they don't care about being PC.
Usage of the term "Latinx" is an example. I have heard valid arguments for why people should or shouldn't use that term, but its usage is currently much more common in liberal circles. Therefore a phrase using "latinx" instead of "latino" is going to be less correlated with hate speech because racists just aren't using "latinx".
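To make the correlation point concrete, here is a rough sketch of what I mean. Everything in it is made up for illustration: the tiny corpus, the labels, and the placeholder tokens `variant_a` and `variant_b` (two spellings of the same word used by different groups). It is not how any real moderation model works; it just counts how often each token shows up in flagged text.

```python
from collections import Counter

# Hypothetical toy corpus of (tokens, flagged) pairs. The data is invented:
# "variant_a" and "variant_b" are two spellings of the same word, and
# "insult" / "neutral" stand in for the rest of the message.
corpus = [
    (["variant_a", "insult"], True),
    (["variant_a", "insult"], True),
    (["variant_a", "neutral"], False),
    (["variant_b", "neutral"], False),
    (["variant_b", "neutral"], False),
    (["variant_b", "insult"], True),
]

def flag_rate(corpus):
    """Return, for each token, the fraction of messages containing it that were flagged."""
    seen = Counter()
    flagged = Counter()
    for tokens, is_flagged in corpus:
        for tok in set(tokens):  # count each token once per message
            seen[tok] += 1
            if is_flagged:
                flagged[tok] += 1
    return {tok: flagged[tok] / seen[tok] for tok in seen}

rates = flag_rate(corpus)
for tok in sorted(rates):
    print(tok, round(rates[tok], 2))
```

In this made-up data the two spellings mean the same thing, but `variant_a` ends up with a higher flag rate than `variant_b` purely because of who happened to use it alongside insults. A model trained on counts like these would pick up that difference even though neither spelling is an insult by itself.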