One of our great facets as humans is the ability to sort through information and come to opinions based on it.
We come pre-equipped with our own API for critical reasoning, honed by thousands of years of evolution, and with a training set of everything we've ever done and seen.
Why would I want to use a copy of that, which at best will only replicate human biases?
And which risks sidelining perfectly valid viewpoints? Does something have to be expressed in college-educated grammar even to be read, with this in place? What are the consequences of that?
My first instinct, and I hope yours, was to push back the slider.
EDIT: I pasted the first paragraph of the Google diversity memo into it, and it got a "10% likely to be perceived as toxic" score. Make of that what you will.
That's definitely a management problem. Someone decided that since the product was ready, it should be announced. They're not stupid and understand the current climate very well, but chose to announce this peculiar tool anyway. Well, they'll certainly get plenty of feedback in the coming days.
> We come pre-equipped with our own API for critical reasoning, honed by thousands of years of evolution, and with a training set of everything we've ever done and seen.
> Why would I want to use a copy of that, that at best will only replicate human biases?
Scale. Who has time to read thousands of trollish youtube comments to find the handful of insightful gems?
It's an interesting concept, but falls short when comments are sarcastic or contain contextualized information.
It seems like this is more of a "word filter" than anything. I mean, sure, it's pretty obvious a comment will be toxic if it contains "asshole" in it. Here are some examples where the labeling fails. Some of these I got from real comments and articles on CNN or Breitbart.
# Toxicity too low
34% toxic: Obama was born in Africa and he should move back there.
40% toxic: George Soros and his Jewish sympathizers want to destabilize our government.
25% toxic: Breitbart is run by jews - as is all mass media. As the saying goes, "the best way to control the opposition is to lead it".
47% toxic: Why do you expect anyone to take you seriously when you don't take yourself seriously.
# Toxicity too high
51% toxic: Ivanka is arguably the most powerful Jewish woman in America today.
51% toxic: US Muslims and Jews are strengthening bonds amid acts of bigotry.
48% toxic: The problem of income inequality is not between Black America and White America but between black haves and black have-nots.
87% toxic: I lost my grip and fell on my ass. Not a great bike.
This was my perception as well, after trying a few "easy" cases and then trying to find "tricky" ones to check how well it'll handle them.
This entire paragraph is classified as 68% likely to be perceived as toxic only because it contains the word "idiots". If you replace it with another word like "dragons", it's down to 13%:
Hi, I don't completely agree with your point about
idiots being commonplace. First of all, that is a
derogatory word with no specific meaning. If you have
a problem with their opinion, it's more constructive to
point out ways in which you disagree, so we can have a
more civil discourse.
It's not seeing the context in which a word is used; it's reacting to the word itself.
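If you want to reproduce this kind of single-word substitution test yourself, it's easy to script against the Perspective API. This is a sketch based on my reading of the public docs -- the endpoint, key handling, and response shape are assumptions to verify against the current documentation, and you need your own API key:

```python
import json
import urllib.request

# Assumed endpoint/key handling from the public Perspective API docs --
# check the current documentation before relying on this.
API_URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
           "comments:analyze?key=YOUR_API_KEY")

def build_request(text):
    """Build the JSON body for a TOXICITY analysis request."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }

def toxicity(text):
    """POST the comment and return the summary toxicity probability."""
    data = json.dumps(build_request(text)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=data,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

paragraph = ("Hi, I don't completely agree with your point about "
             "{} being commonplace.")
# Comparing toxicity(paragraph.format("idiots")) with
# toxicity(paragraph.format("dragons")) should reproduce the kind of
# 68%-vs-13% gap described above.
```

Swapping one word and diffing the two scores is the whole experiment; the gap tells you how much the model is keying on the word rather than the argument.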
It should be noted that the % output is a probability that the text is toxic (i.e. the output of a logistic regression), not a proportionate quantification, which slightly changes how the value should be interpreted.
For what it's worth, the ones at the top are only contextually toxic to you, either because you believe the opinions are ill-founded, or because you don't know the intent (regarding "why do you expect anyone to take you seriously..."). Knowing no facts, "George Soros and his Jewish sympathizers want to destabilize our government." is a straightforward and polite way to state an opinion.
As for the too-high ones, yeah; that bike review is rated hilariously poorly.
Frankly, this API cannot (should not?) be expected to determine the truthfulness of a statement, nor how offended you're likely to be by somebody's honest opinion.
> Knowing no facts, "George Soros and his Jewish sympathizers want to destabilize our government." is a straightforward and polite way to state an opinion.
That sounds reasonable, but then why is the statement "Ivanka is arguably the most powerful Jewish woman in America today" rated twice as toxic (51% vs. 25%)?
Depends on context. I can easily imagine it used as an attack, like: “Why aren’t you defending the point you made earlier? It seems you don’t take your own argument seriously. Why do you expect…”
The obvious next step is to make an API that takes an input text and minimally modifies it to evade the filter, adversarial-example style, and then package it as a Chrome plugin, like Grammarly.
Really not a fan of these types of technology: there are the subtleties of language such as sarcasm and irony, then you've got approved narratives and taboo subjects, and those times when the minority is right and is under attack by the mob.
I'd only support this tech as a filter for human moderators and not as an automated system.
Machine learning is getting a bit out of hand. Not in the sense of AI replacing humans (that's not happening), but in terms of hype and over-application.
This API may represent an attempted technological solution to a social problem. Those have a very poor track record.
As a filter for human mods it might be an interesting incremental improvement. But Google needs this to work at a scale where I can't see deployment staying limited to that.
> Please put yourself in the shoes of women, minorities, and LGBT people
Extracted from the "US Election" demo on the page.
Edit: for the sake of clarity, when typed into the textbox below, "snowflake" gets 58% toxicity and "minorities" gets 29%. Still, on the slider, "snowflake" appears earlier, with a circle (which would correspond to an OK comment), and "minorities" much later, with a square (which would be a dubious comment).
Edit 2: testing in the textbox, commas and periods get you a lower score (less toxic), while exclamation marks, lack of punctuation, and "bad writing" (not just grammar, but a generally bad style, you could say) get you a higher score.
Edit 3: the final gem I'll take to my professors at university: "FAT is an old filesystem!" gets 90% :P
Secret -- caps also seem to raise the score.
Actually... "fat" 88%, "fat_" 41%... that's it, I got tired of this.
I pasted this comment from the recent diversity manifesto:
> I’m simply stating that the distribution of preferences and abilities of men and women differ in part due to biological causes and that these differences may explain why we don’t see equal representation of women in tech and leadership.
The page says it's 2% toxic. What does that mean? 2% of the population would find it toxic? There is a 2% chance someone would find it toxic? The API is 2% confident that it is toxic? And more importantly, toxic in the sense that it is verbal harassment? Or just plain illogical? Or logically sound but with an absurd premise?
I suspect that it is only able to detect more emotional comments, but will fail to detect utterly unfounded, totally disproved arguments that are communicated under the veil of reason.
The page says that it means 2% of the people asked would find it toxic. The first two options you listed are the same thing: if 2% of the population would find it toxic, there's a 2% chance it will be found toxic when you ask a random person. Your last option (2% confidence that it is toxic) assumes an objective measure of toxicity that doesn't exist, and that the API is not trying to provide.
I don't use Facebook and I get by just fine. However, when I tried to switch to bing for a while, I could barely use the internet. Google is much worse than Facebook can ever hope to be.
Try startpage or duckduckgo for search. (The former is essentially a proxy for Google.)
Fastmail or protonmail to get away from GMail. One of the hard things about leaving Google for email is that even if you switch, the others you communicate with are likely still there, so Google is still getting that information.
I tried switching to duckduckgo. It fell short compared to Google on everything I gave it. I plan on trying again when I have less on my plate. We all know when that will be.
This is cool, but it has some inherent biases. If you type only "Trump", it suggests that there's a 42% chance that your comment could be perceived as toxic. If you type only "Clinton" there's a 14% chance.
That being said, I think there's huge potential to use AI/ML in this way to improve our ability to communicate less toxically. I've seen some research from Google investigating biases in AI/ML outcomes, so I'm excited to see what develops.
It's selling the thing short to say that it has "some" bias. It is quite literally an automatic bias filter. If you happen to love Google's particular biases, that may be a good thing. Otherwise, not good.
The terrifying thing is this is likely to become wedged into various internet sites and services where users who don't align with Google's particular biases are effectively forced to conform to them. They are really pressing this sort of power lately and I'm not having it.
You're jumping to a conclusion that it represents Google's biases. Dan Luu pointed out that it rates "Black Lives Matter" as relatively toxic, but treats "White lives matter" as low on toxicity. I don't think that represents left-wing bias. Really it's just a shitty experiment that wasn't ready to be released to the world.
That's not how it works: it's a trained neural network which presumably was trained with as little bias as possible. Try other statements related to the two candidates and you'll find your statement is patently false.
On the contrary, it was almost certainly computed using supervised training. Some set of people must have selected and labelled the training data. Their biases are cooked directly into the resulting software.
Actually neural networks are notorious for having biases. It's ignorant to think that just because it's a machine making the decision instead of a human that it's automatically a fair decision. Google is actually researching the problem of biases in neural networks: https://research.google.com/bigpicture/attacking-discriminat...
The word "presumably" is probably the point where our interpretations diverge. I don't trust Google's black box AI to be both intentionally and effectively trained in a neutral manner. Further, I don't even think neutrality can exist within subjective filtering as the concept of neutrality itself is perceptually relative.
It's a bit weird on single words. I tried some local politicians and the politically correct ones scored worse than the alt-right ones. "Klu Klux Klan" was not a recognized language. Etcetera.
I ran both our comments through it:
11% toxic for you, 25% for mine. Both contain some trigger terms.
Is that bias if 42% of the comments mentioning Trump in their database are toxic, and likewise for Clinton? Those comments could be equally split between detractors and supporters.
> This model was trained by asking people to rate internet comments on a scale from "Very toxic" to "Very healthy" contribution. Toxic is defined as... "a rude, disrespectful, or unreasonable comment that is likely to make you leave a discussion."
> asking people
Gotta wonder: which people?
The examples are good, though; I just hope the general results are consistent with that quality level.
That would be the concern. My impression from poking at the API is that it doesn't seem to have any topical biases. The accuracy is nonetheless hard to judge in the 1-40% range.
For example, the API rates this comment as 21% likely to be perceived as "toxic". The use of quotes around the word "toxic" increases the likelihood.
I recently developed a neural network model which can predict the reaction to a given text/comment with reasonably low error (I'll be open-sourcing the model soon).
There are a few caveats with using these approaches:
1) Toxicity is heavily contextual, not just by topic (as the demo texts indicate) but also by source; at the risk of starting a political debate, a comment that would be considered toxic by the NYT/Guardian (i.e. the sources Google partnered with) may not be regarded as toxic on conservative sites. That makes training a model much more difficult, but it's necessary in order to get an unbiased, heterogeneous sample.
2) When looking only at comments, there's a selection bias toward "readable" comments, while anyone who has played online games knows that toxic commentary is often less "Your wrong" and more "lol kill urself :D".
3) Neural networks still have difficulty with sarcastic comments and could misconstrue sarcasm as toxic, which users on Hacker News would absolutely never believe.
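The heterogeneity problem in point 1 is partly a sampling problem. A minimal sketch of one mitigation -- drawing the same number of labeled comments from each source so that no single outlet's norms dominate the training set (the corpus here is invented for illustration):

```python
import random
from collections import defaultdict

def balanced_sample(labeled_comments, per_source, seed=0):
    """Draw the same number of (text, toxic_label) pairs from each source,
    so the model doesn't just learn one site's idea of 'toxic'."""
    rng = random.Random(seed)
    by_source = defaultdict(list)
    for source, text, label in labeled_comments:
        by_source[source].append((text, label))
    sample = []
    for source, items in sorted(by_source.items()):
        k = min(per_source, len(items))
        sample.extend(rng.sample(items, k))
    return sample

# Toy corpus: (source, comment text, toxic label)
corpus = [
    ("nyt", "Your wrong", 1),
    ("nyt", "Interesting point, thanks", 0),
    ("nyt", "This argument is nonsense", 1),
    ("gamechat", "lol kill urself :D", 1),
    ("gamechat", "gg well played", 0),
]
print(len(balanced_sample(corpus, per_source=2)))  # 2 per source -> 4
```

This only balances volume across sources, not each source's labeling standards; per-source raters, as point 1 implies, would still disagree about what "toxic" means.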
I don't believe it's possible to be unbiased - whether as a news site or as a moderation filter. It's better to be aware of your biases than to fool yourself into believing they don't exist.
I agree with this. I'd like to see an AI that detects which biases a person has. The biases could be associated with short tags which could be displayed to commenters in a UI to make people aware of the biases they have.
"What we must fight for is to safeguard the existence and reproduction of our race and our people, the sustenance of our children and the purity of our blood, the freedom and independence of the fatherland, so that our people may mature for the fulfillment of the mission allotted it by the creator of the universe." -Adolf Hitler, Mein Kampf... 12% likely to be perceived as toxic.
"Injustice anywhere is a threat to justice everywhere." -Martin Luther King Jr., letter from Birmingham jail... 40% likely to be perceived as toxic.
You might even argue that Hitler's statement is in fact not very toxic, that MLK is actively trying to cause problems for injustice and as long as nobody is making Hitler think the existence of his people is at risk he won't do anything, and so the API is accurately measuring toxicity. The question is whether a non-toxic, anodyne discourse is what you want. Peace for our time!
"Islamic terrorism is a serious threat" 85% toxic
"Homophobic tweets are your bete-noir" 90% toxic
"It's incredibly annoying when you try to open the pickle jar and the lid is stuck" 66% toxic
"I hate oversleeping" 91% toxic
"Small fonts drive me crazy" 75% toxic
"Small fonts are an annoyance" 66% toxic
"Small fonts are an irritation" 25% toxic
"Small fonts are the spawn of Satan" 88% toxic
"Small fonts are symptomatic of the decline of western civilization" 25% toxic
"Small fonts are the refuge of scoundrels" 80% toxic
"Small fonts will lead to global conflict" 7% toxic
"Small fonts will lead to a global fracas" 12% toxic
"Small fonts will lead to a global hoopla" 25% toxic
"Small fonts will lead to the apocalypse" 11% toxic
"Small fonts are bad, but not as bad as murder" 71% toxic
"Republican" 27% toxic
"Democrat" 16% toxic
"Trump" 42% toxic
"Obama" 26% toxic
"anti-abortion" 51% toxic
"pro-choice" 5% toxic
"Mormon" 55% toxic
"Atheist" 50% toxic
> Trying out its Writing Experiment
> Google is evil.
70% likely to be perceived as "toxic"
> Google is good.
4% likely to be perceived as "toxic"
> Google is god.
21% likely to be perceived as "toxic"
The content above is considered 51% likely to be perceived as "toxic".
These results are a bit scary. For the U.S. election category, the only comment in the "least toxic" set that really took a stand on anything said: "Too much media influence." All the other comments were either meta-comments or along the lines of let's all hold hands and sing kumbaya.
I agree we need to weed out toxic comments, but human-moderated systems are the best. Hacker News has some of the best discussions that I read online. Even when I vehemently disagree with someone's point it's still worded in a respectful tone.
Do we really? Isn't collapsing, filtering, voting enough? Especially once you have a "I don't want to see any posts by this user" function a reader can quickly purge anything that they don't want to read. Add aggregation ("hide things that person X who shares my views has hidden") if needed.
Empower people to make individual decisions instead of enforcing things on the platform level.
There are some comments that add literally no value. I've received replies on other sites saying things along the lines of "I want to kill you." At that point we're past freedom of speech; it's almost the equivalent of yelling "fire" in a crowded theater.
Tools like this will always do more harm than good. False positives will always be sky-high. On one hand it will obstruct legitimate discussion, and on the other it's trivial to game such systems. Toxicity won't be stopped but magnified, by encouraging offenders to embed it in benign words and sentences. Quick examples:
10% Holocaust was amazing. We should do it again sometimes.
12% Would you like to buy some knee grows?
I really wonder whether hiding these comments would simply lead to even more echo chamber effects. Censoring (or "hiding") online speech is a fine line to walk.
If you let people see their toxicity rating, they'll just learn to game the system. Of course, more indirect or poetic insults might be an improvement.
"George Soros is influencing the media": 6% likely to be perceived as toxic.
"(((George Soros))) is influencing the media": 2% likely to be perceived as toxic.
This thing literally rewards anti-Semitic coded messaging: adding the triple parentheses makes the statement three times less likely to be rated toxic. If it ignored punctuation I could at least understand that on a technical level (although it would be the wrong technical decision for exactly this reason), but this is actively wrong.
(25%) Let’s dispel with this fiction that potatoes don't know what they're doing. They know exactly what they're doing. They're trying to change this country.
We can easily reduce it by half: "Let’s dispel with this fantastic fiction that agreeable potatoes don't know what they're doing since they know exactly what they're doing: they're trying to change this wonderful, beautiful country." (11%)
Cool. I look forward to when something like this can be a plugin.
Given that we know people sell Reddit (and HN?) usernames so that others can mass-comment, it'd be nice to have something to combat the low-hanging fruit such as the examples given on this page.
I don't think either of these contributes anything to any conversation:
> If they voted for Hilary they are idiots
> Screw you trump supporters
If you do, well, we might be visiting different websites -- one that implements this tech (here?), and one that doesn't (4chan).
I don't see how this can work well. Toxicity is strongly dependent on context. What is considered toxic in the US may not be considered toxic in other countries. Some totally appropriate conversations between friends could be perceived as toxic if exposed publicly.
And now we need an adversarial bot that performs substitutions with a thesaurus (including urban dictionary and similar slang) until it finds a result that rates at a desired toxicity level.
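A toy version of that bot fits in a few lines. The scorer below is a crude keyword stand-in for the real API, and the thesaurus is a made-up stub -- a real attack would query the API itself and use a proper synonym source:

```python
# Greedy thesaurus substitution: swap words one at a time, keeping any
# swap that lowers the score, until the text dips below the target.
BAD_WORDS = {"idiots": 0.6, "stupid": 0.5, "hate": 0.4}
THESAURUS = {"idiots": ["dragons"], "stupid": ["unwise"], "hate": ["dislike"]}

def score(text):
    """Stand-in scorer: sum of per-keyword penalties, capped at 1.0."""
    return min(1.0, sum(BAD_WORDS.get(w, 0.0) for w in text.lower().split()))

def evade(text, target=0.3):
    """Swap flagged words for synonyms until the score drops below target."""
    words = text.split()
    for i, w in enumerate(words):
        if score(" ".join(words)) < target:
            break  # already under the radar
        for alt in THESAURUS.get(w.lower(), []):
            trial = words[:i] + [alt] + words[i + 1:]
            if score(" ".join(trial)) < score(" ".join(words)):
                words = trial
                break
    return " ".join(words)

print(evade("only idiots hate small fonts"))
# -> "only dragons dislike small fonts"
```

The dragons/idiots substitution discussed earlier in the thread suggests the real model would fall to exactly this kind of greedy search, which is why publishing the score to commenters effectively hands them a gradient to descend.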
The differences in abilities, knowledge and salaries between wonderful, fantastic men and not-so-beautiful women can be attributed solely to biological causes.
1% likely to be perceived as "toxic"
Frankly, I'm amazed Google released it and tries to advertise it as a "product" that "works".
This is really neat, especially since they have the API results in the page so you can test out how toxic a phrase is.
It invites a game: make the most toxic comment that can fly under the radar. If they started using this on YouTube comments, Reddit, etc., at least the comments would be more original.
I got a 30% toxicity with:
"I believe the intelligence of climate change deniers is likely to be zero. Furthermore, they have the body odor of a kind of ogre."
"Some scientists have discovered that the intelligence of climate change deniers is highly likely to be statistically indistinguishable from that of a randomized sample of invertebrates. This is likely due to similarities in their biological and chemical composition."
That seems right, though. "10% toxic" is not "10% offensive"; it's a judgement of how likely the comment is to inhibit further rational discussion. Presenting as malodorous a slight as you like, in well-laid-out language and with clear thought, _is_ less likely to be "toxic" -- people can respond. It's when you get back-and-forth ad hominems that the conversation has ceased being functional beyond catharsis and "conflict entertainment".
Edit: Oh yeah, and your direct maternal ancestor has a suffusion of scent akin to the fruiting of the Sambucus plant!
Inherent in this is the notion that toxicity is bad. It isn't. We grow stronger through exposure to toxicity in our environment.
It may seem glib to equate chemicals and comments but it's not. There are many people who have become hyper-fragile to speech they disagree with. That is not good mental or emotional health.
I know there's going to be a lot of pushback on this because HN is sensitive to censorship, but let's try to look at it a little more objectively than that. I'd like to draw on one example, one that is near and dear to many hearts in the US and abroad: the US election.
Throughout the course of the election, opinions and comments were being shared all over the place. Twitter, Facebook, here on HN, bathroom stalls, news broadcasts and websites, comments on blogs and videos. There was no shortage of opinions. This is great, and showcases the power of the internet in its capability to transmit and receive all types of information. But is it not important how an opinion is formed? Surely you wouldn't enjoy or find valuable a blog post that was sparse on details, proof or a coherent line of thinking. And yet, there it was: in every corner of the internet, anyone who could operate an internet device could share their opinion on the matter. It doesn't matter if they spent 1 second on their response, or 1 hour. Most comments received the same amount of attention and value.
The question is, should all thoughts and opinions be valued the same when information is in incredible supply? Most of us don't think so, and we've shown that by creating voting systems which allow for humans to filter out the things we find to be deconstructive. But we don't really stop there, do we? Humans are also incredibly biased on average: you see it here, you see it a lot on reddit. People vote things down not on the merit of the level of attention the commenter gave to their response, but generally on whether or not they agree with the sentiment expressed by the commenter.
How many arguments has this bias fuelled? I wonder how many people have been pushed further from a centrist perspective because of the shaming and bashing that goes on in online threads.
I think Hacker News is a great example of humans doing much better than average at filtering out strictly toxic comments (and the mods are certainly at least partially to thank!). We're really lucky to be able to have people engage in conversations with opposing views here, and also to see many different perspectives treated with the same level of respect. But even here, quite often we're prevented from having discussions that are truly political, because of the toxicity that arises. And I have to say I think I've noticed an increase in the past couple of years.
There aren't a lot of immediately obvious solutions to this problem, but I propose that AI intervention isn't the worst one, and may be the best, even compared to humans. I'm going to give Google the recognition they deserve for this service. I think an increase in this approach to online conversation could dramatically change the way we choose to engage each other, and generally lead to more positive perceptions of one another -- something we could all use a little help with.
Edit: I will say, however, that this needs to work. If it's not doing its job correctly, or well enough, it could lead to problems which I don't need to address here.