
MKBHD just released a video where he used this and was happy with the results but hated that it was even needed.

https://youtu.be/1Cw-vODp-8Y




Linus (Tech Tips) in February:

> YouTube’s spam problem has gotten out of hand… And it’s up to the community once again to do what YouTube can’t. Thankfully, the community has delivered…

* https://www.youtube.com/watch?v=zo_uoFI1WXM


Wow, the tool found that 30% of his video's comments were spam.


Which part of that is surprising to you? That 30% of YT comments are spam, or that the software worked to that degree?


Not the OP, but I am personally surprised by many things.

- YouTube's comment system is horrible at dealing with spam
- A tool made by a random person online is able to find and remove so many of these comments, meanwhile YouTube with WAAAAYYYY more resources, data, people, etc... is somehow not
- The fact that the spam problem is so bad that tech YouTubers are seeing tools like this remove around 30% of the comments on their videos is insane.

Overall I barely even look at the comment section on YouTube anymore because of how big a mess it is. It's refreshing to look at the comments on Linus Tech Tips videos, as they are free from most spam thanks to this tool. It's surprising how long this problem has been going on, and how much it keeps worsening. It seems crazy that Google has let it go unchecked for so long.


It works precisely because it is a tool made by a random person, so spammers are not trying to avoid it.

Such is the curse of spam: they get approximately infinite tries to post spam. We don’t really know how many spam comments Google is preventing from being posted, but any anti-spam measure they introduce will quickly result in the spammers changing tactics, probably within minutes.


Yet some of the tactics of spammers/scammers are so obvious that it's surprising YouTube doesn't provide simple solutions for at least some of these cases.

A common example, as MKBHD mentioned, is scammers impersonating channel owners in the comments.

Why can't content creators set an option to auto-flag other users that use their name and profile picture (to some degree of similarity) in comments under their video?
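To make the idea concrete, here is a minimal sketch of the name half of such a check. It is not how YouTube or the tool in the video works; `normalize`, `looks_like_impersonation` and the 0.85 threshold are all made up for illustration, and profile-picture similarity is left out entirely.

```python
import difflib
import unicodedata


def normalize(name: str) -> str:
    """Fold case and Unicode compatibility forms, then keep only letters and digits."""
    folded = unicodedata.normalize("NFKC", name).casefold()
    return "".join(ch for ch in folded if ch.isalnum())


def looks_like_impersonation(commenter: str, owner: str, threshold: float = 0.85) -> bool:
    """Flag a commenter whose display name is nearly identical to the channel owner's.

    Hypothetical helper: the threshold is an arbitrary assumption, and a real
    system would first exclude the owner's own account by channel ID.
    """
    a, b = normalize(commenter), normalize(owner)
    if not a or not b:
        return False
    return difflib.SequenceMatcher(None, a, b).ratio() >= threshold


owner_name = "Marques Brownlee"
for candidate in ["Marques Brownlee ✅", "Marques BrownIee", "Random Viewer"]:
    print(candidate, looks_like_impersonation(candidate, owner_name))
```

This only catches cosmetic tricks (added emoji or punctuation, swapped look-alike letters). Flagging rather than deleting keeps a human in the loop for legitimate users who happen to share a name.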


Most of the spam left over is super-obvious. It should not be hard to remove it. A filter based on Bayesian probability would wipe out most of them with a high degree of certainty (i.e. very few false positives).
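For anyone who hasn't seen one, a Bayesian filter can be sketched in a few dozen lines. This is a toy naive Bayes classifier with add-one smoothing, not the tool from the video and not anything YouTube runs; the class name, the training comments and the labels are all invented for illustration, and a real filter would need far more data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSpamFilter:
    """Toy naive Bayes text classifier with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        if min(self.doc_counts.values()) == 0:
            raise ValueError("train on at least one spam and one ham comment first")
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Start from the log prior, then add a log likelihood per token.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # The extra +1 in the denominator reserves mass for unseen words.
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into P(spam | text), clamped to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


# Made-up training data, purely illustrative.
spam_filter = NaiveBayesSpamFilter()
for comment in [
    "message me on telegram to claim your prize",
    "i made huge profit with crypto contact me on whatsapp",
    "congratulations you won a giveaway message me on telegram",
]:
    spam_filter.train(comment, "spam")
for comment in [
    "great video the camera comparison was really helpful",
    "i disagree about the battery life but nice review",
    "thanks for the detailed review of the phone",
]:
    spam_filter.train(comment, "ham")

print(spam_filter.spam_probability("message me on telegram for crypto profit"))
print(spam_filter.spam_probability("nice review of the camera"))
```

To keep false positives low you would only auto-remove above a high probability cutoff and send the uncertain middle to manual review.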


> YouTube with WAAAAYYYY more resources, data, people

Alphabet, Google and, to a lesser extent, YouTube have a lot of resources, but they might not invest them in this direction. A friend recently worked on a team responsible for understanding creators, and it was alarming how few people were involved. He left because it was poorly managed. They barely scratched the surface on engagement, and weren't even able to measure the inconsistency of audience engagement (creators' biggest point of contention).

They likely have fewer people dedicated to fighting spam overall, let alone in comments, than you’d expect. This is a delicate interaction to highlight because it involves both account creation (for accounts that don’t upload videos, so they likely aren’t prioritised) and identity editing, something no one at YouTube cares much about (you get plates if you matter, that’s operations); all the bad things happen off YouTube…

The other comment about adversarial interactions with spam is key, and likely more important, but we need to remind everyone (including people on HN who have likely experienced this at their job) that “it’s a large company” doesn’t translate to “they have large teams working on this” but to “they have many other problems prioritised over this”.


> “it’s a large company” doesn’t translate to “they have large teams working on this” but “they have many other problems prioritised over this”.

I don't think anyone needs to be reminded of that. "Company X with way more resources is failing at basic tasks" doesn't signal a misunderstanding of what those resources are used for, it is criticism of that company's priorities.

The fact is that a company whose profits grow by billions every year is not fixing a very obvious problem on one of its biggest platforms. They have more than enough resources to fix it.


Their profits would only be relevant if they underpaid the staff that could fix that problem (machine-learning specialists, because you can’t imagine addressing this by hand). They don’t. Whatever lack of resources they might have isn’t fixable with money alone.

They prioritise other issues; comments, and even more so comments on comments, are nowhere near the core feature of the platform. They have, seemingly, addressed copyright infringement, people gaming the recommendation system and first-level comments: things that the same creators were loudly complaining about earlier. That sounds like a reasonable prioritisation system. The phenomenon that has been described is apparently new, so the issue is presumably that they aren’t able to react to new threats rapidly. That would be a new structural concern for them, and not something that they would be expected to have fixed as a large company.


One man's spam is another man's engagement metrics.


Google is spending its resources in more important things like spying on everything you do! They ain't got no time for despamming!


There's also anniversary doodles.


What, you want Google's salaried visual designers (that they would have even with different priorities) to try their hand at solving YouTube spam-filter scaling?


There's coding to them too.


Leetcode caused all this!!!


Not really your point but probably way higher than 30% of comments are spam. These are just the comments that made it through the filter. I wouldn't be surprised if 90% or 95% of comments are spam but most of them get filtered out by Google.
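A quick back-of-the-envelope check of what those numbers would imply. This assumes the 30% figure is accurate for what's visible and that Google's filter never removes a legitimate comment; both are assumptions, and `implied_catch_rate` is just a name for the arithmetic.

```python
def implied_catch_rate(true_spam_share: float, visible_spam_share: float) -> float:
    """Fraction of spam an upstream filter must remove, under the assumptions above.

    With s = spam share of all submitted comments, c = catch rate and
    v = spam share of visible comments:
        v = s(1 - c) / (s(1 - c) + (1 - s))
    which rearranges to
        c = 1 - v(1 - s) / (s(1 - v)).
    """
    s, v = true_spam_share, visible_spam_share
    if not (0 < s < 1 and 0 <= v < 1 and v <= s):
        raise ValueError("need 0 < s < 1, 0 <= v < 1 and v <= s")
    return 1 - (v * (1 - s)) / (s * (1 - v))


for s in (0.90, 0.95):
    print(f"spam share {s:.0%} -> filter must catch {implied_catch_rate(s, 0.30):.1%}")
```

So for 90% of all submitted comments to be spam while 30% of visible ones are, the filter would already have to be catching roughly 95% of spam; for 95%, roughly 98%.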


This argument seems circular - assuming that the 30% results have good precision and recall



