I entirely disagree. You extract the content and diff it against other extracted content. You don't need a human eye to determine that NBC11 republished an AP/Reuters story word for word. This should be fairly basic pattern matching. We are talking about a company that can brute-force vanity onion addresses!
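To make the "diff the extracted content" idea concrete, here is a minimal sketch using word shingles and Jaccard similarity. This is one common near-duplicate technique, not necessarily what any real platform runs. The function names, the shingle size `k=5`, and whatever threshold you pick on the score are all assumptions for illustration.

```python
import re

def shingles(text, k=5):
    """Return the set of overlapping k-word sequences in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b, k=5):
    """Similarity in [0, 1]: shared shingles divided by all shingles."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

A word-for-word republished wire story scores at or near 1.0 against the original, while unrelated articles score near 0. At real scale you would likely swap the exact set comparison for something like MinHash, but the principle is the same.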

Semantic analysis should be able to detect procedurally generated content farms. Human turking might be harder to detect, but once a site gets flagged, all its posts can be checked with stricter scrutiny.

It is extremely easy (from a computational standpoint) to rip a YouTube embed out of a page and link directly to the source. If Mashable and yournaturaldietnews are known content embedders, deconstruct their pages more aggressively.
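As a rough illustration of how little work this is, the sketch below pulls YouTube embeds out of a page using only Python's standard library. The class and function names are made up for this example, and the URL pattern assumes the usual `/embed/<11-character id>` iframe form; other embed styles would need more patterns.

```python
import re
from html.parser import HTMLParser

# Assumed pattern: standard iframe embed URL with an 11-character video id.
_EMBED = re.compile(
    r"(?:https?:)?//(?:www\.)?youtube(?:-nocookie)?\.com/embed/([A-Za-z0-9_-]{11})"
)

class EmbedFinder(HTMLParser):
    """Collects video ids from <iframe src=...> tags that point at YouTube."""

    def __init__(self):
        super().__init__()
        self.video_ids = []

    def handle_starttag(self, tag, attrs):
        if tag != "iframe":
            return
        src = dict(attrs).get("src") or ""
        match = _EMBED.match(src)
        if match:
            self.video_ids.append(match.group(1))

def youtube_sources(html):
    """Return direct watch URLs for every YouTube embed found in html."""
    finder = EmbedFinder()
    finder.feed(html)
    return ["https://www.youtube.com/watch?v=" + v for v in finder.video_ids]
```

Once the embed is resolved to its source URL, the wrapper page adds nothing, and the link can point at the original video instead.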

There also has to be a way to crowdsource content verification. They have 1.8 billion people, and a subset of those people give good feedback. Maybe some people's reports should be weighted more heavily than others if they have a history of making valuable reports.
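One simple way to weight reports by track record is a smoothed accuracy score per reporter. This is only a sketch: the `history` mapping (user to counts of confirmed and rejected past reports), the function names, and the choice of Laplace smoothing are all assumptions, not a description of any real system.

```python
def reporter_weight(confirmed, rejected):
    """Laplace-smoothed accuracy: unknown reporters start at 0.5."""
    return (confirmed + 1) / (confirmed + rejected + 2)

def flag_score(reports, history):
    """Sum the weights of every user who reported a post.

    reports: iterable of user ids who flagged the post.
    history: dict mapping user id -> (confirmed, rejected) past reports.
    """
    return sum(reporter_weight(*history.get(user, (0, 0))) for user in reports)
```

With this scheme, a handful of reports from people who are usually right can outweigh a pile of reports from accounts that are usually wrong, and a post goes to stricter review once its score crosses whatever threshold you choose.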

Maybe you're thinking fake news is mostly political. A lot of it is DIY and health tips, tech lifehacks, etc. Rehosted videos with a banner added on the top and bottom. Animals. Any content taken from elsewhere can be easily detected, similar to TinEye or reverse image searching.
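For a feel of how reverse image matching can work, here is a minimal average-hash sketch. To be clear, this is a generic perceptual hashing technique and not a claim about how TinEye works internally. It assumes the image or video frame has already been decoded to a 2D list of grayscale values (0 to 255) that is at least `size` pixels in each dimension; the function names are hypothetical.

```python
def average_hash(gray, size=8):
    """Hash a grayscale image (2D list) into size*size bits.

    The image is averaged down to a size x size grid; each bit records
    whether that cell is brighter than the grid's overall mean.
    """
    h, w = len(gray), len(gray[0])
    cells = []
    for r in range(size):
        for c in range(size):
            rows = range(r * h // size, (r + 1) * h // size)
            cols = range(c * w // size, (c + 1) * w // size)
            block = [gray[y][x] for y in rows for x in cols]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    return sum(1 << i for i, v in enumerate(cells) if v > mean)

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")
```

Two copies of the same picture with a brightness or compression change end up a small Hamming distance apart, so a lookup is a nearest-neighbor search over hashes. Handling added banners would likely need an extra step, such as hashing the cropped center of the frame as well.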