What exactly is being "monetized" when a search result is displayed for a news article, and that result sends users to the news site, where their visits earn ad money for the news outlet?
The news outlet can use robots.txt to prevent indexing. If Google doesn't bring them value, that's the easy answer.
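For anyone who hasn't looked at one, the opt-out looks roughly like this. It's only a sketch: the outlet domain (news.example) and the helper name `can_crawl` are made up for illustration, and it uses Python's standard-library `urllib.robotparser` to check what the rules permit. Strictly speaking, a `Disallow` rule tells the crawler not to fetch pages; it isn't a guarantee about what shows up in the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an outlet that opts out of Google's crawler only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules allow `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://news.example/2023/some-article"  # made-up URL
    print("Googlebot allowed:", can_crawl("Googlebot", url))
    print("Bingbot allowed:", can_crawl("Bingbot", url))
```

With these rules, Googlebot is blocked from the whole site while crawlers that aren't named are unaffected.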
>but that's not what this is about, this is about charging for linking to news articles.
Sure, but these things are directly related. It's the scraping that generates the links in question and ultimately earns Google ad revenue. And, if they did pay on the scraping side, then charging for linking would be less relevant. As it is, they're collecting the content, but not paying on either end.
So, it seems if you agree they should pay for the scraping (but they are not), then you wouldn't be opposed to them paying on the other end. In fact, this might be fairer to Google b/c it's pay for performance.
But, more to the point, I was responding to OP's specific claim that the news sites were attempting to "latch on to a free teat".
"scraping" usually refers to unauthorized access for the purposes of using that content for something else. what google does to generate search result pages is usually called "indexing", and websites go out of their way to encourage google to do more of it.
Understood. I use the term "scraping" loosely and, admittedly, purposely. I think it's illustrative of the broader point I was attempting to make.
But, if you really want to be precise, what Google actually does is more commonly (and euphemistically) referred to as "crawling". And, it is more accurate to say that they are crawling for the purpose of indexing what they've crawled. Crawling is essentially the front end of an overall process which ends with indexed results.
Whatever the nomenclature you prefer, the effect is the same, and so is my point.
>websites go out of their way to encourage google
I understand that some websites "encourage" Google, and that intersects with the alternative point of view I've been suggesting. That is, that Google is monopolistic in its traffic ownership and the content owners have little choice but to offer their content to be freely monetized by Google. It's also worth pointing out that there are some businesses which are built around SEO from the ground up while others—like news outlets—pre-existed Google but now rely on them to survive. These are different.
To further close the loop, my suggestion was that the original topic of this thread might also be seen as somewhat of a remedy for that effect. And, frankly, it's strange to me that you will allow that Google should be paying sites for their scraping or crawling or whatever, but don't seem to be connecting it to my point, when it really could be viewed as an alternative remedy to the problem I'm describing.
Overall, I believe mine is a more interesting and accurate way to look at the problem than to simply accept that Google has intermediated so many content sites and their consumers as some natural and universally right state of affairs.
In any case, I don't think Google's search business model, or the fact that SEO is a thing, is lost on anyone on HN. And I thought I might encounter more interesting discussion here around my view. But it seems most people here have accepted that Google just owns the traffic and everybody must play along. Further, that it's really for their own good. I suppose it's become harder to imagine a world where content producers own their content and are not coerced into giving it away for need of traffic from a single monopolistic source without whom their business might not survive.
But, it's somewhat surprising when I zoom out and think about the spirit of "hacking" and the audience that used to more predominantly frequent HN. Thinking back to staunch support of folks like Aaron Swartz and other topics. Maybe I'm the only one who sees these as somewhere along the same continuum. And that's fine.
Nonetheless, I find this discussion tedious and boring by now, as I'm sure others do my "alternative perspective". So, let's just agree to disagree, rather than have these pedantic restatements of definitions and Google's well-known search business model as if these are somehow dispositive.
Let's stop trying to pick a side here... Maybe it's more of a symbiotic relationship. Google gets a useful news page, the news media get links to their articles.
I honestly couldn't care less about either of them. I think they should just fight it out on their own and keep our legal system and taxpayers' time and money out of it.
>Let's stop trying to pick a side here... Maybe it's more of a symbiotic relationship.
There's definitely some symbiosis here, but it's ultimately Google that's dependent on the news (and other) sites' content, which it gets for free. That is, the news sites (and other content providers) could exist without Google. But, Google could not exist without their content.
At least that's my observation. So, I wasn't picking a side as much as earnestly asking how OP concluded that it's the news sites wanting something free from Google versus the other way around.
Those sites rely on sites like Facebook and Google linking to their content to lead readers in. With the ban in Canada, their public campaign has made that obvious.
>Those sites rely on sites like Facebook and Google linking to their content to lead readers in
Of course, that's the way it is. But, is it a good thing that they've intermediated all of the world's content?
It's easy to argue that the problem is exactly that a relative handful of sites have a monopoly on traffic and, what's more, they've gained that monopoly for free.
30 seconds of serious thought would tell you that your observation is wrong.
Again, a news organization can just change their robots.txt to block google from indexing their site.
They don't do that because that would instantly kill all their search traffic... and most likely kill their business.
If CNN changed their robots.txt to stop being indexed by Google, Google would literally lose 0 users.
> I wasn't picking a side as much as earnestly asking how OP concluded that it's the news sites wanting something free from Google versus the other way around.
It's been explained to you several times. Instead you're more interested in acting self-righteous (it's honestly pretty cringeworthy).
>30 seconds of serious thought would tell you that your observation is wrong.
Or, maybe I just have a different opinion.
>news organization can just change their robots.txt to block google from indexing their site
You don't seem to have thought beyond this superficial robots.txt "solution". Yes, we all know that option is available. But, as I've self-righteously offered for consideration, Google is one of a few sites that essentially monopolizes traffic generation, so they've positioned themselves to make it untenable for sites to block Google's crawling (and free monetization) of their content.
Cory Doctorow has (another) recent Twitter thread on how tech monopolies have grown virtually unchecked and now abuse the ecosystems in which they operate, increasingly clawing back more value for themselves at the expense of others. IMO this fits the pattern. Look it up. You might find it interesting. Or not.
>They don't do that because that would instantly kill all their search traffic...and most likely kill their business.
And there you've just stated exactly the problem I'm referencing, with apparently zero awareness of how someone could find it problematic. I mean, you just said "they could solve the problem by blocking Google via robots.txt, but that would kill their business".
So, not exactly a solution then, right?
It's baffling that you can say this but still angrily scream that "It's robots.txt! Case closed!"
>It's been explained to you several times. Instead you're more interested in acting self-righteous (it's honestly pretty cringeworthy).
You clearly don't hear yourself. Calm down.
EDIT: out of curiosity, I just took a quick look at your recent comments to others. One of the first to pop up was this:
>What are you even talking about? That's not how SEO works in the slightest....?
It's the news sites' content that Google is scraping and monetizing. How is it not Google that's latched on to a "free teat"?
Serious question. What am I missing?
EDIT: Thanks for the downvotes everyone. I need 'em from time-to-time to ensure I've not succumbed to The Matrix.
Of course, you're all wrong. But, keep 'em coming!