
If Google scraping your sites is a bad thing, you want to set "nosnippet" tags on your page [0].

If Google scraping your sites is a good thing, then why are you complaining?

I hope Google never starts paying for the links. Once there is a precedent, this becomes an effective blocker for the new search engines, visualizers, and other exciting web search startups. A new search engine startup is not going to be able to establish a commercial relationship with every site on the web like Google could.

[0] https://developers.google.com/search/docs/advanced/appearanc...




The one issue I see with this is that it is always opt-out. I feel that Google really should be lining up partners to opt in. While I am sure there are reasons why Google believes it has the right (and a good case can be made), it always feels slightly entitled to just assume that people are OK with this being done to their content.

That being said, of all the sources, Wikipedia actively licenses its content in such a way that Google is well within its rights to slurp it all down and serve it however it wants.

Google is already effectively paying for links to news sites as part of the negotiations in Australia. And I agree that this will be a dampener on any competition, I think the era of "ask for forgiveness, rather than permission" needs to stop.


if you post information publicly on the internet, google is entitled to scrape it. you've opted in by publishing it.

if you want to specifically exclude one entity from accessing information that you've posted for anybody to see, i'm not sure how there's a way that could be "opt-in"


Google is entitled to scrape it, but are they entitled to display the content on their site, in the results pages? Everything in the instant answers is content that deserves to be displayed on its creator's page, along with whatever monetisation the creator chooses.


You could do this using a robots.txt file (assuming the scraper obeys it, of course).
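For anyone who wants to see what "obeying it" looks like from the crawler's side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `can_fetch` helper, and the example.com URL are made up for illustration; a real crawler would fetch the file from the site instead of using a hard-coded string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out Googlebot entirely,
# leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(can_fetch("Googlebot", "https://example.com/article"))
    print(can_fetch("SomeOtherBot", "https://example.com/article"))
```

Note that this only excludes crawlers that choose to run a check like this one; nothing in the file is enforced by the server.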


> And I agree that this will be a dampener on any competition, I think the era of "ask for forgiveness, rather than permission" needs to stop.

Does this mean that you think there should be less competition for Google?


I similarly require that producers of motion pictures say "nosteal" at some point in the opening credits otherwise I assume I am free to make copies of the film to share with the internet.


They do, don't you remember those FBI notices in the movies? https://mashable.com/2012/05/10/fbi-copyright-warnings/

And when you sign up for Netflix or cable TV, there is an agreement you accept that you are not going to pirate.

Remember, the nosnippet does not have to be on every page -- you can put it into robots.txt or an HTTP header, so it is literally 1 line of configuration for most web servers.

Movie producers can only dream of stopping piracy that easily.
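To make the header route concrete: as far as I know, Google documents `nosnippet` as a robots meta tag or an `X-Robots-Tag` response header rather than as a robots.txt rule, so this sketch only covers the header. It wraps a WSGI app so every response carries the header. The names `add_nosnippet`, `hello_app`, and `response_headers` are invented for this example, and it is a rough illustration rather than a recommended setup.

```python
from wsgiref.util import setup_testing_defaults


def add_nosnippet(app):
    """Wrap a WSGI app so every response carries X-Robots-Tag: nosnippet."""
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            # Drop any existing X-Robots-Tag, then add ours.
            headers = [h for h in headers if h[0].lower() != "x-robots-tag"]
            headers.append(("X-Robots-Tag", "nosnippet"))
            return start_response(status, headers, exc_info)
        return app(environ, patched_start)
    return wrapped


def hello_app(environ, start_response):
    """Hypothetical stand-in for a real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


def response_headers(app):
    """Call a WSGI app once with a fake request and return its headers."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["headers"] = dict(headers)

    b"".join(app(environ, start_response))  # consume the body
    return captured["headers"]


if __name__ == "__main__":
    print(response_headers(add_nosnippet(hello_app)))
```

In practice most people would set the header in the web server config instead of application code; the point is only that it is a single header on the response.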


> They do, don't you remember those FBI notices in the movies?

Oh I'm sorry I don't have the ability to look for that, my system is only equipped to look for that specific string.

> And when you sign up for Netflix or cable TV, there is an agreement you accept that you are not going to pirate.

Again, my system doesn't read the TOS. Does Google's?

> Remember, the nosnippet does not have to be on every page -- you can put it into robots.txt or an HTTP header, so it is literally 1 line of configuration for most web servers.

Remember, they just have to add the string "nosteal" to the opening credits. That's a few minutes in Final Cut Pro.

Also, if they forget to add it or have some other issue, I offer no public-facing customer service whatsoever.


I think you are trying to claim that Google goes further than DVDs or Netflix, but this analogy is really not working for you.

DVDs have technological protection as well -- the CSS[0] system. So yes, if you don't want your movie to be pirated you need to explicitly enable this. This was probably harder than creating robots.txt too, there were NDAs and stuff involved.

Netflix requires logging in to access the content. If you add the same requirement, then Google is not going to take your snippets.

Unlike the string "nosteal", the robots.txt file is not a Google invention; it is as much a part of the web standards as all the other technologies.

If you want a website, you need a server which can support HTTP, HTML, CSS, links, robots.txt and so on. You can omit parts you don't need, but then you _may_ suffer the consequences -- without CSS your site will be ugly, and without robots.txt your site will be scraped by Google.

[0] https://en.wikipedia.org/wiki/Content_Scramble_System


The point is that it doesn't matter how hard or how easy it is. Google has no entitlement to anyone else's labor or content, and if they post content to their website in violation of copyright, I don't think "he didn't say the magic word that stops us from stealing content" is a defence any reasonable judge should entertain.


> in violation of copyright ... defence any reasonable judge should entertain.

Now we are talking specifics! Are you implying that Google is violating the law? Given that snippets have been shown for a long time and no one has sued Google over them yet, it does not seem so. Plus, there is the whole body of fair use law [0].

I personally love that I can take snippets from random websites on the net, quote them in my posts, and not worry about copyright infringement. And if I can do this, why can't Google?

[0] https://ammori.org/2012/05/08/copyright-misunderstandings-an...


I would argue that the snippet is the thing of value being potentially abused, not the page.

Say I search for e.g. "specific breakdown of something something, in a unique breakdown format that only this website has". The website owner has worked on and created unique, copyrighted material and posted it on a page on their site. If Google just extracts that piece, then it might as well have "acquired" the right to host that piece of info on its search results "page".

Google "extracting" that crucial bit of info and essentially "hosting" it on their search results page could definitely be argued to be some sort of abuse of fair use (and at this point, who is willing or big enough to take on Google on this to set a precedent? The EU, maybe?). It's not like they're quoting a piece of a large text; they actively find the specific piece of juicy info that relates to your query and host it on their page instead of yours.


VHS tapes and DVDs used to have these when they were around.


Movies are not publicly accessible. And they come with usage rights. If you don't publish your content for everyone, then define the usage properly.



