Fewer Than Half of Google Searches Now Result in a Click

cyrusshepard · on Aug 13, 2019

I think what folks are missing is that a lot of these "zero-click" searches happen as a result of Google scraping your website, and displaying the results as a "featured snippet."

Yes, they link to you below the featured snippet.

No, more people don't click, because they've taken the answer from your website and displayed it right in their search results.

For example: If I'm searching for "best nail for cedar wood" Google gives me the answer: STAINLESS STEEL - and I never had to click through to the website that gave the answer: https://bit.ly/2MdovdP

• Yes, this is good for users (it would also be good for users if Netflix gave away movies free)

• Overall, the publishers who "rank" for this query receive fewer clicks

• Google earns more ad revenue as users stick around on Google longer

Ironically, Google has a policy against scraping their results, but their whole business model is predicated off scraping other sites and making money off the content - in many cases never sending traffic (or significantly reduced traffic) to the publisher of the content.

reaperducer · on Aug 13, 2019

No, more people don't click, because they've taken the answer from your website and displayed it right in their search results.

It's for this reason that's I've stopped embedding micro data in the HTML I write.

Micro data only serves Google. Not my clients. Not my sites. Just Google.

Every month or so I get an e-mail from a Google bot warning me that my site's micro data is incomplete. Tough. If Google wants to use my content, then Google can pay me.

If Google wants to go back to being a search engine instead of a content thief and aggregator, then I'm on board.

henryfjordan · on Aug 13, 2019

I just got one of those emails for the first time about my personal site that's basically my resume. Apparently my text is small on mobile (it's not...) and some other crap

I don't get why google thinks it's acceptable to critique my site without prompting. It honestly just feels rude. They want me to do a whole bunch of micro-optimizations on a site that already works fine because it doesn't fit their standard of "high quality". I think I've gotten exactly 0 clicks from Google search results ever and I don't really ever want any.

If it were possible to get a human's attention at Google I'd start sending my own criticism their way but of course it doesn't work like that...

jefftk · on Aug 14, 2019

I was curious what it was complaining about, since https://henryfjordan.com looks great to me. I tried to run it through Google's "Mobile Friendly Test" but fetching failed [1] because your robots.txt has:

    User-agent: *
    Disallow: /

This would explain why you've gotten zero clicks from Google (or I would guess anyone else's) search results!

On the other hand, it's surprising that you would get a notification if you had crawling disabled. Did you set this robots.txt up recently?

[1] https://search.google.com/test/mobile-friendly?id=97_WUiIxx-...

(Disclosure: I work at Google, commenting only for myself)

SahAssar · on Aug 14, 2019

Google seems to see robots.txt as "more what you call guidelines, than actual rules". Sites that block googlebot or all bots with robots.txt still turn up in google searches, just without a description, and are obviously still indexed.

jefftk · on Aug 14, 2019

robots.txt is a tool to control crawling, not to specify how you would like your site to be displayed (or not) in search results. If you don't want search engines to include your site, set:

    <meta name="robots" content="noindex">

while to block just Google do:

    <meta name="googlebot" content="noindex">

See https://support.google.com/webmasters/answer/93710

If Googlebot is not respecting robots.txt, and is crawling something it's been instructed not to crawl, let me know and I can file a bug?

(Disclosure: I work for Google but not on Search, speaking only for myself)

areyousure · on Aug 14, 2019

But that requires that Googlebot be allowed to crawl the page in robots.txt in the first place.

How do you tell Googlebot to not crawl your site and to not index it either?

Previously, one could use the undocumented "Noindex" directive in robots.txt, but this will be disabled soon: https://webmasters.googleblog.com/2019/07/a-note-on-unsuppor...

tylerl · on Aug 14, 2019

The bot doesn't need to crawl your site for it to be indexed; it crawls other sites that link to yours.

You can specify your index preferences in Webmaster Tools. Don't know if there's a domain-wide off switch in there, but there probably is.

SahAssar · on Aug 14, 2019

Using Webmaster Tools is not a good option since it requires you register with the exact company you are probably trying to not interact with.

jefftk · on Aug 14, 2019

The blog post you link has a bunch of alternatives, but I agree they're not great. If there are a lot of webmasters who want to be able to noindex through robots.txt then making the case for adding noindex to the standard would be a good next step.

(Still speaking only for myself)

ademarre · on Aug 14, 2019

Googlebot actually used to support a noindex rule in robots.txt, but they are removing it.

https://webmasters.googleblog.com/2019/07/a-note-on-unsuppor...

jefftk · on Aug 14, 2019

Yes, that was linked above. It looks like this is part of reducing support to what's in the spec?

ademarre · on Aug 15, 2019

Oops, yep. I didn't see that context.

SahAssar · on Aug 14, 2019

I sent you an email, and I'm posting it here but without identifying info:

---

Hi Jeff,

Thank you for your comment, I'm replying via email to send some info I'd rather not share on HN, but will post the same redacted in HN. I used to (back when starting my web-dev career) run a one man show development team of a web agency and all our development/pre-prod sites (that had to be unauthed) had robots.txt to disallow all bots, but they still popped up in Google. Searching some of the old domains in google I found an example here: http://***.***/***, and attached is an example of it showing up in a SERP and a what the robots.txt looks like (and I'm pretty sure that the robots.txt has looked like that since that page was created).

In this case it is just one page that nobody will care about, and since I'm not working on projects that are open but "robots.txt hidden" anymore I don't know if it is as bad as it used to be, but I regularly see pages with the "No information is available for this page" whose domains have robots.txt's that disallow all bots but still show up in Google.

Please let me know if I missed anything :)

jefftk · on Aug 14, 2019

Thanks for sending the screenshot! That site shows up with "no information is available for this page", which means that while robots.txt has disallowed bots from crawling it the page is still linked from other pages that do allow crawling.

The robots.txt protocol gives instructions to crawlers about how they should interact with the site. If you instead want to give instructions to indexes, use the noindex meta tag.

SahAssar · on Aug 14, 2019

You're right, I was wrong about how to expect a "Disallow: /" to work. But isn't it sorta odd to have a protocol to control crawling (which is usually done to index) but (almost) require a compliant indexer to crawl all pages to comply with the indexing rules?

In this example the robots.txt has clearly told all bots to not crawl this site, but the only way to read the meta tag (or equivalent header) is to crawl the site. So I assume that in this case google either assumes that it is fine to crawl URL's that it has found elsewhere while ignoring the robots.txt or it assumes that pages disallowed by robots.txt are "open for indexing/linking", which would mean that any page both disallowed by robots.txt and which has a noindex meta tag would still show up, right?

What is the intended behavior if a page is disallowed by robots.txt and still linked by another indexed page? Will it get crawled or just assumed to be okay for indexing/linking? Is there any way to tell Google not to index/link and not to crawl?

erik_seaberg · on Aug 15, 2019

If you have a calendar where every month links to the previous and next months, a crawler can get stuck and hammer the server. That's the kind of thing robots.txt is for.

dorgo · on Aug 14, 2019

>"more what you call guidelines, than actual rules"

they can index without scraping. It is enough that other websites have links to you site. So the google bot follows the rules in robots.txt to the letter. "no-index" is the way to stay away from google.

SahAssar · on Aug 14, 2019

They can't read my no-index if they obey my robots.txt. Do they break the robots.txt to be able to read my no-index or do they assume my "Disallow: /" means I'm fine with them indexing/linking?

Without the noindex part of robots.txt (which google decided to ignore not so long ago) this is not solvable.

henryfjordan · on Aug 14, 2019

Oh, I just added that yesterday as a response to the email. Before that I was actually running Google Analytics but since I get basically 0 clicks it wasn't really useful.

I have a feeling the PDF viewer triggered it, cause on Mobile it defaults to showing the whole page which results in tiny text but that's easily fixed by the user so I prefer to leave it like that.

grogenaut · on Aug 13, 2019

Yeah it's amazing how rapidly and rabidly they show up when the complaint is on one of their paid features like a Google cloud (GCE) post for them or a competitor, but nada on the other products. Well no it's actually not surprising.

londons_explore · on Aug 14, 2019

Google cloud employees are encouraged to go on social media to get a feel for issues users are having and to make the product better.

The rest of Google has a policy of "Engineers will probably say the wrong thing if we let them talk in public"

kaolti · on Aug 14, 2019

Google has grown into a cancerous middleman.

michaelmior · on Aug 13, 2019

> If Google wants to go back to being a search engine

While I understand the problems with Google scraping content, as a user these snippets help me find what I'm searching for faster. If that's all you're optimizing for, Google is fantastic. There are certainly good arguments to be made for other models, but for search, stealing content helps. I'm not advocating stealing content, I'm just saying that it produces more useful results.

elorant · on Aug 14, 2019

How do you know that the content Google features is the best there is? If we stop clicking on sites and just rely on Google to provide us the content we'll go down a very slippery slope.

BigJono · on Aug 14, 2019

I don't really see how this problem is any different to 'how do we know the #1 search result is the best content there is?', if it provides you the information you want, then great, otherwise you load #2.

millstone · on Aug 14, 2019

Google lends the weight of its authority to the answers it presents. It's one thing if Infowars says that Obama is planning a coup against Donald Trump, it's another if Google says so.

w1nst0nsm1th · on Aug 14, 2019

Try googling "root M89 tablet".

The first three result lead you to fake android blog telling you how you can easily root every chinese android device and specifically the M89 tablet...

The real authoritative result (xda-developers) only appears in the fourth position, under sight. It will tell you if you follow the instruction given in the fake blog post from the 2 or 3 first results, you will brick your tablet.

In a similar way the word "cbd" (for cannabidiol) has been hijacked by dubious commercial compagnies through fake blog posts filling pages after pages of google results telling you how great cbd is for the treatment of every disease on earth... But there is no trace of an actual study in these results. You will have to go with the less popular word "cannabidiol" to start to see some serious articles about it.

Google results can be hijacked and Google do little about it. May be because the ads shown in these fake blog posts are from google ads network ? I don't know...

But google result have clearly deteriorated these last years and the authoritative figure of the companie is not anymore what it was in the past.

playpause · on Aug 14, 2019

I know that sort of thing happens sometimes (Google presenting a spurious statement as a categorical answer) but those are bugs. As long as they are very rare, and fixed quickly when they occur, I don’t see them causing much harm.

OK, some people believe anything they read (especially if it confirms their existing biases), but that problem has always existed. I think Google’s occasional snippet fuck-ups are a drop in the ocean compared to the spread of false information through social networks.

millstone · on Aug 14, 2019

There's the modern news-cycle axis, where Google can and should devote full-time engineers.

But the long tail is important too. It's fixed now (yay) but for years you could search for "calories in corn" and Google would confidently present an answer 5x the true value, scraped from a site with profoundly wrong information. As Google moves to present more direct answers and fewer links, this risk increases.

It looks like they have backed off on the direct answers somewhat which is good news.

DollarGuru · on Aug 14, 2019

If it undermines the websites producing the content Google is scraping by not sending through traffic then those sites may not continue to exist.

londons_explore · on Aug 14, 2019

This is already happening.

Very few new blogs and content websites are being set up.

All content is moving into apps and walled gardens. Part of the reason for that is that running a well researched blog will never pay for your time, so becomes a hobby thing, and most people are fine to use Facebook for that.

wutbrodo · on Aug 14, 2019

> Micro data only serves Google. Not my clients. Not my sites. Just Google.

Well it also serves Google's users, to be clear. Though I should also be clear that I don't think that justifies it, since I think it's bad for the ecosystem in more subtle ways than are expressed in immediate user satisfaction.

tremon · on Aug 14, 2019

That depends on how you define "users". If you define a website creator also as a Google user (by virtue of wanting to be found through Google), then Google is serving part of its users to the detriment of their other users.

And if you view Google instead as a connection broker, e.g. a middle-man between publisher and consumer, then Google is destroying their own business by snubbing publishers. Assuming that Google is still making rational, intelligent decisions, it follows that Google no longer sees itself like that.

dyarosla · on Aug 14, 2019

Did Google ever see itself as prioritizing publishers and consumers equally? I think that’s a false premise and the parent is right; Google’s priority has always been consumer first.

dragonwriter · on Aug 14, 2019

> If Google wants to go back to being a search engine instead of a content thief and aggregator

A search engine is inherently a content aggregator; the functions are inseperable.

antonvs · on Aug 14, 2019

Not necessarily. Google used to be more of a link aggregator. There's a difference, as the OP proves.

dragonwriter · on Aug 14, 2019

Google (and virtually every other search engine) has always included content with links, what's different now (but not unique to Google, though they are perhaps the most advanced at it) is that now it algorithmically synthesizes content instead of merely aggregating it.

throwawayjava · on Aug 14, 2019

It does help your clients.

I mean, maybe not yours specifically. But snippets are great for users in the typical case.

pyrale · on Aug 14, 2019

These users are no longer his clients.

sli · on Aug 13, 2019

On top of all that, Google's snippets aren't curated and therefore, aren't always correct. They can be (and almost certainly are) gamed. Users that don't click through open themselves up to carrying on being misinformed.

squeaky-clean · on Aug 13, 2019

I've found them to be incorrect so often on things when I would click through to the actual page or find a better link. I don't trust just the blurb for any answers any more.

reaperducer · on Aug 13, 2019

I don't trust just the blurb for any answers any more.

I don't, either.

A site I used to own had a discussion forum on it. It contained a message along the lines of "Real Estate Agent X is a great guy. Real Estate Agent Y is a complete sleazebag."

The blurb that Google displayed for it was "Real Estate Agent X... is a sleazebag." And that was the first result for anyone who searched for that agent's name.

As you can imagine, I received many angry e-mails, phone calls, and legal threats. No, you can't explain to angry people that it's "just" an algorithm that told the world that they're a sleazebag.

I ended up editing the post so that Google would display a different version after its next scrape.

londons_explore · on Aug 14, 2019

I think there's more to this... Google use lots of fancy Natural Language Processing stuff to extract that data, and unless the wording was very tortuous, I doubt it could make such a big mistake by chance.

ribosometronome · on Aug 13, 2019

They can get it painfully wrong last time. I came down with something like optic neuritis a few years ago. It's often one of the first signs of MS in many folk. When I googled something like "MS life expectancy", the blurb said something like "3-7 years" -- with subtext indicating it's 3-7 years LESS than average rather than "you're kicking it in 3 years".

Turns out I didn't have optic neuritis.

benoliver999 · on Aug 13, 2019

They suck. And something about the way they are presented seems to make people believe them.

I think it gives that one-shot answer to questions people have, even when the real answer is nuanced and multi-faceted.

moksly · on Aug 14, 2019

I think they’re believable because google started by providing things that weren’t wrong. If you search for a time zone google shows it in your local time, if you search for currency conversion google does that. All those things that it’s done for ages, which were things that were also typically correct.

Then the snippets show up, and they are presented in a similarly trust worthy fashion. But the snippets are really just the really just the result of which ever site has the best SEO, and that’s often a really worthless metric these days. The time zone and currency stuff is easy, because it’s math, but opinions aren’t. The thing is though that even if google didn’t have the snippets, those sites that gets snippets would still be the top results that we clicked, and we’d still get the wrong information. That would probably be better, because it might be easier to spot obvious bad sources, but I still think there is just a fundamental flaw in how SEO professionals have learned to game the google bot to bring the world useless information.

I mean, part of it is certainly on google. No one in their right mind wants to comply with Google’s ranking terms, unless you make money from google searches. Which means a lot of useful personal blogs have dropped off the face of the internet, unless you’re really lucky to see them linked on a place like HN.

I wish libraries would band together and make a privacy focused and curated search engine, because librarians are actually kind of good at finding you the correct information.

sharatvir · on Aug 14, 2019

It sucks. Sometimes the bold text is the exact opposite of the answer to the query I search for. It’s very misleading unless you click through and read the full context.

meowface · on Aug 13, 2019

Yeah. I personally like the feature, in theory, as an end user, but the signal:noise ratio for it has not been great for me.

rhizome · on Aug 13, 2019

This is especially true where the answer is time-bound, which happens a lot in technical topics. Many times the snippet is for an earlier version of the language (but still with a high PageRank), or the Operating System (especially Android settings), and the most annoying at all: an ancient answer in an undated blog post.

londons_explore · on Aug 14, 2019

Google is good at dating undated content. They keep track of the first time they've ever seen a bit of text, and assume it was composed then, even if it later gets copied to other sites.

pbiggar · on Aug 13, 2019

For a recent search "report amex card stolen", google showed a phone number for a scam who asked for a social security number as soon as you called.

concert-gilled · on Aug 13, 2019

The websites that the results aren't curated either. Clicking through to the site could provide the same incorrect information.

perl4ever · on Aug 13, 2019

The point is that Google frequently adds another level of incorrectness, that may not be identifiable without checking the source. This is pretty common on Wikipedia, and when people link to things in discussion forums, as well.

And anything Google does, is done at vast scale, which makes me, at least, think it might be substantially affecting society.

mcv · on Aug 13, 2019

But that's the responsibility of that website. Of course it's bad if Google lists a site with wrong information as the first hit, but I think it's worse when Google blindly copies that false info and lists it as their own zero-click result. By doing that, Google itself takes responsibility for the information.

Although sometimes the site is actually correct and Google still gets it wrong by copying the info incorrectly or losing some context or qualifiers.

I loved zero-click results back when DucfDuckGo first introduced them, but I'm less enthusiastic about Google's implementation of them.

buboard · on Aug 13, 2019

sometimes the blurb just has an answer to a different question. Websites are curated, unless its spam.

dragonwriter · on Aug 14, 2019

> Websites are curated, unless its spam.

Yes, but even when they are curated the curators are usually unreliable and sometimes malicious.

buboard · on Aug 14, 2019

snippets are just a reflection of that. how is google faring better in that respect?

wolco · on Aug 13, 2019

Those are sites google chooses are correct.

sameers · on Aug 14, 2019

For example, this WaPo story, about YouTube videos for some medical queries that go to videos featuring quack remedies and anti-vaxxer misinformation.

https://www.washingtonpost.com/lifestyle/style/they-turn-to-...

dragonwriter · on Aug 14, 2019

> On top of all that, Google's snippets aren't curated and therefore, aren't always correct.

The “therefore” is misplaced; curated snippets aren't always correct, either.

minor3rd · on Aug 13, 2019

People on the web take the risk of being misinformed, clicking or not.

charlesju · on Aug 13, 2019

It's important to note that this is strategically incredibly important for Google because this forms the backbone of their voice AI. The better at answering questions directly, the better their voice AI becomes and that leads to a lot of future products.

soup10 · on Aug 13, 2019

AdWords is and always has been the goose that lays the golden eggs, none of Google's other initiatives have ever rivaled that revenue. That's why they put so much effort into bolstering and optimizing their search results pages.

HenryBemis · on Aug 13, 2019

Another reason is the use of add-ons such as: "Google search link fix - Prevents Google and Yandex search pages from modifying search result links when you click them."

I have stopped using Google a few years ago, but just in case I keep this (or similar) add-ons of my Firefox.

I have no idea of the popularity of such addons, but they would also impact the tracking that Google does.

igravious · on Aug 13, 2019

Oh my God! This is so useful! I hate that I can't right-click on a search result to copy a URL. We definitely used to be able to do this, didn't we?

propogandist · on Aug 13, 2019

It's been this way for ages, although for chrome (iirc) this is managed via hyperlink auditing [1] which allows google to track what you're clicking even though the link appears 'clean'.

The click through google redirect also allows them to track things like relevancy of the content and time on site (if you return to google SERP by clicking the back button), in-case the target site isn't using google analytics (unfortunately most sites do).

[1] https://html.spec.whatwg.org/multipage /links.html#hyperlink-auditing

Hyperlink auditing can be blocked with uBlock Origin / uMatrix

jefftk · on Aug 14, 2019

Hmm, right clicking and copying works for me in Chrome and Safari. I just tried searching for "test" and the first result is marked up as:

    <a href="https://www.speedtest.net/"
       ping="/url?...>

Looking at https://caniuse.com/#feat=ping it looks like ping is supported in Chrome, Safari, and Edge, but not Firefox; are you using Firefox?

(Disclosure: I work for Google)

igravious · on Aug 14, 2019

I use both. I'll use this "Google search link fix" extension in Firefox until search results links aren't proxied.

wbl · on Aug 13, 2019

Don't like the product? Switch. While you still can.

jefftk · on Aug 14, 2019

Any search engine is going to want to know what people click on so they can make their product better. For example, I just searched for [test] on DuckDuckGo and when clicking on the first result I see DDG sending a ping back:

    https://improving.duckduckgo.com/t/lc?...

which contains which URL I clicked.

(Disclosure: I work for Google, speaking only for myself)

Hitton · on Aug 14, 2019

That's not true, for instance Startpage doesn't do that.

jefftk · on Aug 14, 2019

Startpage is an anonymizing proxy for Google Search, not a full search engine. Crucially, it doesn't determine how to rank results. If they decided to try to compete with Google, Bing, Yandex, DDG etc directly by bringing ranking in-house they would have a very hard time serving good results without being able to track which of their links were popular among users.

TheArcane · on Aug 13, 2019

I consider myself privacy conscious and have add-ons like muli-account containers, cookie auto-delete, UB Origin and Privacy Badger working in tandem.

It's embarrassing that I wasn't aware of this extension, given how useful it seems - thanks!

IGotThroughIt · on Aug 18, 2019

How safe are all these plugins we install to escape tracking? Are we trying to escape big tech tracking only to hand our information over to extension developers? Looking at network traffic often shows a ton of extensions sending data to some aws server almost perpetually.

Asking because I'm not sure of the answer to this question and lately I've become even warier so I decided to uninstall everything except things I absolutely must have like colorzilla, grammarly and full-page screen capture. For adblocking I use brave and never ever touch firefox, opera or chrome.

There's an extension that appends a share=1 parameter to all quora links to prevent them from forcing you to sign in in order to view a post. I like it but I'm trying to minimize my extensions footprint and I'd rather write my own script to perform the same script.

The question is, how do you get to be sure that an extension is safe?

majani · on Aug 15, 2019

Then the snippet should just be used for voice search. And websites should opt in to the program.

arcturus17 · on Aug 14, 2019

> Google has a policy against scraping their results, but their whole business model is predicated off scraping other sites and making money off the content

Yea a couple days ago I was checking the Places API, which they’ve built off user-generated content and scraping Yelp and others. They charge $17 / 1000 calls for certain items and don’t you dare cache anything for too long.

Great way to build a business: get data for free, wall it off and put a hefty price tag on it, then put your best lawyers around the moat for good measure!

londons_explore · on Aug 14, 2019

I downloaded all the places data for the world while it was still free. In my jurisdiction, the data is considered owned by the place owners rather than Google, so I doubt they'll come after me.

sverige · on Aug 14, 2019

That's the ancestry.com business model as well.

gniv · on Aug 13, 2019

I disagree. There is an implicit contract between website publishers and search engines that it’s ok to do this. The website can set nosnippet in robots if they want to not have the snippet in search results.

zymhan · on Aug 13, 2019

So by having a website, I implicitly agree to Google's search practices?

That doesn't seem right.

njharman · on Aug 14, 2019

You put a resource on an open network and don't use any of the standard, recognized methods to indicate don't index, don't share, (nor lock it away with auth).

It's like if you put a sculpture in front yard and get upset when someone points it out in their neighborhood tour company, even worse cause yard ornaments don't have standard accepted methods of saying "don't use".

Two choices

1) use robots.txt

2) don't put it on the internet

Silhouette · on Aug 14, 2019

You put a resource on an open network and don't use any of the standard, recognized methods to indicate don't index, don't share, (nor lock it away with auth).

This is the kind of argument people used to use as they flagrantly violated your copyright by cloning your article on their own site. "You put it on the Internet, so it's free for everyone to copy."

The law says no such thing, at least not in any jurisdiction that I'm familiar with. Contrary to popular belief in some quarters, normal laws do still apply on the Internet.

If you infringe copyright, it's still infringement even if what you copied was freely available on someone else's site.

And if you state something that is misleading and harmful, it might still be defamation, even if what you stated was just an automatically generated snippet that takes a small part of someone else's site and shows it out of context.

eitland · on Aug 14, 2019

Nah. Take it easy here, there is a long way between indexing and showing the most relevant hit and outright lifting big parts out of the site and use them on their own property:

It is more like if the guide that used to send visitors to your property has set up their own boot on the best spot on the sidewalk next to you and are raking in money because of the useless (often, in the last few years) ads they have plastered all over it.

Even if it is an educational non-profit resource you don't want that as some of the details get lost when visitors only reads the guides summary instead of taking a closer look for themselves.

And according to people on this thread they will also complain and/or come with suggestions about how you can make it even more useful to them.

didibus · on Aug 14, 2019

I think of it more as if you put a banner with content somewhere in the public, and I take a photo of it, what can I later do with that photo?

And for that, it's a question of copyright. It turns out, in the US, if something is publicly available it does not make the copyright a part of the public domain. Thus the original author still retains copyright unless explicitly stated otherwise.

There is an exception to this though, which is called fair use. And for that, I'd recommend reading this: https://amp.theatlantic.com/amp/article/411058/ Book snippets by Google searched were deemed fair use.

So the question remains, would website snippet similarly count as fair use? What will the federal courts rule be? And when it comes to fair use, that's the only way to know if it is or not.

Silhouette · on Aug 14, 2019

It's worth pointing out in this context that the US legal concept of fair use is not universal. In fact, unusually for US IP laws, it's actually much more permissive than most other places. The more usual practice is to enumerate specific situations where copying without the copyright holder's consent is still allowed, instead of defining general tests, which is how fair use works. This has been a controversial point, because it's not clear that the US scheme is sufficient to meet its obligations under international treaties.

In answer to your final question, I'm not sure whether this use of snippets in search engine results has been tested in any US courts yet, but the issue of search engines showing enough content from the sites they link to that users never actually go through to the original site is sufficiently controversial that the EU's recently passed copyright directive includes specific provisions aimed at exactly that sort of situation.

kabacha · on Aug 14, 2019

> It is more like if the guide that used to send visitors to your property

Here is where your argument falls apart. The web is a public space - it's not your property or your front yard. It's more akin to going to the town square wearing a fancy hat and getting upset if people look at you and your weirdly shaped headwear.

tremon · on Aug 14, 2019

The web is a public space - it's not your property or your front yard.

You're wrong here. Just because it's a public space does not mean nobody owns the property. As a simple example, a shopping street is usually a public place. That does not mean that all window displays, doorways and adjacent buildings are automatically a free-for-all.

In fact, only "the tubes" of the web are a public space. The rest is owned property, even if there are no visible fences.

pyrale · on Aug 14, 2019

Laws everywhere are pretty much saying your take is wrong. There is no such thing as an implicit contract, and your take on it is plain victim blaming.

It is very surprising to read this on a board where many people write code: if a dev found unlicensed code, they would certainly not think it is public domain.

cyrusshepard · on Aug 13, 2019

It's a devil's bargain. If you opt-out of snippets, it simply means somebody else claims the top spot, and you are left with even less traffic (by a significant amount)

gniv · on Aug 13, 2019

> If you opt-out of snippets, it simply means somebody else claims the top spot

Citation? I thought snippets are just for display, not ranking.

ribeyes · on Aug 14, 2019

Snippets link to the source URL, so getting into the snippet gets your link to top of the page.

dessant · on Aug 13, 2019

You don't have to inform anyone about your content not being redistributable, that is not how copyright works.

buboard · on Aug 13, 2019

> nosnippet

TIL. That's actually a good idea. Does that eliminate all kinds of snippets? NOARCHIVE may also be of use.

vageli · on Aug 13, 2019

> I disagree. There is an implicit contract between website publishers and search engines that it’s ok to do this. The website can set nosnippet in robots if they want to not have the snippet in search results.

Who made this contract? I never signed one. If I came to your place of business and copied your content and provided it somewhere else, I would be infringing your copyright. Do I have to put up signs specifying that at my place of business? Why is this any different? My web content is not the property of someone else and by publishing my information that is in no way an implicit grant of the right to reproduce it.

gniv · on Aug 13, 2019

I believe citing small pieces from a large text is covered by fair use.

mthoms · on Aug 14, 2019

It depends. One of the criteria for acceptable "fair use" is that the usage shouldn't negatively affect demand for the original source.

Although there are other criteria to consider, Google's snippets clearly violate that particular tenet.

See #4 https://fairuse.stanford.edu/overview/fair-use/four-factors/

milesskorpen · on Aug 13, 2019

It's a faustian bargin. Google is so powerful you can't do without them, but they're also inexorably eating your future.

Silhouette · on Aug 14, 2019

Google is so powerful you can't do without them

I wonder how true that assumption really is any more. The quality of traffic Google drives to sites I operate is very low compared to all other major sources, with much less engagement by any metric you like, notably including conversions. The only reliable exception is when we're running marketing campaigns in other places, which often result in spikes in both direct visitors landing on our homepage and search engine visitors arriving at our general landing pages.

There is this conventional wisdom that SEO, and in particular playing by Google's rules to rank highly in its results pages, is the only way you can run a viable commercial site these days. Our experience has been exactly the opposite: our SEO is actually quite effective, in that we do rank very highly for many relevant search terms, but it makes a relatively small contribution to anything that matters. And really, when I write "SEO" here, I'm only talking about general good practices like being fast, having a good information architecture and working well on different devices. We don't change the structure of our pages just because Google's latest blog post says X or Y is now considered a "best practice" or anything like that.

Of course I have no way to know how representative our experience is. YMMV.

milesskorpen · on Aug 16, 2019

It is a very significant part of our business.

scarface74 · on Aug 14, 2019

Yes you can. There are other ways to market yourself and your website. For instance, the author of “Fearless Negotiation” has appeared in four or five podcasts I follow. The well known pundits in the Apple ecosystem grew an audience organically through word of mouth.

Hoping to stand out on Google results as a business plan is recipe for failure. You are one algorithm change from going out of business.

buboard · on Aug 14, 2019

> There is an implicit contract

Then why can't publishers scrape google?

mkl · on Aug 14, 2019

From http://www.google.com/robots.txt:

    User-agent: *
    Disallow: /search
    ...

wolco · on Aug 13, 2019

It should be opt-in.

scohesc · on Aug 13, 2019

So they're like a modern ebaums world for the information age.

Interesting way to put it - the biggest bully with the most money wins!

buboard · on Aug 13, 2019

Funny to read in the html:

- This site is optimized with the Yoast SEO

- This site is optimized with the Schema plugin

Yeah, optimized to death

ineedasername · on Aug 13, 2019

Glad the first ranked response was this. It's what I came here to say. These days you simply don't need to click as often to get what you need out of a search, and Google's business model doesn't rely on click through to web sites, but for display & click through of ads.

bartimus · on Aug 13, 2019

I'm still on the fence somewhat.

Searching for "best car engine oil" has certain brands displayed straight on the featured snippet. Who cares about the click if Google found your customer for you and got your message through for free?

tremon · on Aug 14, 2019

In the end, Google should care. If a search for "best car engine oil" got your product featured, that means you won a sale. But assuming the sale happens completely offline, Google lost its opportunity to inform you of the search, and of the succesful search->sale conversion.

That means your marketing department can no longer justify investing money in Google SEO, which means less optimization towards Google's crawler, which means less reliable search results, which means less Google searches in the long run.

bartimus · on Aug 14, 2019

Increased profits from unknown sources VS decreased profits from known sources. The gained marketing intelligence may come at the cost of the bottom line.

aviraldg · on Aug 13, 2019

Feel free to add them to your robots.txt; they won't scrape you then (but they won't index or rank you either)

autokad · on Aug 13, 2019

I 0 click search more than I click because but not limited to: 1 ) to get the correct spelling of a word that spell check cant find a suggestion 2 ) avoid going to a site that I might potentially get malware (example searching music lyrics) 3 ) avoid having to deal with slow loading and bloated pages

Illniyar · on Aug 14, 2019

"• Google earns more ad revenue as users stick around on Google longer"

This one is actually reverse. Google search doesn't net google any money if people don't actually click the link, since ad revenue for google search is Per Click, not per view (per mille).

The incentives for them are actually reversed - increasing the amount of clicks into external websites, specifically advertised links, increases their revenue. (which is why there are so many advertised links on a search page)

tabtab · on Aug 15, 2019

I do a fair amount of grammar and spelling searches. Google often displays tips and examples. And typing "sp500" displays a stock chart right in Google itself. Google has a lot of "instant snippets" like that. Quite convenient. However, near-monopolies do make me nervous about supporting them.

ses1984 · on Aug 14, 2019

Does it matter if they have a policy against scraping? I thought that was explicitly legal, which enables their existence.

johnx123-up · on Aug 14, 2019

As I mentioned before in HN, this guy predicted "Google SEO bubble" http://rajeshanbiah.blogspot.com/2018/01/technology-predicti...

mxd3 · on Aug 14, 2019

Great point, this was my first thought also. Google has been doing a slow creep of this type of content for the past through years through the featured snippet you mentioned, and other knowledge panel material. They now serve sports, weather, math, translation, flights, etc.

singlow · on Aug 14, 2019

I actually searched for best nail for cedar this afternoon :-P. I clicked several of the articles though...

epiphanitus · on Aug 14, 2019

Speaking of scraping, does anyone know where one can get a hold of full text news articles/press releases for nlp research? Most APIs that I have found only offer partial texts.

I know that Aylien has an API for this but it's out of my price range.

mrtksn · on Aug 13, 2019

If I recall correctly content(news especially) publishers and some Europeans were very angry about that. I think the consensus was that these businesses don't understand the internet.

buboard · on Aug 13, 2019

in that case the news sites did get the click, but they wanted more

mrtksn · on Aug 13, 2019

How do they get the click? If so, what is the fair amount of clicks that businesses should get?

buboard · on Aug 14, 2019

When i go to google news, there are no snippets, just titles linking to newspapers

dplgk · on Aug 14, 2019

Why is this is not copyright infringement?

amelius · on Aug 13, 2019

Shouldn't they be liable for mis-information? Wouldn't that solve the entire problem?

mikeg8 · on Aug 13, 2019

I don’t see any way to actually achieve this at scale let alone any reason to add an opening for more pointless lawsuits. Let’s say they’re liable and you choose to act on incorrect information recieved for free. Do you really try to take them to court and on what grounds?

goatinaboat · on Aug 13, 2019

Yes, Google and similar companies should be 100% responsible for anything published on their platforms. No more “safe harbour”. They have chosen to take positions in many issues, that makes them more like newspapers than phone companies.

akersten · on Aug 13, 2019

Positions like what? And no, banning radicals from their platform for violating their terms of service is not a position.

Even if they were responsible, it's still legal to lie. You don't see pseudoscience websites being taken down because they are objectively false either.

goatinaboat · on Aug 13, 2019

It’s OK for the NYT to attempt to “prevent another Trump situation”. They have an editor and that person is legally responsible for what they publish. They don’t even pretend to be non-partisan. But Google takes a position then hides behind “common carrier” status. It’s not reasonable that they can pick and choose. Either they’re the phone company or they’re a publisher. It’s their right to be either of course, but they must choose.

Similarly for Twitter.

root_axis · on Aug 13, 2019

> It’s not reasonable that they can pick and choose. Either they’re the phone company or they’re a publisher. It’s their right to be either of course, but they must choose.

This is 100% wrong, the opposite is true. The law explicitly protects website operators from being liable for content posted by 3rd parties while simultaneously granting them the explicit freedom to curate content that they deem objectionable.

pyrale · on Aug 14, 2019

No content on Google is posted there by 3rd parties. Google does select what is displayed and, in the case of snippets, they go out of their traditional way to promote that content.

root_axis · on Aug 14, 2019

The use of the word "post" is my own colloquially imprecise language, the law actually states

> No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider.

So content indexed by google absolutely falls under the definition of "provided by another information content provider"

pyrale · on Aug 14, 2019

Providing links is indeed within this definition. However, cards go beyond that: by selecting one out of the many results promoting it and possibly alterating its meaning by selecting which parts, and how it is displayed goes far beyond merely displaying content provided by others.

Of course, there is plenty of room for google attorneys to wiggle, but in the end the objective for them is to 1) give credibility to a source and 2) to get the benefits of being the providers of information.

akersten · on Aug 13, 2019

Common carrier and safe harbor are 100% separate and distinct concepts. The same way that a forum could have a theme ("political party X posts only") and still be allowed to remove illegal content is Safe Harbor (both curation at their discretion and no responsibility for illegal posts) - and I don't see how one could be against that - and Google is nowhere near that, whatever "positions" you envision them to have taken.

Google and other tech never claimed to be common carriers, and even internet service providers have been cleared of that status - barely anyone is legally required to transmit without discretion (it's pretty much just phone companies). So why make it about Google and Twitter, and start with ISPs?

goatinaboat · on Aug 13, 2019

Stay tuned for the next batch of revelations from Project Veritas.

root_axis · on Aug 13, 2019

They're not like either. If anything they're like a phone book for URLs instead of phone numbers.

goatinaboat · on Aug 13, 2019

Except this is a phone book that sorts not alphabetically (no pun intended) but according to its own interests. No phone company ever did that.

root_axis · on Aug 13, 2019

Of course it doesn't sort the internet alphabetically, that'd make no sense and be a bad user experience as well as optimize for URLS starting with A.

goatinaboat · on Aug 13, 2019

I don’t mean literally alphabetically but according to some objective measures. In the old days it was by incoming links (PageRank). But now it is opaque and many people are finding that it orders by whatever is best for Google, not for the user.

root_axis · on Aug 13, 2019

There is not nearly enough room on the front page for everyone that wants to be there, google has to make subjective decisions about what shows up there, it's impossible to do it any other way.

amelius · on Aug 13, 2019

They could randomize it. Allow everybody to be on the front page an equal number of times.

Dylan16807 · on Aug 13, 2019

Then it's spam farms as far as the eye can see, because they can enter a hundred times as much as everyone else.

root_axis · on Aug 13, 2019

I don't think that's a bad idea, but the vast majority of google users would not desire this behavior, especially the way google is used today where users try specific terms to relocate content they have looked up before.

scarface74 · on Aug 14, 2019

It’s called the Yellow Pages. The more money you spent, the more noticeable your business was.

On top of that every Locksmith and towing company had names like “AAAAA Aaron’s Locksmith”.

goatinaboat · on Aug 14, 2019

Yet there is no spam in the Yellow Pages. It’s very unlikely that if you call Aaron he’ll clone your credit card or install hidden cameras in your house. Also, it’s very likely that he actually is a locksmith, has the accreditation he claims to, is a legitimate business registered at Companies House, fully insured, all the things you expect of a normal business.

tylerl · on Aug 14, 2019

Hold on. Google doesn't earn even a penny when you visit their site, find your answer on the search results page, and then leave. That user behavior COSTS Google money, it doesn't earn anything.

If they were trying to monetize you they'd show you an ad that links to your answer and take a profit on the click. Directly giving the user the answer they want is great for the user, but guarantees that Google won't earn any revenue.

So why does Google do it? Simple: because their competitors do. That's the free market for you. Google didn't start that feature, another competitor did; Microsoft made it their primary differentiating feature in fact (remember the "bing and decide" ads?). Google had to adopt the same behavior or lose their customers.

So no, don't blame Google, blame capitalism. This is precisely the kind of feature that you wouldn't get if Google was able to behave as a monopoly.

philipov · on Aug 13, 2019

I think "hypocritical" is a more appropriate description than "Ironic"

OrgNet · on Aug 13, 2019

that is correct and should be considered copyright infringement... I am so tired of the double standard in the US of people VS corporations... Corporations are considered better people then real people.

milesskorpen · on Aug 13, 2019

This is good for users ... for now.

But as Google sucks up the consumer surplus, it's going to be harder and harder to make money from internet businesses, and the final result a few years down the road will be toxic.

The internet isn't going to work too well if its solely reliant on hobbyists.

wolco · on Aug 13, 2019

They could but the hobbists sites are no longer in the serps.

megaremote · on Aug 14, 2019

The funny thing is this used to happen. In the early days, you ask a simple question, you would get the answer in the search results, before they introduced feature snippets. The problem was, because no one was clicking on these useful sites, they were downgraded in the listings to sites that hid the useful info so you had to click on it.

ocdtrekkie · on Aug 13, 2019

This is the subtle truth that I've seen a few folks on Twitter talking about for the past year or so: That Google has slowly but steadily reduced both the outbound clicks to other websites, but also the portion of their revenue that's based on ads hosted by other websites, while bringing both the "results" and the ad placements in-house, where they no longer have to pay out a share to site owners.

Whereas Google was previously a way for sites to be discovered and for sites to generate revenue, it is increasingly becoming the sole source system where data is scraped and imported into Google, and Google keeps all of the revenue to itself.

iClaudiusX · on Aug 13, 2019

The increasing sprawl of non-search widgets invading the search result page reminds me of the AOL years where "the web" was funneled through a narrow portal controlled by one entity.

Having to scroll down past ads, unrelated news, unrelated youtube videos, and ever more of these info boxes has pushed the actual content I'm looking for out to the second page. It's made it much easier to use ddg as default and use the !g flag only when absolutely necessary.

judge2020 · on Aug 13, 2019

And the downfall of AOL was Google because Google had a better product. In order for google to fall, you need a product that's better (or more of what people want; you need to give them a reason to change their search engine).

anotheracc · on Aug 13, 2019

I have no trouble believing Google is going to fall.

Their results have gone into the toilet - I ragequit Google search about once a day and do something else like forum searches.

prawn · on Aug 14, 2019

I stick with Google but increasingly try to tweak my searches to hit forums. They're just that much less likely to be made-for-AdSense content by a copywriter paraphrasing other information from the web.

Ruthalas · on Aug 14, 2019

Do you have any tips for this sort of search tailoring? It's something I try for, but I've yet to find any particularly good keywords to leverage.

jamesmccann · on Aug 14, 2019

Usually I add "reddit" to the search phrase and try to find threads / user-generated and hopefully more organic content this way.

coldpie · on Aug 14, 2019

reddit is good, and so is just "forum" which will turn up specialty forums that haven't been absorbed by one of the Borgs yet.

prawn · on Aug 14, 2019

I often just add "whirlpool" which is a fairly reliable Australian forum that started covering telcos but these days will have things about cars, home maintenance, personal health, etc. Or I add "forum" and that can be enough to tilt the results.

machiaweliczny · on Aug 14, 2019

I usually add site:reddit.com or site:news.ycombinator.com etc. Actually google had a way to search discussion gruops, but they removed this feature as forums don't pay for their adds I suppose.

kryogen1c · on Aug 13, 2019

Same. I cant tell if my questions are just getting more specific and technical, but Google search results have been getting pretty useless in the past year or so

x0 · on Aug 14, 2019

I love how google likes to completely ignore what I'm trying to search for. I wish I could think of an example because it happens to me often, but I can't so I'll make one up.

Imagine you're searching for tail lights for your car or something, but you don't know the size, so you search "Astra tail light size". This might bring up headlights. Wrong but no matter, you'd go on to google "Astra tail light size -headlight -head" or something.

What Google seems to have been doing to me recently is ignoring those negated terms, ignoring quotes, and just giving me the same results again and again. It's really getting annoying. Google seems to assume it knows what I'm looking for, and that my search query is just completely wrong and not what I want.

Note that the car stuff is just an example, I'd expect Google to not give you headlights the second time. It generally but not exclusively happens to me when searching things that are more technical. ESPECIALLY when it's a consumer level thing I'm trying to get info on, Google likes to assume it's giving you errors and you're trying to fix it. Which makes sense for most users, but god it's frustrating when every combination of advanced search parameters you try does nothing!

Google search needs a checkbox or something to turn off it's cleverness and just do an actual search.

glangdale · on Aug 14, 2019

Absolutely. I was trying to figure out something to do with timeouts in an SMT solver called Yices, so I had search strings about signals and alarm and Yices - of course. Google decided that this was a generalized programming question and displayed a lot of stuff about signals and alarm handling that didn't relate to Yices.

How likely a search time is "Yices", ffs? Feels like something that exotic ("statistically unlikely") probably is meant to be in the results by default.

drivebycomment · on Aug 14, 2019

I had no idea what Yices is. So I Googled it - the first link is SRI's Yices SMT solver. I tried "yices alarm" "yices signals" "yices timeout", and all of them showed only links related to Yices in the first result page (various manual pages, types, etc). So my attempt at reproducing your experience has failed.

computator · on Aug 14, 2019

The top Google hit for "yices alarm" is currently the exact Hacker News comment you've replied to. I wonder if Google adapted its search results based on that very comment? Maybe their algorithms shrewdly give more weight to fixing search results when the context mentions Google ("I googled for...", "Google didn't work when...", etc.) and the site is high profile (like HN). That would be very crafty.

O_H_E · on Aug 14, 2019

I am kind of happy and sad in the same time to know that it is not just me.

This is SO annoying.

mrhappyunhappy · on Aug 14, 2019

Same here. It seems the websites that show up top are become more and more spammy and less relevant to my query. I keep seeing all sorts of one sentence hipster 2.0 sites that want me to believe they are a credible source of information.

dredmorbius · on Aug 13, 2019

... and that product will likely repeat the cycle again, on some schedule of another. Might be the 20+ years of Google, might be the few years that Medium was only modestly annoying, might instantly go to shit.

The problem is the business model breeds for this, and we end up replacing one abusive monopoly with another, until we can break that cycle.

For a time it seemed Free Software might ... free us ... from that, though as even that effort's biggest boosters (Eben Moglen, Bradley Kuhn, RMS) freely admit these days, we've been regressing of late, and at an increasing rate.

What's it going to take?

t0ughcritic · on Aug 15, 2019

Searching symptom tracker, google tells me no no you mean symptom checker. No sorry I do need a way to track them not check them

buboard · on Aug 13, 2019

Anecdotally, i am noticing a steady decline in adsense RPMs for the exact same audience over past 2 years, and adsense is now most of the time not filling the ad slot.

Also, it has gotten very hard to rank in google for new sites. SEO blackhat tactics rule, and even local businesses use them. Google went from win win to win-get lost.

Sadly, now that webmasters need it the most, investments in alternatives to search and advertising have dried up. There is almost nothing except G.

ALittleLight · on Aug 13, 2019

I feel the same way about ranking new websites. In high school, friends and I made a website for which we did zero search engine optimization (nor were we even aware of search engine optimization). We still ranked in relevant queries and easily got an audience of about ten thousand uniques a month.

Today, I struggle to get sites I make to show up on Google at all. For my most recent website, even searching for phrases that are unique to my website doesn't cause it to rank. This is more frustrating to me, because I have a Google ad for my website, which drives all of my traffic - so I know Google knows about my website and what keywords are relevant to it.

anotheracc · on Aug 13, 2019

Google has turned an Ocean into a pond - I think we'll all be the better for it once they're gone.

Regularly cannot turn up results that I know exist - a modest change that has no relation to the query meaning and the results often turn up.

This is not the Google Search Engine I remember.

avip · on Aug 13, 2019

I had to google turn an ocean into a pond as I was not sure what the phrase entails. I got few images of small cars and your comment. QED

buboard · on Aug 13, 2019

And googling it now brings up your comment!

drivebycomment · on Aug 14, 2019

I don't know how old you are, so it's hard to tell how long ago you're comparing. But how do you know if this isn't simply because there are a lot more content than it used to be ?

https://www.internetlivestats.com/total-number-of-websites/ indicates the number of web sites is still growing at breakneck speed.

Somewhat dated but still relevant:

https://searchengineland.com/googles-search-indexes-hits-130...

says the growth rate for the number of pages to be still substantial.

ummonk · on Aug 13, 2019

To your last sentence specifically though, I wouldn’t want Google to use ads data to influence non-ads rankings.

luckylion · on Aug 13, 2019

> Whereas Google was previously a way for sites to be discovered and for sites to generate revenue, it is increasingly becoming the sole source system where data is scraped and imported into Google, and Google keeps all of the revenue to itself.

I wondered yesterday: if you provide microdata, Google scrapes it, and you later decide to remove your sites from Google - is Google allowed to keep the microdata and continue to publish it?

michaelmrose · on Aug 13, 2019

If you once communicated the fact that 2+2 is 4 and Joe makes very good spaghetti you own the copyright to the text you published but neither the fact nor the opinion belongs to you in any meaningful way nor should it.

luckylion · on Aug 13, 2019

That's true, but a collection of facts ("a database") falls under copyright.

wongarsu · on Aug 13, 2019

Not sure why you are downvoted. In the EU databases fall under copyright, which indeed leaves the question how google legally deals with this (technically database right isn't copyright, but in this context that's a technicality).

Also, to quote from the Wikipedia article [1]: An owner has the right to object to the copying of substantial parts of their database, even if data is extracted and reconstructed piecemeal

1: https://en.wikipedia.org/wiki/Database_right

Dylan16807 · on Aug 13, 2019

> Not sure why you are downvoted.

Because they didn't say "in the EU", and it not being copyright is not just a technicality. Copyright is about creative expression, and utilitarian collections of facts aren't.

wongarsu · on Aug 14, 2019

> Because they didn't say "in the EU"

They also didn't say "in the US". From context you can only assume "in some jurisdiction google cares about"

> Copyright is about creative expression

That's not true, or at least a very US-centric view. The Berne Convention, the international standard for copyright, reads:

"[...] shall include every production in the literary, scientific and artistic domain, whatever may be the mode or form of its expression, such as books, [...] works expressed by a process analogous to photography; works of applied art; illustrations, maps, plans, sketches and three-dimensional works relative to geography, topography, architecture or science."

also

"Collections of literary or artistic works such as encyclopaedias and anthologies which, by reason of the selection and arrangement of their contents, constitute intellectual creations shall be protected as such"

That's lots of things that are not exactly "creative expression" (even though exceptions for pure statements of fact do exist).

https://en.wikipedia.org/wiki/Berne_Convention

https://wipolex.wipo.int/en/text/283698

Dylan16807 · on Aug 14, 2019

"by reason of the selection and arrangement of their contents"

If there was no selection or you make the original selection irrelevant, while also giving your own arrangement, then there's no violation of copyright.

michaelmrose · on Aug 13, 2019

https://www.bitlaw.com/copyright/database.html#data

This doesn't provide any protection for the underlying facts.

jerf · on Aug 13, 2019

No, but if Google imports a database then they are still affected by the compilation copyright. It's too obvious of a hack to "just" claim that "yeah, we imported that entire database, but then we cracked all the facts apart and they're all separate now and it's just as if we never imported the database". That's not how the law works.

Even more interestingly, we've still never yet resolved the question of why Google gets to lift your entire site's contents and re-serve them in arbitrary ways to their own profit in the first place. It's really just a thing that happens on the internet because it was happening on the internet before the lawyers got there. I've said before and still believe that if there was no such thing as a search engine and they were just invented today, they'd be annihilated in court as nothing but one big copyright violation.

michaelmrose · on Aug 15, 2019

I'd rather give up copyright than search engines. Anyone who wants to push too hard ought to consider whether an entire nation might make the same choice.

TheSoftwareGuy · on Aug 13, 2019

IANAL, but as long as google is only distributing the individual facts (not the database of facts) they would be in the clear, legally

lonelappde · on Aug 13, 2019

Removal is irrelevant because Google doesn't rely on a license for its index.

robots.txt is a courtesy, not a legal obligation.

ocdtrekkie · on Aug 13, 2019

I am not sure there's a specific copyright applicable there. You can ask Google to remove your website's data from their index (primarily via robots.txt)... but of course, that also delists you from search. Essentially Google has left the impossible choice to either let them steal your data for free or accept not being findable in the primary search engine on the Internet.

luckylion · on Aug 13, 2019

Yeah, but if you don't get any clicks, that choice is no longer impossible: you're providing value without getting any in return.

Granted, it's still a while away to get into that territory, I think most sites still profit from Google.

Iv · on Aug 14, 2019

I think most of my zero-click search are a quick question on something you can find at wikipedia.

Well, I don't see the problem at google providing a cache to save WP's bandwidth. I block ads anyway...

4ntonius8lock · on Aug 13, 2019

Honestly, the second google started owning and promoting their own properties on search results, anti-trust regulators should have jumped on it.

I'm really not sure how Google can be protected by Section 230 and at the same time control and publish so much directly. Last time I read an article on the topic, google controls 23% of the top 100 sites.

IfOnlyYouKnew · on Aug 13, 2019

I've heard a decent amount of wrong takes on Section 230. But this is the most bizarre, yet.

Neither the CDA, nor section 230 specifically, create the sort of publisher/platform dichotomy people seem to be hung up on.

And Section 230 does exactly the opposite of what people commonly think it does. It's actually right there, in the text:

No provider or user of an interactive computer service shall be held liable on account of (a) any action voluntarily taken in good faith to restrict access to or availability of material that the provider or user considers to be obscene, lewd, lascivious, filthy, excessively violent, harassing, or otherwise objectionable [...]

That seems really easy to understand: you can delete nazi propaganda, porn, bad jokes, or just random user content from your platform without running the risk of thereby assuming liability for the rest.

ocdtrekkie · on Aug 13, 2019

But the inherently problem is that they need to be held liable for the rest. Because the rest is often criminal activity of which those platforms are making a profit. Perhaps removing Section 230 isn't enough, by your definition, because we need a law that explicitly holds platforms liable for profits generated from illegal activity.

icebraining · on Aug 13, 2019

Section 230 didn't prevent Google from being fined $500M for publishing ads for Canadian pharmacies selling drugs to US citizens. What kind of illegal activities are you talking about, exactly?

ocdtrekkie · on Aug 13, 2019

Google regularly distributes malicious websites and malware through ads, and refuses to delist reported malicious websites. (Google Ads is the primary distributor of Windows PC malware today, if my customer support experience is any indication.) And the problem is that Google has a perverse incentive: More bidders for ads means higher bids. Since malicious actors raise the prices, they benefit from bad actors selling ads.

And the problem is that even if someone finally comes in and shuts those actors down, Google kept all the profit from the malicious activity. In order to incentivize Google to police it's ad platform, we need to implement a requirement which seizes all revenue from malicious advertising, retroactively when a malicious account is flagged/reported.

If Google is losing revenue on allowing bad actors on their ad platform, they'll be incentivized to quickly respond to reports and remove them so that legitimate ads, which they make money on, can have those ad slots.

icebraining · on Aug 13, 2019

More clicks on ads that lead to malware also leads to fewer clicks in the future, though.

Have you published data about this anywhere, like a list of reported links that were ignored?

4ntonius8lock · on Aug 14, 2019

> More clicks on ads that lead to malware also leads to fewer clicks in the future, though.

I really doubt this works this way. There's 3 assumptions here, for your scenario to play out, people must pass the following funnels:

1- The person notices the malware

2- The person associates the connection between the ad and the malware.

3- After making the connection they install a non-dummy adblocker. Dummy adblockers like the one by Eyeo whitelist google's ads while actually harming the competition. It benefits them! Note: if I look up adblock on google, uBlock is only mentioned on page 2 of google and only because it's mentioned as a competitor to adblock on a zdnet article. The first whole page is dedicated to the Eyeo plug in.

I'd say very few people will get through that funnel. My experience is that when my family and friends actually seek out help with their computers, they have let it go for years until the computer is a slow mess of malware, self installed spyware in the form of browser add ons and other crazy stuff.

I've actually known a person who buys a computer every couple of years when it 'gets slow' simply to avoid maintenance. The few that I know from IRL relationships that do use an adblock, mostly use adblock by Eyeo, simply because of the domain and ranking on google.

Nasrudith · on Aug 13, 2019

That is a curious definition of need. Are landlords liable in your world for renting apartments to people who run ponzi schemes?