Hacker News
Google: Hide sites to find more of what you want (googleblog.blogspot.com)
222 points by ssclafani on March 10, 2011 | 89 comments



Just tried it out with the manual block page at < http://www.google.com/reviews/t >

I notice that when I block a subdomain (e.g., http://answers.yahoo.com/ ) the page actually shows the entire domain (yahoo.com) as blocked.

Is this just an error in the display, or does it actually only block based on domain? If the second, this significantly limits the usability, since I can't block < http://someobviouslinkspam.blogspot.com > without also blocking every blogspot.com site.


I just observed that for blogs like http://somespam.blogspot.com, it lets you block somespam.blogspot.com alone, not the whole blogspot domain. The same goes for WordPress.
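
To picture the difference between the two behaviours described above, here is a minimal sketch in Python. It is purely illustrative and not how Google implements blocking; the function name `is_blocked` and the suffix-matching rule are assumptions. An entry blocks its own host and every subdomain of it, so blocking yahoo.com also hides answers.yahoo.com, while blocking somespam.blogspot.com leaves other blogspot.com sites alone.

```python
from urllib.parse import urlparse


def is_blocked(url: str, blocklist: set[str]) -> bool:
    """Return True if the URL's host is a blocked entry or a subdomain of one.

    Hypothetical helper for illustration only.
    """
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    # Check the host and each parent suffix: a.b.c -> a.b.c, b.c, c
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))
```

Under this rule the granularity is decided entirely by what gets stored: saving the full host gives per-subdomain blocking, while saving only the base domain blocks everything beneath it.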


Having the same answers.yahoo.com issue here. Very weird.

Edit: Clicking on "show details" it says that answers.yahoo.com is blocked, so maybe they're only showing the base domain here. Still weird.


Click "Download as text file" -- only yahoo.com is included in the download. Based on the file contents (UNIX line endings), I'm guessing that's a raw-ish dump of the database contents.


Ah, good to see Google following our lead :-) (Disclaimer I work at Blekko and we've had this feature from launch, and we don't limit you to 500 sites either)

On a more serious note though, it's nice to see Google validate our assertion that unmodified Google search results are getting poorer and poorer. Not that they would actually say it directly like that, of course. Now let's see if they are willing to drop over a million crappy sites out of their index ...


FWIW, in my experience one of the more difficult aspects of starting and growing a company is knowing if what you believe is true, or just something you want to be true.

So you go to start a company and you say that you believe the search experience is getting worse and worse. That you believe the leader in the market place is fighting a losing battle of trying to algorithmically determine relevance in a world where humans will actively work to subvert the algorithms. That the world would be a better place if a more customizable search experience existed.

A common response is "Well if this is such an important idea they would already be doing it."

So as an entrepreneur you deal with the naysayers and you build something which other people have "... been clamoring for a blacklist for years" and yet has never materialized. And you put it out into the market.

Right up until you deliver your product, nobody knows who is 'right' in this back and forth. It is all academic. You've built the best product you know how to build, you offer it up to the world, and when you ship the debate stops being academic and starts becoming empirical.

When the market leader adds features that mimic ones you've launched with several months ago, it can be simply timing (they may have been going to do it all along and it just happens that they did it now), it can be a response, it can be random I suppose. Regardless of why they did it, it seems to be an unequivocal damnation of pure algorithmic search.

As one of Blekko's founding concepts was that the search experience can be made better by human curation, I choose to interpret it as validation. Not that I expect Google to join in and adopt the "web search bill of rights" anytime soon of course.


I don't think it's fair to call this "an unequivocal damnation of pure algorithmic search." The search is still purely algorithmic. This just allows individual users to choose which results are interesting to them in particular. A particular human's (or group of humans') curation would not be a perfect fit for everyone either.

Similarly, I don't see the ability to choose whether you're doing an image search or a news search as a condemnation of algorithmic search either.


Perhaps it's an overstatement, but you have to recall that Google's "I'm Feeling Lucky" button was designed to return the one page they had deduced you were looking for. That was their stretch goal: a results page with one link, the one they know you're looking for.

Now they have backed off on that, reasoning that it is not truly possible: with a search like "Sony XBR TV" you might be looking for the TV's manual, or to buy the TV, or something else that can't be known. But the goal has always been to have the result you're looking for on the first page. That has been their target and promise for years.

So by creating a feature, and putting it into production, to "block sites" it suggests to me that they are saying "Hey, we know our algorithm is failing and we have given up (perhaps temporarily) trying to fix it. We know it will put sites you don't want to see into the first page. Here is a manual tool to weed those out."

It seems like something of a philosophy change at least. Perhaps Matt will chime in.


I've got to be honest. Blocking sites doesn't seem to be working that well for you guys.

Look at this search - "how to get pregnant" - that exact phrase gets 550K searches on google a month. My wife and I did this search last week actually.

Blekko's top 3 results are 1. get-pregnant-guide.com which is an affiliate site that points to pregnancymiracle.com - one of those crappy ebook sites where they have a bunch of testimonials and ask for you to pay $39.95 for their guide, 2. a digg post that has 1 digg which points to a hubpages article that no longer exists, 3. purelyfitness.com/how-to-get-pregnant which is also an affiliate site that points to pregnancymiracle.com.

Google's top sites are mayoclinic.com, mahalo.com, and howtogetpregnant.net. The least useful to me was howtogetpregnant.net, but they still gave a decent amount of free advice. I actually found Mahalo, which is blocked from your results, to be the most informative. I find Calacanis just as annoying as everyone else does, but this page was more beneficial to me than even mayoclinic.com.


You are right, those results suck. However when you type 'how to get pregnant /health' those results seem to be pretty good.


The suggested slashtag for [how to get pregnant] is /pregnancy ... those results are pretty nice, albeit less medically-focused than the /health results.


> As one of Blekko's founding concepts was that the search experience can be made better by human curation, I choose to interpret it as validation. Not that I expect Google to join in and adopt the "web search bill of rights" anytime soon of course.

I'm not sure if "Human curation" is the concept I'd grope for here.

What I'd like is search that uses a "larger context": what I've searched for and accepted or rejected, what people similar to me have searched for and accepted or rejected, and so on. Being able to implicitly use a lot of information is crucial because there are never going to be enough curators to cover everything.

Google is already using an incredible amount of implicit information and I expect them to use more.

What I see as the real opportunity is more like "self-curation" than some finite stable of curators. What would be cool is a system that lets me keep my list of likes and search finds, essentially my entire preference system, and lets me edit it and send the whole thing to different search providers as well as to friends.


I just lost a little respect for you guys with this comment.


I didn't; I think it's true. I'm shocked Google actually did this. It will cost them advertising dollars, and I honestly thought some of their technical decisions were being driven by maximizing revenue at almost any cost, so this move seems to invalidate that theory. Kudos to Google, I say.


Losing users will also cost them money in advertising dollars.


SEO spam has no actual users bro.


I mean Google. They'll lose users if they don't do something about the SEO spam.


I completely agree. They must improve (or at least maintain) the quality of their search results to keep users. The fact that it is the most popular search engine is what brings in advertising dollars in the first place. So, the users are the key. SEO spam hurts them, just like it annoys us, so this is one of their attempts to fight it.


You can afford not to limit users to 500 because no-one cares about gaming your system. Spammers are where the traffic is, so Google can't do that. But 500 sites are more than sufficient for me..


one man's crappy site is another man's precious precious wikipedia (see: all the disagreements that have already come up here over what to block).

Personally I prefer knowing that those sites are in the index, but don't show up unless I purposefully widen my search to include them (or they appear so low in rankings that it achieves the same effect).


I 100% agree. When you say "Here" are you talking about this thread or other threads? I would like to see them.


For now, they are not admitting defeat:

"In addition, while we’re not currently using the domains people block as a signal in ranking, we’ll look at the data and see whether it would be useful as we continue to evaluate and improve our search results in the future."


Now just fix your search engine so your results aren't total trash and you'll be all set.


Most users don't even block 1 site let alone 500.


Wow. Can't believe how quickly this was added to Google Search, rather than it remaining a Chrome Extension for a long time. Glad to see it, though!


I think one of Google's "secret" weapons is their incredibly fast innovation and time-to-market. Their rivals spend years building a product in secret (e.g., Windows) and then release a giant binary blob to the world whether the world wants it or not. Google's fast-paced changes to their core product are a huge part of what keeps them on top.


One of the advantages ex-Googlers cite as a reason for joining Facebook is that Facebook is much faster at deploying new things. Google has been notoriously slow with most things. As for the Windows comparison, Microsoft releases updates for their Windows products almost daily, so I don't see much difference.


> Microsoft releases updates for their Windows products almost daily, so I don't see much difference.

Sure, but these are minor updates. Look at the major updates. Windows Vista was developed for about 5 years in secret, and when it was eventually released to the world it was a disaster. To keep with the OS comparison, Google has been releasing major OS upgrades every 6 months - 1 year. Android is moving so much faster than Windows or even WP7 and is a huge part of why it's successful. In the time that WP7 has been out (five months? ish?) there have been 0 updates to most phones; maybe a few phones got the small update-to-the-updater. With that same period of time Google has announced and released a new Android version.


Yes, but you compared Google adding a new link to their search results to Microsoft releasing new versions of Windows. That comparison doesn't make sense; the small updates Microsoft makes daily, however, are a valid comparison. You're off on a different argument now, regardless.


I was trying to generalize a bit with their strategies rather than make specific one-to-one comparisons. When Google rolls out a search results page change, potentially hundreds of millions of people see the effect immediately. When Microsoft pushes updates to Windows, there is rarely new functionality added and few people actually notice day to day. Most definitely not all Windows installations get the updates daily (whatever happened to Patch Tuesday? Does MS actually push out daily updates to their customers? I'd be very impressed if this is true).

Anyway, all I was really trying to say is that Google is, in general, faster to react to market changes, faster to push updates, and faster in developing new features when compared to Microsoft. And that's one of their advantages. Sorry if my specific examples became the primary discussion here, I really only meant the Windows comparison to be an example showing a larger strategy.


But Bing ships new features at a pace on par with, if not faster than, Google search.

It might be the case that different types of products have different release cycles. Not strictly though... IE is way slower than Chrome to rev.


I've always had the impression that's because the IE team is kind of paralyzed by a fear of breaking the Web.


Google hires more people per week than have ever left Google to join Facebook. So I don't think slow decisions are that big of a problem at Google, and seeing their recent moves I think they're pretty determined to stay the kind of agile company that is easy to be when you're small.


Microsoft also hires more people per week than have left Microsoft to join Facebook, ever.


They're fast at building stuff when it isn't social.


Truer than you realize: this is a reverse Like button. It's antisocial.


Yeah, but they really have their pants down with this whole DroidDream patch.


"Quickly"?? People have been asking for this for years.


I've been clamoring for a blacklist for years, and they did use to have an [X] button on SERPs, so I imagine it's been in the works for some time now.


It would be interesting to see the percentage of Google users that block even one site. Though even if it's a small percentage, it could help the blockers take out their search frustrations by blocking a site, and help the non-blockers by giving Google hints as to what searchers don't like.

The magical optimization I would prefer would be a non-commercial search. If I search for a piece of gear, sometimes I don't want to buy it and instead want to weigh buying it or just look up reference information. For some searches that is tough, and permanently blocking commercial sites isn't an option. (I've occasionally resorted to limiting my searches to .edu and .org domains, with limited success.) Even temporarily blocking commercial sites might not help, though, since sites like Amazon.com have fantastic reviews on some items.


"by giving Google hints as to what searchers don't like."

Crowd-sourced curation -- this must be Google's eventual goal for the "blocking" feature. Can't content farms be thought of as spam? And, if so, don't the same techniques for spam identification apply, especially the "mark as spam" button in Gmail?

[Conjecture and speculation below as there's no evidence Google is going to modify search results based on users' blocking propensities, but I think the possibilities are worthy of consideration.]

I wonder how this scheme could be gamed? Consider how one might game spam filters to cause a target's future mailings to end up marked as spam. To hit a high-volume target, one would have to get a ton of email addresses under his control placed on the sender's distribution list. Then, when messages arrived, he would mark them all as spam. Although the bad guy's email addresses might make up only a small percentage of the sender's mailing list, this technique would still be effective because the percentage of "activist" recipients required to flag any sender as a spammer is so low (at least I'm told). To address this problem, Google could certainly apply schemes similar to those used to identify click fraud. Such countermeasures would require the bad guy to make his attack appear more organic. Have there been actual instances of such denial-of-service attacks?

With a similar strategy, could, say, an anti-abortion group enlist its membership to banish pro-abortion groups' websites from search results? What if members were instructed to search for phrases like "How to get an abortion" and block Planned Parenthood, etc.? The difference between websites and email lists is that the owner of an email list has direct control and could take countermeasures like changing mail servers, etc. Also, large bulk mailers (MailChimp, etc.) generally have good relationships with large email box providers like Google and can therefore plead their clients' cases directly to a responsible person. How would pro-abortion groups get attention from Google if their site rankings dropped due to such grassroots organizing? Would Google be able to identify this as an orchestrated effort?

How would the behavioral patterns of pro/anti-abortion activists be any different from this forum's treatment of Experts Exchange? Google would observe that one small segment of its userbase blocks the site in high numbers while another segment, when searching for the same terms (MySQL failure XYZ), clicks on Experts Exchange results. Wouldn't users actually seeking information on how to get an abortion recognize the name "Planned Parenthood" and click on those same results which were blocked by activists?


Perhaps the solution would be personalised weightings for search results. You could then build a weighting graph based on blacklists, e.g. if I have a pro-abortion site on my list and it's found on someone else's blacklist then I would have a negative weight for the other sites on their blacklist. Maintaining the graph and weightings would be massively expensive though so I'm not sure when it would be viable.
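
One possible reading of the weighting idea above, sketched in Python. Everything here is an assumption made for illustration: the names `agreement` and `block_scores`, the scoring formula, and the example domains. It is not something Google is known to do. Each other user's blacklist gets a weight based on how well it agrees with my own likes and blocks, and that weight can go negative when they block sites I like.

```python
from collections import defaultdict


def agreement(my_likes: set[str], my_blocks: set[str], their_blocks: set[str]) -> float:
    """Weight in [-1, 1]: positive if they block what I block, negative if they block what I like."""
    if not their_blocks:
        return 0.0
    shared = len(my_blocks & their_blocks)     # sites we both block
    conflicts = len(my_likes & their_blocks)   # sites I like that they block
    return (shared - conflicts) / len(their_blocks)


def block_scores(my_likes: set[str], my_blocks: set[str],
                 others: dict[str, set[str]]) -> dict[str, float]:
    """Score sites I have no opinion on; higher means more likely unwanted."""
    scores: dict[str, float] = defaultdict(float)
    for their_blocks in others.values():
        weight = agreement(my_likes, my_blocks, their_blocks)
        for site in their_blocks - my_blocks - my_likes:
            scores[site] += weight
    return dict(scores)


scores = block_scores(
    my_likes={"a.example"},
    my_blocks={"spam.example"},
    others={
        "u1": {"spam.example", "farm.example"},  # agrees with me
        "u2": {"a.example", "b.example"},        # blocks a site I like
    },
)
```

Even this toy version loops over every other user's list for each person, which hints at why maintaining such a graph at search-engine scale would be expensive.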


For me, this is frustrating because it's not that useful in general search results, but it would be extremely useful in pruning Custom Search Engine results - and that's exactly where it isn't.

I want a one-click way to ban a domain from search engine results because most blacklisted domains in my Wikipedia search engine (http://www.google.com/cse/home?cx=009114923999563836576:1eor...) are porn or filesharing sites. One recognizes such spam in an instant, but it still takes a while to prune down the URL to the right domain, flip to the edit tab, paste it in, flip back, and relocate myself.

I'm not kidding when I say such a one-click button would cut by at least half the time I have to put into cleaning up the CSE results for any given query.


As an experiment, I thought I'd block FoxNews from my results to see if it would stop coming up in my "google news" aggregator.

Sadly, it does not. Would have been neat though.


Google News' Settings allows you to configure "more news from" and "less news from" by publisher.

More news from:

  NPR
  The Economist
  Wired News

Less news from:

  FOXNews
  FOXSports.com
  Wall Street Journal

http://news.google.com/news/settings


Quick, block expertsexchange!


Regardless of how insanely irritating their site is, I've found my answer there.. about once or twice. But I wouldn't block them entirely..


Perhaps Google could introduce an "I'm Feeling Desperate" button for the times you just know your result lies within one of your previously blocked sites.


You could log out..


I've found answers there, too, but I've yet to encounter a situation where they're the only place an answer is found. Now that StackOverflow is around I suspect that's even less likely.


My problem usually, in contrast to other comments here, is that I don't look at the URL of the link I'm clicking on but primarily at the title. Since a lot of questions on EE are very specific, I often click by accident because the question sounds a lot like what I'm trying to find an answer for, only to grunt at having to scroll all the way down ("Ok, I'll give them one more try..") to where usually no answer is found.


First thing I did!


Aside from the "big two" (experts exchange and Mahalo) I actually think the one I'll block first is Wikipedia. Truthfully, I really like Wikipedia as a resource and love to peruse the information there--but if I want the article from Wikipedia, I'll go to Wikipedia and look it up. If I'm searching google, it's probably for something that's not going to be covered well by Wikipedia anyway.

For example, let's say someone has suggested that I use the factory pattern to solve a problem in a project and I'm not intimately familiar with that pattern. I search for "Factory design pattern" on Google and notice that the first 2 results are Wikipedia results. There's some good, basic boilerplate--but that's not what I need (OK, what I really need is probably a trusty copy of the "Gang of Four", so maybe it's a bad--or at least contrived--example).


I disagree. Having wikipedia results at the top of my google searches has been extremely useful. It's easier to find a wikipedia page through google rather than search on the wikipedia site.


That's a fair point; though I generally use Google for, "I want something better than the encyclopedia treatment" on a topic.


Anyone have a list of "bigresource" like sources of bad programming tips? I have been using site:stackoverflow.com but would prefer just to block sites like this: http://mysql.bigresource.com/Fatal-error-Can-t-open-privileg...


Excellent. Thanks Google, you have just saved my sanity.

So can I disable the chrome plugin now? Will it remember the sites I have already blocked?


Check the block list on http://www.google.com/reviews/t . If it lists the pages you blocked through the extension, you'll be good to go. Otherwise you can always add them manually.


Good call. No, you have to start again. Oh well, not a big deal.

Note you can only block 500 sites; I should think that's enough, though.


Many people already know what sites they don't want, and never open these links. But to block those links, do users really have to open a link and return to Google to block it?

Also, the sites that have already been blocked with the extension, will they be auto-blocked?


You could also just go here and add manually to your heart's content:

http://www.google.com/reviews/t


Agreed -- I would love to see a Chrome extension with a one-click button that adds the domain of the page I'm currently on (presumably after having clicked through from a Google search) to my block list.


I think it's to avoid accidental blocks.


Now I'd only wish people to be able to share lists with friends, just like you download blocking lists from adblock :)

But in the meanwhile good riddance Mahalo, eHow, Yahoo Answers and ExpertSExchange, etc


I wonder why Google prefers that you block a whole domain instead of a result (what searchwiki did).

As said in another comment, I wouldn't be comfortable blocking the whole expertsexchange.com domain. I get relevant results sometimes, yet some of the results from the same website are so unhelpful that I'm sure I don't want to see them again (auto-approved solutions and stuff like that).

It also happens when I search for something I already searched for a while back. Some of the results remain irrelevant, but not necessarily the whole domain.


Because there are a raft of bad actors out there with junk sites. Removing them utterly from your personal existence can help make search much more useful. Examples:

- Yahoo! Answers

- Ezine Articles

- Those Efreedom assholes

- Experts Exchange (your experience aside, this place is crap for most everyone else)

- All the junk sites around appliance manuals

- All the other content farms


I think their reasoning is that if you really want to delete something from the results, it has to be really bad. Hence, you'll likely not want to receive search results from that same site.

But I'm no Matt Cutts.

Btw, is the link showing up for everybody? It is not for me.


Hopefully this will get combined with Google Alerts, so I have a one-click way to tell google that people are creating spam on topics I care about.


This seems like a great idea to block the really bad offenders, but doesn't seem granular enough for the sites that have inconsistent quality of content (i.e. a user generated content site can have both shallow/crappy content AND useful content under the same domain).

I wish we could give more contextual feedback, like... THIS link was helpful/relevant, THIS link was not.


It's not intended for use on sites with inconsistent quality. PageRank takes care of the individual pages on those sites.


How can that intent of purpose be conveyed to the user when they are blacklisting whole sites?

They aren't using these blacklists as a signal for influencing the Google algo, YET... but it seems inevitable.

I'm just trying to say it seems a little too much like an axe instead of a scalpel.

Personally I can't wait to start rage blocking domains that have pop over modals for surveys / advertising, or have sound enabled by default.


That's a fair question. Since they aren't using user blacklists as a signal yet I will punt. Were they to start, then I think you have a point that finer-grained data is preferred. They should probably offer both options in that case, i.e. "don't show this result anymore" and "don't show any results from bar.example.com anymore."
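
The two options mentioned above ("don't show this result" versus "don't show anything from this host") could sit side by side in one block list. Here is a rough Python sketch of that idea. The `BlockList` class, the `filter_results` function, and the example URLs are all hypothetical; nothing here reflects an actual Google feature or API.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class BlockList:
    """Hypothetical per-user block list with two granularities."""
    urls: set[str] = field(default_factory=set)    # hide one specific result
    hosts: set[str] = field(default_factory=set)   # hide every result from a host

    def hides(self, url: str) -> bool:
        host = (urlparse(url).hostname or "").lower()
        return url in self.urls or host in self.hosts


def filter_results(results: list[str], blocklist: BlockList) -> list[str]:
    """Drop results hidden by either rule, preserving the original order."""
    return [url for url in results if not blocklist.hides(url)]


blocklist = BlockList(
    urls={"http://foo.example.com/bad"},
    hosts={"bar.example.com"},
)
visible = filter_results(
    [
        "http://foo.example.com/bad",
        "http://foo.example.com/good",
        "http://bar.example.com/anything",
    ],
    blocklist,
)
```

Keeping the two sets separate would also preserve the finer-grained data if the blocks were ever used as a ranking signal.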


Maybe it's just me, but I switched to Bing a few weeks ago, and I'm finding a pretty good increase in search result quality. For me, good search by default > good search only with my help.

OTOH, I imagine that bing will also eventually succumb to content farms and other techniques that will come up, so maybe this is the way of the future.


If the spammers and content-farms were the Empire, this would be like blowing up the Death Star. Brilliant.


That's awesome. I'm glad they are giving the users more options. I searched for how to bake a cake to finally block ehow from my results. How good that feels. Trash sites are about to feel the sting I imagine. That reminds me, need to block about.com too.


What happens if you block Google?


Time-space breaks down. Don't do it!

The conspiracy theorist in me wonders if in a few months users are mysteriously going to have facebook.com added to their block list.


That'll just block Facebook from showing up in Google results which they never did in the first place unless you searched for someone's name.


<meta http-equiv="refresh" content="0;url=about:blank" />


Not sure why they are limiting this to particular browser versions. If I'm running IE7 (which I'm not), what does that have to do with which sites I want to block?


Maybe the way they detect you went back to the search result depends on the modern browser functionality.


Now, if they could only provide a way to disable previews...


Click the magnifying glass on any search-return item.


You know, maybe I'm just slow and don't see the big picture, but I don't understand what all the hubbub is about.

The '-' operator has worked wonders for me for years.


Do you type "-mahalo.com -efreedom.net -allthespamblogsthatannoyme.com", every time you do a search?
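
For what it's worth, the manual approach can at least be scripted. A tiny Python sketch, where the helper name `with_exclusions` is made up for illustration and the exclusion syntax is simply the `-domain` form quoted above:

```python
def with_exclusions(query: str, blocked: list[str]) -> str:
    """Append a '-domain' exclusion term to the query for each blocked domain."""
    return " ".join([query] + [f"-{domain}" for domain in blocked])


query = with_exclusions("factory design pattern", ["mahalo.com", "efreedom.net"])
```

The query grows with every blocked domain, which is one reason a server-side block list is more practical than retyping exclusions.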


I for one am looking forward to typing a technical question and then not getting hundreds of hits from those sites that want to charge money for the answer.

(And to head off the obvious karma steal at the pass - Because sometimes the answer isn't in stackoverflow yet. Duh)


I don't see this button on google.com.br

I hope it's propagated to international Google pages soon.


I have been waiting for this so I can hide Experts Exchange.


What's next? Build your own search results



