Hacker News
Google’s Matt Cutts: Don’t Be The Sucker That Buys The Spammy Domain (searchengineland.com)
95 points by spacestronaut on April 11, 2013 | 53 comments



All sounds well and good, until you realize that a "bad domain" in the Google-verse isn't limited to spammers' domains; it also covers, for instance, domains from which something was published that the copyright-mafia didn't take kindly to. Or whatever other obscure reasons and methods Google may have to blacklist domains.

"Spammy" domains are a smokescreen. The issue is the lack of transparency and accountability on Google's part when it comes to penalizing domains. We should simply be able to check this, just like with any RBL.

It's highly inappropriate to call those who fall victim to this "suckers", but it illustrates once again how Google feels about the little people.


Why would Google publicize how and which domains are penalized? That's a terrible idea that would simply play into the hands of spammers and make their job easier.

At the end of the day, it's not Google's job to make your life easier. Their job is to deliver relevant SERPs and monetize them through ads. The extent to which they'll go out of their way to help you is the extent to which not doing so would hurt the relevance of their SERPs, and ultimately the consumer's trust therein.

There seems to be a tendency on HN to think of Google as the Guardian of the Internet, and ascribe powers and responsibilities to them commensurate with that role. They're not said guardians, though: they're a search engine and ad broker. They only protect the Internet out of self-interest, and expecting more of them is unreasonable and unfair for a publicly traded company.


> Why would Google publicize how and which domains are penalized? That's a terrible idea that would simply play into the hands of spammers and make their job easier.

If spam detection relies on obscurity of the signals used, then it isn't good spam detection and for every change it's only a matter of time until spammers adapt. If a trained human can recognize spam in its many forms, then it's possible for spam detectors to do the same without relying on obscurity.

Speaking on the subject, if a human can recognize that a domain is not owned by the same entity and doesn't have the same content it used to, then I don't see what the problem is with white-listing such a domain.

> At the end of the day, it's not Google's job to make your life easier

At the end of the day it's not your job to explain their reasoning or apologize for their screw-ups. If you're not a Google employee, then you've got no idea what their internal goals are or what their "job" is.

> They only protect the Internet out of self-interest, and expecting more of them is unreasonable and unfair for a publicly traded company

What I don't understand is why in the world a publicly traded company should be treated any differently from a real person. If somebody behaves like a jackass, do you tolerate that person? If somebody betrays your trust, are you going to give a damn about that person's bottom line?

And forget about comparisons to people's interrelationships. Why should publicly traded companies be viewed any differently by consumers and clients? Why would I give a damn if a company is privately owned or public, as long as that doesn't make a difference to my own bottom line?


> If a trained human can recognize spam in its many forms, then it's possible for spam detectors to do the same without relying on obscurity.

You mean... it's theoretically, some-time-in-the-future possible? What, we are all to put up with terrible search results until such time as strong AI has been cracked? I think not. It is perfectly valid for Google's spam detection to "rely on obscurity" if that is the best currently-available solution to the problem, and your criticism on this point is not very reasonable.


> If spam detection relies on obscurity of the signals used, then it isn't good spam detection and for every change it's only a matter of time until spammers adapt.

So? It works until they do adapt, then the cycle repeats itself. There's no such thing as perfect spam prevention. It will always be an arms race.

> At the end of the day it's not your job to explain their reasoning or apologize for their screw-ups. If you're not a Google employee, then you've got no idea what their internal goals are or what their "job" is.

There's not much content for me to address, here. My statement was pretty self-evident.

> If somebody behaves like a jackass, do you tolerate that person?

Google isn't behaving like a jackass. They're simply not going out of their way to do other people's work for them. It's not Google's responsibility to make sure your domain is kosher; that's yours.


I think bad_user is saying that Google decides the definition of kosher, doesn't tell you what that definition is, and changes it when they please.

If I'm reading this right, you are saying that's perfectly ok and if I get caught in the collateral damage, it's my problem and I shouldn't hold it against Google.


Google adapts what is kosher to match unethical behavior, yes. "When they please" makes it sound far more arbitrary, disconnected, and unnecessary.


> "Spammy" domains are a smokescreen. The issue is the lack of transparency and accountability on Google's part when it comes to penalizing domains. We should simply be able to check this, just like with any RBL.

Completely agree. We bought a domain from a long-time competitor (it was our registered brand name we've had 12 years, but for another TLD) and were unable to check beforehand whether there was a Google penalty associated with it and whether the competitor wanted to part with it because we offered so much money or he had suffered from a penalty and needed to get rid of it. We just had to risk it anyway, for obvious reasons.

The way Google keeps its data and methods secret helps it get away with weak heuristics and keeps many (not all) people from gaming them, but it also hurts a lot of people with legitimate concerns.


I agree that there is a lack of transparency regarding penalties for spammy pages, but the copyright stuff (and censorship requests by the government) is clearly forced upon them. At least they release a transparency report about the nature and amount of censorship/copyright removals. (it also shows which party requested the removal, and what exactly was removed) I am not aware of any other search engine that releases this information.

http://www.google.com/transparencyreport/

Edit: Here's a list of all removal requests: http://www.google.com/transparencyreport/removals/copyright/...


Just an example: removing certain sites from their autocomplete feature was not forced upon them. They are not required to do this, and they are not transparent about it.

Their transparency is also limited to what is forced upon them, not what they do themselves, quietly and secretly. Also note that other properties, like the massively censored YouTube, are completely omitted from their transparency report (unless it ironically enough involves removing links to YouTube...).

The little bit Google is transparent about when it comes to what they omit from "organizing the world's information and make it universally accessible" (Google's mission statement) is only what suits them.

It's a marketing tool to sell the message "it's not our fault". Transparency and openness are not part of Google's DNA, that bit is just marketing.

It's not the flaws in the system that bother me; this stuff is hard. It's the marketing-driven pretense of caring and at the same time arrogantly telling us "suckers" to go f* ourselves.


> Here's a list of all removal requests

It's not.

http://www.google.com/transparencyreport/removals/copyright/...

"...It is a partial historical record that includes more than 95% of the volume of copyright removal requests that we have received for Search since July 2011. It does not include..."


Sounds like Google's problem.

"Remember folks! Don't use SEO! You're subverting das Googlebot!"

But play by all their algorithmic rules, or you may be randomly penalized.

This is called a False Positive. It's Google's fault, not the legitimate domain owner's. This is a hard problem(tm) for sure, but not an excuse.


Your post is wonderfully self-righteous but not very helpful or insightful, unlike some of the other comments below. This makes me sad. I hope more people vote the latter up some more.

The fact of the matter is that someone is always unhappy when it comes to spam control. If the restraints are too loose then suddenly everyone is screaming about how "spammers are ruining Google" and "Google is doomed because of spam" (this was very true right before the Panda rollout). However, if Google starts to crack down, then you get more false positives and more people upset for different reasons.

Unfortunately, no pattern recognition system runs at 100%. That's not a hard problem(tm), that's an impossibility. Relatedly, I don't really see what's wrong with assigning a "reputation" to a domain name. As the article states, it's possible to climb back up to "good" status, it just takes some extra work. Kind of like in real life.


I worked on Web-Based Reputation Scores for IronPort/Cisco. I know how hard this is, especially at scale. This is not about them having a reputation as a concept.

My point is that Google suggesting you register a different domain because their href-based reputation sucks over time is a cop-out. This specifically falls under the "Things that are Google's fault".

Matt's suggestion here is definitely practical, but it's a fault in how Google looks at domain reputation. When he talks about "having to renovate a domain" he's talking about you the consumer of a domain legally acquired and without spammy content hosted on it, having to play Google's algorithm, because their definition of "spam" is absolutely massive, vaguely defined, and with way too long of a ttl.

And with a massive, undefined, long term concept of spam, you're going to have ridiculous FPs. Which Google has, and which it is now suggesting are somehow your responsibility.

This is entirely their fault, and it has to do with how they are determining reputation, not that they have reputation at all.


Not to mention (well, I just did) that with $50 or less anyone can point millions of spammy links to a $1 Million domain. Buy another $million domain now because Google can't manage that?


He is talking about on page content, not backlinks.


The same, if not worse applies to spammy backlinks.


While Google has certainly been pointing this out to people and telling them to get their backlinks in order, I am not sure that people are actually being penalised currently?


Didn't Google do this same sort of "recycling built trust in channel" when they bought Frommer's, gutted it, sold it back to the original owner, BUT kept the social media accounts Frommer's built up (with the old followers) and rebranded those followers into a new Google-owned Zagat Twitter account?

How is that behavior any less "bait-n-switch" (or influenced by money) than buying a site or renting some links?


Oh yes, Google's fitness function is profit, not accuracy.

We've just been lucky enough they're looking at profit on a long enough time scale that loyalty through product quality is a metric they've thought about.


How is it a false positive if there are thousands of spam links still pointing to the domain?


Because Google wants to get searchers to the best content for the query, not to the content with the fewest spammy links.

Usually they are correlated, not always though.


What's correlated?

Are you saying that pages with great content typically have few spammy links? If so, the jury is out on that thesis, as many of the most popular sites have hundreds to thousands of spam sites that scrape just about everything they do & many of those scraper spam sites do link to the original source. And just by ranking for valuable keywords, over time you will likely get some spammy inbound links from sites that are scraping the Google search results.


How is the typical person supposed to know that there are thousands of spammy backlinks to a site? As far as I know, Google doesn't provide any tools to discover this information. There is, of course, Google Webmasters Tools, but that would only apply after you have purchased the domain. It wouldn't do any good in this case.


There are a few third party indexes used by SEOs for this purpose:

http://www.opensiteexplorer.org/ https://www.majesticseo.com/ https://ahrefs.com/

Even once you own the domain, the links Google reports in GWT can be rather hit and miss...


>> "How is it a false positive if there are thousands of spam links still pointing to the domain?"

I can accuse you of being a murderer but the police aren't going to arrest or ruin your (financial) life automatically.

Give me your site name and I'll give you a $5 gift, a gift that keeps giving for life with Google :) http://fiverr.com/gigs/search?utf8=%E2%9C%93&query=senuk...

Google sure loves to have all that power because it can translate it into tens of $ billions. Minus the responsibility of course because that costs money.


How does one accurately identify a 'spammy' domain without insight into google's data?

Obviously you can check the RBLs for email related spam issues but Google does not expose any interface that would allow you to accurately determine a domain was used for spam in the past.

The best way to build a company on the web is to build one that does not overly rely on single traffic sources to begin with, and that imo includes google. Lest the next update of their algo puts you out of business.
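
For anyone who hasn't done the RBL check mentioned above: by convention a DNS-based blocklist is queried by reversing the IPv4 octets and appending the list's zone, and an answer means "listed". A minimal Python sketch follows. The zone `dnsbl.example.org` and both function names are made up for illustration, and this only covers mail-server IPs; nothing like it exists for Google's domain penalties.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the lookup resolves, False if the name does not exist.

    `resolve` is injectable so the logic can be tried without network access.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


# Hypothetical zone name; substitute a blocklist you actually use.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

The point of the comparison is that this is a cheap, public lookup anyone can run before buying an IP or a host.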


The hard part about building a sustainable publishing business model online that avoids the largest single channel is that margins matter & if you have 5% margins or such, a competitor that clones your model AND is in Google's good graces can use their search-driven profits to noise up other channels and drive your margins into negative territory. (In social media competitors can target ads at just your followers, competitors can buy your branded keywords in Google, etc. ... most of the big platforms sell access across that way.)

It is not uncommon to see market participants take profits from one line of business or one channel & use it to undermine competitors who were succeeding in other channels.

So let's say you avoid search entirely... how do you sustainably compete against a competitor who has a similar footprint to you AND is leveraging search-driven profits to come after you?


Good point, you need to do everything right. But I think my point about over-reliance on a single source of traffic (and thus, indirectly of income) stands.


The spam that's being referred to is link spam, i.e. buying large numbers of links, comment spam, etc. in order to boost the target domain's search rankings. There's no one tool that will give you an exact answer regarding the domain's penalty status. Putting the domain into something like ahrefs.com and looking at the linking domains and the anchor text section should give you a flavour, though. If it's using spammy links you'll see things like lots of links in (poor quality) comments coming into the domain, links from unrelated or poor quality sites, and lots of links targeting a specific phrase. If you see any links targeting pharmaceuticals or gambling, that's usually a bad sign. Obviously you have to use your judgement when looking at these kinds of reports: no one tool will be seeing the same dataset that Google sees, so it's possible it has missed a lot of links. Legitimate sites also suffer from scraping and appear on spammy sites, so you have to look at the link profile as a whole rather than the odd bad link.
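
If you export the backlinks from one of those tools, the eyeballing described above can be roughed out in a few lines of Python. This is only a sketch under assumptions: the input is a list of (linking domain, anchor text) pairs, and the function name and the `RISKY_TERMS` list are invented for illustration. It makes no claim about how Google scores anything.

```python
from collections import Counter

# Illustrative terms only; a real review would use judgement, not a fixed list.
RISKY_TERMS = ("viagra", "cialis", "casino", "poker")


def summarize_link_profile(links):
    """Summarize (linking_domain, anchor_text) pairs from a backlink export."""
    anchors = [anchor.strip().lower() for _, anchor in links]
    domains = {domain.lower() for domain, _ in links}
    counts = Counter(anchors)
    top_anchor, top_count = (counts.most_common(1)[0] if counts else ("", 0))
    risky = sum(1 for a in anchors if any(term in a for term in RISKY_TERMS))
    total = len(anchors)
    return {
        "total_links": total,
        "linking_domains": len(domains),
        "top_anchor": top_anchor,
        # A very high share for one commercial phrase suggests targeted link buying.
        "top_anchor_share": top_count / total if total else 0.0,
        "risky_anchor_links": risky,
    }
```

A high `top_anchor_share` on a money phrase, or any `risky_anchor_links` on an unrelated site, is a prompt to look closer, not a verdict.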


I suppose you could google the domain and see what kind of hits you get, if any.


Except aren't the "spammy" links the ones Google doesn't show in search results? So is your prospective domain not showing up much in Google because there aren't many links to it, or because there are tons of spammy links to it?


> The best way to build a company on the web is to build one that does not overly rely on single traffic sources to begin with, and that imo includes google. Lest the next update of their algo puts you out of business.

The best way would be for people to speak up about search engines, what they are doing and to give newcomers a certain benefit of the doubt so we get some diversity in online search.

Google controls as much as 90% of online search and spends accordingly to maintain and increase that share. It's hard to run a business without 90% of people.


It is really scary. In Germany they have about 100% of my audience. Bing sends me about as many visitors as a single referrer link on some Q&A site. If Google ever closed our Adwords account we would just die. At least I pay them lots of money so they are probably not really interested in shutting us down.


http://en.wikipedia.org/wiki/Gentrification

Internet slums?

I wonder if this would bother highly-viral pop-up sites at all. Muxtape would have probably set a spammy domain positive in like a day. But, such is the nature of a transitional neighborhood—no corporations, just tons of art boutiques.


One possible solution for a bad backlink profile: drop the registration on the domain then re-register with new info.

Google claims that they will "wipe" a backlink profile if they see a domain drop and later re-registration by a new owner.

That wasn't always the case. I think as recently as 3-4 years ago buying dropped domains for backlinks was a big tactic.


How do you prevent someone else from picking it up when dropped?


it would be nice if google gave us a decent tool to check domains before we bought them. Currently they hide most (>90%) of back links.


That would be an incredible gift to spammers. Just keep spamming until you see your domain health go from green to yellow. Then you can calibrate just how much you can get away with.

It's a nice idea, but will just make the spam problem worse.
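
To make the calibration worry concrete, here is a toy Python sketch. It assumes a hypothetical yes/no health check `is_flagged(n)` that flips once some hidden volume is exceeded and never flips back; the function name and that monotonic behaviour are assumptions for illustration, not anything Google exposes.

```python
def largest_unflagged_volume(is_flagged, upper_bound):
    """Binary-search the largest n in [0, upper_bound] that is not flagged.

    Assumes is_flagged(0) is False and that flagging is monotonic in n.
    """
    lo, hi = 0, upper_bound
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the range always shrinks
        if is_flagged(mid):
            hi = mid - 1
        else:
            lo = mid
    return lo
```

With a range of a million possible values, that is on the order of twenty probes to pin down the hidden threshold, which is why even a coarse green/yellow indicator leaks a lot.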


it's usually pretty obvious to spammers when they get penalized. their traffic drops and if they use webmaster tools (admittedly ill-advised and unlikely) they get an explicit message telling them they've been caught.


Yeah, perhaps something like a domain report card highlighting any problems their system has currently flagged for it and linking to solutions. Would be useful even on domains you've owned for a long time, just to help determine if there are any concerns.

For example, when Google first started the Panda updates, I read that some sites were being targeted via paid link buys NOT purchased by the actual owner. Would be nice if there was a tool that let the owner find out that was what was happening to his site.


The Penguin update targeted spam links, not Panda (the Panda update penalized thin/spun/lame content). It is confusing... they clearly don't support color diversity :).... In webmaster tools you can report incoming links that are spam. Google has a long way to go. Doing a backlink analysis before a domain purchase is time consuming and most people don't know how to do it.


Haha, thanks for the heads up. I knew it was one of the P animals :)


Spam or any other phony domain is dangerous to purchase. Such a domain could have been serving a root document with a cache expiration date far in the future that iframes index.html (so the site looks normal) and includes malicious JS controlled by the previous domain owner.


It's worth noting that the converse is also true. If you want to rank easily, buy a domain that wasn't renewed that used to have good SEO - I can fairly regularly pick up sites with a few thousand white-hat links and a 4 or 5 PageRank, restore them to the old cache, and start from there, for a couple hundred dollars. Those links would cost tens of thousands to create.

There are probably a lot of people that would be pissed at me for displaying that information publicly, but oh well.


Does this still work? I know this was the rage back in ~2005. But then google started to check for registration continuity and drop juice from old links if there was a large gap in the whois records.


It still works. I wrote this[0] near the beginning of 2012; when I wrote it, it was a mystery, but I've since heard from a Google engineer that this happened because the Yellow Pages people purchased a domain that ranked highly for the relevant keywords. In other words, you can buy pagerank.

[0]http://lee-phillips.org/hitchYellowPages/


Similar to buying a dedicated host, and the previous host was spamming emails and the IP is blacklisted.


An SEO person quoted us $2K to get our ranking up and we declined, but I know another person in our vertical who used them and is now in the top 10 on organic search.

One of the problems of creating good content is that we are still competing with SEO spammers, even if it's short term.


So the competitor who paid the SEO is a spammer, the person who paid Google AdWords is an angel & the person who has no exposure has no traffic nor revenues, but is virtuous with great content (that almost nobody cares about other than the owner)?


?


Always collect emails of customers and invite visitors to opt in to your mailing list. This way, if Google changes its mind about ranking you for the 1,000,001st time, you will still have a direct communications channel open with your prospects and customers without Google's blessing.


Google wants every site to buy ads, so they engage in everything possible from adding dozens of ads on pages, to demoting entire niches, to FUD to force you to advertise. When Google has ads for a certain keyword, they don't care much about "organic" results, they care that ads are better. In fact, several threads at Webmaster World have shown a lot of obvious examples where Google makes results worse to have a better adclick ratio. Everything Google does is to make money via ads, from Chrome to Android to Search updates, especially during Larry Page's reign. That is the bottom line, the rest is fluff.

Google Shopping is already 100% ads, the rest of the commercial results is 99% ads and advertiser sites (usually huge brands that spend billions on Adwords). Disclosure and ethics however aren't Google's strongest points, lobbying governments is.



