All sounds well and good, until you realize that a "bad domain" in the Google-verse isn't limited to spammers, but also includes, for instance, domains from which something was published that the copyright-mafia didn't take kindly to. Or whatever other obscure reasons and methods Google may have to blacklist domains.
"Spammy" domains is a smokescreen. The issue is lack of transparency and accountability of Google when it comes to penalizing domains. We should simply be able to check this, just like with any RBL.
It's highly inappropriate to call those who fall victim to this "suckers", but it illustrates once again how Google feels about the little people.
Why would Google publicize how and which domains are penalized? That's a terrible idea that would simply play into the hands of spammers and make their job easier.
At the end of the day, it's not Google's job to make your life easier. Their job is to deliver relevant SERPs and monetize them through ads. The extent to which they'll go out of their way to help you is the extent to which not doing so would hurt the relevance of their SERPs, and ultimately the consumer's trust therein.
There seems to be a tendency on HN to think of Google as the Guardian of the Internet, and ascribe powers and responsibilities to them commensurate with that role. They're not said guardians, though: they're a search engine and ad broker. They only protect the Internet out of self-interest, and expecting more of them is unreasonable and unfair for a publicly traded company.
> Why would Google publicize how and which domains are penalized? That's a terrible idea that would simply play into the hands of spammers and make their job easier.
If spam detection relies on obscurity of the signals used, then it isn't good spam detection and for every change it's only a matter of time until spammers adapt. If a trained human can recognize spam in its many forms, then it's possible for spam detectors to do the same without relying on obscurity.
Speaking on the subject, if a human can recognize that a domain is no longer owned by the same entity and no longer has the content it used to, then I don't see what the problem is with whitelisting such a domain.
> At the end of the day, it's not Google's job to make your life easier
At the end of the day it's not your job to explain their reasoning or apologize for their screw-ups. If you're not a Google employee, then you've got no idea what their internal goals are or what their "job" is.
> They only protect the Internet out of self-interest, and expecting more of them is unreasonable and unfair for a publicly traded company
What I don't understand is why in the world a publicly traded company should be treated any differently from a real person. If somebody behaves like a jackass, do you tolerate that person? If somebody betrays your trust, are you going to give a damn about that person's bottom line?
And forget about comparisons to people's interrelationships. Why should publicly traded companies be viewed any differently by consumers and clients? Why would I give a damn if a company is privately owned or public, as long as that doesn't make a difference to my own bottom line?
> If a trained human can recognize spam in its many forms, then it's possible for spam detectors to do the same without relying on obscurity.
You mean... it's theoretically, some-time-in-the-future possible? What, we are all to put up with terrible search results until such time as strong AI has been cracked? I think not. It is perfectly valid for Google's spam detection to "rely on obscurity" if that is the best currently-available solution to the problem, and your criticism on this point is not very reasonable.
> If spam detection relies on obscurity of the signals used, then it isn't good spam detection and for every change it's only a matter of time until spammers adapt.
So? It works until they do adapt, then the cycle repeats itself. There's no such thing as perfect spam prevention. It will always be an arms race.
> At the end of the day it's not your job to explain their reasoning or apologize for their screw-ups. If you're not a Google employee, then you've got no idea what their internal goals are or what their "job" is.
There's not much content for me to address, here. My statement was pretty self-evident.
> If somebody behaves like a jackass, do you tolerate that person?
Google isn't behaving like a jackass. They're simply not going out of their way to do other people's work for them. It's not Google's responsibility to make sure your domain is kosher; that's yours.
I think bad_user is saying that Google decides the definition of kosher, doesn't tell you what that definition is, and changes it when they please.
If I'm reading this right, you are saying that's perfectly ok and if I get caught in the collateral damage, it's my problem and I shouldn't hold it against Google.
> "Spammy" domains is a smokescreen. The issue is lack of transparency and accountability of Google when it comes to penalizing domains. We should simply be able to check this, just like with any RBL.
Completely agree. We bought a domain from a long-time competitor (it was our registered brand name we've had 12 years, but for another TLD) and were unable to check beforehand whether there was a Google penalty associated with it and whether the competitor wanted to part with it because we offered so much money or he had suffered from a penalty and needed to get rid of it. We just had to risk it anyway, for obvious reasons.
The way Google keeps its data and methods secret helps it get away with weak heuristics and keeps many (not all) people from gaming them, but it also hurts a lot of people with legitimate concerns.
I agree that there is a lack of transparency regarding penalties for spammy pages, but the copyright stuff (and censorship requests by the government) is clearly forced upon them.
At least they release a transparency report about the nature and amount of censorship/copyright removals. (it also shows which party requested the removal, and what exactly was removed)
I am not aware of any other search engine that releases this information.
Just an example: removing certain sites from their autocomplete feature was not forced upon them. They are not required to do this, and they are not transparent about it.
Their transparency is also limited to what is forced upon them, not what they do themselves, quietly and secretly. Also note that other properties, like the massively censored YouTube, are completely omitted from their transparency report (unless it ironically enough involves removing links to YouTube...).
The little bit Google is transparent about when it comes to what they omit from "organizing the world's information and making it universally accessible" (Google's mission statement) is only what suits them.
It's a marketing tool to sell the message "it's not our fault". Transparency and openness are not part of Google's DNA, that bit is just marketing.
It's not the flaws in the system that bother me; this stuff is hard. It's the marketing-driven pretense of caring and at the same time arrogantly telling us "suckers" to go f* ourselves.
"...It is a partial historical record that includes more than 95% of the volume of copyright removal requests that we have received for Search since July 2011. It does not include..."
Your post is wonderfully self-righteous but not very helpful or insightful, unlike some of the other comments below. This makes me sad. I hope more people vote the latter up some more.
The fact of the matter is that someone is always unhappy when it comes to spam control. If the restraints are too loose then suddenly everyone is screaming about how "spammers are ruining Google" and "Google is doomed because of spam" (this was very true right before the Panda rollout). However, if Google starts to crack down, then you get more false positives and more people upset for different reasons.
Unfortunately, no pattern recognition system runs at 100%. That's not a hard problem(tm), that's an impossibility. Relatedly, I don't really see what's wrong with assigning a "reputation" to a domain name. As the article states, it's possible to climb back up to "good" status, it just takes some extra work. Kind of like in real life.
I worked on Web-Based Reputation Scores for IronPort/Cisco. I know how hard this is, especially at scale. This is not about them having a reputation as a concept.
My point is that Google suggesting you register a different domain because their href-based reputation sucks over time is a cop-out. This specifically falls under the "Things that are Google's fault".
Matt's suggestion here is definitely practical, but it's a fault in how Google looks at domain reputation. When he talks about "having to renovate a domain" he's talking about you, the owner of a legally acquired domain with no spammy content hosted on it, having to play Google's algorithm, because their definition of "spam" is absolutely massive, vaguely defined, and has way too long a TTL.
And with a massive, undefined, long-term concept of spam, you're going to have ridiculous FPs. Google has them, and is now suggesting that dealing with them is somehow your responsibility.
This is entirely their fault, and it has to do with how they are determining reputation, not that they have reputation at all.
Not to mention (well, I just did) that with $50 or less anyone can point millions of spammy links to a $1 Million domain. Buy another $million domain now because Google can't manage that?
While Google has certainly been pointing this out to people and telling them to get their backlinks in order, I am not sure that people are actually being penalised currently?
Didn't Google do this same sort of "recycling built trust in channel" when they bought Frommer's, gutted it, sold it back to the original owner, BUT kept the social media accounts Frommer's built up (with the old followers) and rebranded those followers into a new Google-owned Zagat Twitter account?
How is that behavior any less "bait-n-switch" (or influenced by money) than buying a site or renting some links?
Oh yes, Google's fitness function is profit, not accuracy.
We've just been lucky enough they're looking at profit on a long enough time scale that loyalty through product quality is a metric they've thought about.
Are you saying that pages with great content typically have few spammy links? If so, the jury is out on that thesis, as many of the most popular sites have hundreds to thousands of spam sites that scrape just about everything they do & many of those scraper spam sites do link to the original source. And just by ranking for valuable keywords, over time you will likely get some spammy inbound links from sites that are scraping the Google search results.
How is the typical person supposed to know that there are thousands of spammy backlinks to a site? As far as I know, Google doesn't provide any tools to discover this information. There is, of course, Google Webmasters Tools, but that would only apply after you have purchased the domain. It wouldn't do any good in this case.
Google sure loves to have all that power because it can translate it into tens of billions of dollars. Minus the responsibility of course, because that costs money.
How does one accurately identify a 'spammy' domain without insight into google's data?
Obviously you can check the RBLs for email related spam issues but Google does not expose any interface that would allow you to accurately determine a domain was used for spam in the past.
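For comparison, this is roughly what an email RBL check looks like. It is a minimal sketch of the usual DNSBL convention (reverse the IPv4 octets, append the list's zone, and see whether the name resolves). The zone `dnsbl.example.org` is a placeholder, not a real list, and the function names are made up for illustration. Note that it checks a mail server's IP address, not a domain's standing with Google, which is exactly the gap being complained about.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to look up an IPv4 address in a DNSBL zone."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    # DNSBL convention: octets in reverse order, followed by the zone.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the lookup name resolves, i.e. the IP appears on the list.

    Any resolution failure is treated as "not listed" here, which also hides
    network errors; a real check would want to tell those apart.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder zone; substitute the zone of a list you actually use.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Anyone can run that lookup before accepting mail from a host. There is no equivalent public query for a domain's search penalty status.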
The best way to build a company on the web is to build one that does not overly rely on single traffic sources to begin with, and that imo includes google. Lest the next update of their algo puts you out of business.
The hard part about building a sustainable publishing business model online that avoids the largest single channel is that margins matter & if you have 5% margins or such, a competitor that clones your model AND is in Google's good graces can use their search-driven profits to noise up other channels and drive your margins into negative territory. (In social media competitors can target ads at just your followers, competitors can buy your branded keywords in Google, etc. ... most of the big platforms sell access across that way.)
It is not uncommon to see market participants take profits from one line of business or one channel & use it to undermine competitors who were succeeding in other channels.
So let's say you avoid search entirely...how do you sustainably compete against a competitor who has a similar footprint to you AND is leveraging search-driven profits to come after you?
Good point, you need to do everything right. But I think my point about over-reliance on a single source of traffic (and thus, indirectly of income) stands.
The spam that's being referred to is link spam, i.e. buying large numbers of links, comment spam, etc. in order to boost the target domain's search rankings. There's no one tool that will give you an exact answer regarding the domain's penalty status. Putting the domain into something like ahrefs.com and looking at the linking domains and the anchor text section should give you a flavour, though. If it's using spammy links you'll see things like lots of links in (poor quality) comments coming into the domain, links from unrelated or poor quality sites, and lots of links targeting a specific phrase. If you see any links targeting pharmaceuticals or gambling, that's usually a bad sign. Obviously you have to use your judgement when looking at these kinds of reports: no one tool will be seeing the same dataset that Google sees, so it's possible it has missed a lot of links. Legitimate sites also suffer from scraping and appear on spammy sites, so you have to look at the link profile as a whole rather than the odd bad link.
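If you export the backlinks from one of those tools, a rough first pass over the data could look like the sketch below. Everything in it is an assumption for illustration: the `(source_domain, anchor_text)` input shape, the `RISKY_TERMS` word list, and the function name are made up, and none of it reflects how Google actually scores links. It only summarizes the signals described above (anchor text concentration and pharma/gambling anchors) so you can eyeball them.

```python
from collections import Counter

# Illustrative word list only; extend it to suit the niche you are checking.
RISKY_TERMS = {"viagra", "cialis", "casino", "poker", "pharmacy", "payday"}


def link_profile_summary(backlinks):
    """Summarize a list of (source_domain, anchor_text) pairs.

    The result is meant for human review, not as a verdict on the domain.
    """
    anchors = Counter(anchor.strip().lower() for _, anchor in backlinks)
    total = sum(anchors.values())
    top_anchor, top_count = anchors.most_common(1)[0] if anchors else ("", 0)
    # An anchor is flagged when any of its words is on the risky list.
    risky = sorted(a for a in anchors if any(w in RISKY_TERMS for w in a.split()))
    return {
        "total_links": total,
        "linking_domains": len({domain.lower() for domain, _ in backlinks}),
        "top_anchor": top_anchor,
        # A very high share for one phrase suggests links built to target it.
        "top_anchor_share": top_count / total if total else 0.0,
        "risky_anchors": risky,
    }


if __name__ == "__main__":
    sample = [
        ("a.example", "cheap viagra"),
        ("b.example", "cheap viagra"),
        ("c.example", "cheap viagra"),
        ("a.example", "Example Inc"),
    ]
    print(link_profile_summary(sample))
```

A high `top_anchor_share` or a non-empty `risky_anchors` list is only a prompt to look closer. As noted above, scrapers link to legitimate sites too, so judge the profile as a whole.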
Except aren't the "spammy" links the ones Google doesn't show in search results? So is your prospective domain not showing up much in Google because there aren't many links to it, or because there are tons of spammy links to it?
> The best way to build a company on the web is to build one that does not overly rely on single traffic sources to begin with, and that imo includes google. Lest the next update of their algo puts you out of business.
The best way would be for people to speak up about search engines, what they are doing and to give newcomers a certain benefit of the doubt so we get some diversity in online search.
Google controls as much as 90% of online search and spends accordingly to maintain and increase that share. It's hard to run a business without 90% of people.
It is really scary. In Germany they have about 100% of my audience. Bing sends me about as many visitors as a single referrer link on some Q&A site. If Google ever closed our Adwords account we would just die. At least I pay them lots of money so they are probably not really interested in shutting us down.
I wonder if this would bother highly-viral pop-up sites at all. Muxtape would have probably set a spammy domain positive in like a day. But, such is the nature of a transitional neighborhood—no corporations, just tons of art boutiques.
That would be an incredible gift to spammers. Just keep spamming until you see your domain health go from green to yellow. Then you can calibrate just how much you can get away with.
It's a nice idea, but will just make the spam problem worse.
it's usually pretty obvious to spammers when they get penalized. their traffic drops and if they use webmaster tools (admittedly ill-advised and unlikely) they get an explicit message telling them they've been caught.
Yeah, perhaps something like a domain report card highlighting any problems their system has currently flagged for it and linking to solutions. Would be useful even on domains you've owned for a long time, just to help determine if there are any concerns.
For example, when Google first started the Panda updates, I read that some sites were being targeted via paid link buys NOT purchased by the actual owner. Would be nice if there was a tool that let the owner find out that was what was happening to his site.
The Penguin update targeted spam links, not Panda (the Panda update penalized thin/spun/lame content). It is confusing...they clearly don't support color diversity :).... In webmaster tools you can report incoming links that are spam. Google has a long way to go. Doing a backlink analysis before a domain purchase is time consuming and most people don't know how to do it.
A spam domain, or any other phony domain, is dangerous to purchase. Such a domain could have been serving a root document with a cache expiration date far in the future that iframes index.html (so the site looks normal) and includes malicious JS controlled by the previous domain owner.
It's worth noting that the converse is also true. If you want to rank easily, buy a domain that wasn't renewed that used to have good SEO - I can fairly regularly pick up sites with a few thousand white-hat links and a 4 or 5 PageRank, restore them to the old cache, and start from there, for a couple hundred dollars. Those links would cost tens of thousands to create.
There are probably a lot of people that would be pissed at me for displaying that information publicly, but oh well.
Does this still work? I know this was the rage back in ~2005. But then google started to check for registration continuity and drop juice from old links if there was a large gap in the whois records.
It still works. I wrote this[0] near the beginning of 2012; when I wrote it, it was a mystery, but I've since heard from a Google engineer that this happened because the Yellow Pages people purchased a domain that ranked highly for the relevant keywords. In other words, you can buy pagerank.
I had an SEO person quote us $2K to get our ranking up and declined, but I know another person in our vertical who used them and is now in the top 10 in organic search.
One of the problems of creating good content is that we are still competing with SEO spammers, even if it's short term.
So the competitor who paid the SEO is a spammer, the person who paid Google AdWords is an angel & the person who has no exposure has no traffic nor revenues, but is virtuous with great content (that almost nobody cares about other than the owner)?
Always collect emails of customers and invite visitors to opt-in to your mailing list.
This way, if Google changes its mind for the 1,000,001st time about ranking you, you will still have a direct communications channel open to your prospects and customers without Google's blessing.
Google wants every site to buy ads, so they engage in everything possible from adding dozens of ads on pages, to demoting entire niches, to FUD to force you to advertise. When Google has ads for a certain keyword, they don't care much about "organic" results, they care that ads are better. In fact, several threads at Webmaster World have shown a lot of obvious examples where Google makes results worse to have a better adclick ratio. Everything Google does is to make money via ads, from Chrome to Android to Search updates, especially during Larry Page's reign. That is the bottom line, the rest is fluff.
Google Shopping is already 100% ads, the rest of the commercial results is 99% ads and advertiser sites (usually huge brands that spend billions on Adwords). Disclosure and ethics however aren't Google's strongest points, lobbying governments is.