
Not sure about now, but I worked in the T&S Webspam team (in Dublin, Ireland) until 2021, and we were very much enforcing the rules against cloaking.

It was, however, one of the most difficult types of spam to detect and penalise, at scale.




Is it even well defined? On the one hand, there’s “cloaking,” which is forbidden. On the other hand, there’s “gating,” which is allowed, and seems to frequently consist of showing all manner of spammy stuff and requests for personal information in lieu of the indexed content. Are these really clearly different?

And then there’s whatever Pinterest does, which seems awfully like cloaking or bait-and-switch or something: you get a highly ranked image search result, you click it, and the page you see is in no way relevant to the search or related to the image thumbnail you clicked.


Whatever Pinterest does should result in them being yeeted from all search engines, tbh.


Apologies for not responding quicker.

For context, my team wrote scripts to automate catching spam at scale.

Long story short, there are non-spam-related reasons why one would want their website to show different content to users than to a bot. Say, adult content in countries where adult content is illegal. Or political views, in a similar context.

For this reason, most automated actions aren't built upon a single potential spam signal. I don't want to give too much detail, but here's a totally fictitious example for you:

* Having a website associated with keywords like "cheap" or "flash sale" isn't bad per se. But that might be seen as a first red flag

* Now having those aforementioned keywords, plus "Cartier" or "Vuitton" would be another red flag

* Add to this the fact that we see that this website changed owners recently, and used to rank in the SERPs for different keywords, and that's another flag

=> 3 red flags, that's enough for some automation rule to me.
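
To make that concrete, here's a rough sketch (in Python, entirely my own illustration with made-up signal names and threshold, not anything Google actually runs) of combining several weak signals before any automated action fires:

    # Hypothetical illustration only: signals and threshold mirror the
    # fictitious red-flag example above.
    SPAMMY_KEYWORDS = {"cheap", "flash sale"}
    LUXURY_BRANDS = {"cartier", "vuitton"}
    FLAG_THRESHOLD = 3  # act only when several independent signals co-occur

    def red_flag_count(site: dict) -> int:
        """Count weak signals; any single one alone is harmless."""
        flags = 0
        keywords = {k.lower() for k in site["keywords"]}
        if keywords & SPAMMY_KEYWORDS:
            flags += 1  # flag 1: bargain-style keywords
        if keywords & LUXURY_BRANDS:
            flags += 1  # flag 2: luxury brand names on top of those
        if site["recent_ownership_change"] and site["ranking_topic_changed"]:
            flags += 1  # flag 3: domain recently repurposed
        return flags

    def should_auto_action(site: dict) -> bool:
        return red_flag_count(site) >= FLAG_THRESHOLD

    site = {
        "keywords": ["cheap", "flash sale", "Vuitton", "handbags"],
        "recent_ownership_change": True,
        "ranking_topic_changed": True,
    }
    print(should_auto_action(site))  # True: three weak signals together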

Again, this is a totally fictitious example, and in reality things are much more complex than this (plus I don't even think I understood or was exposed to all the ins and outs of spam detection while working there).

But cloaking on its own is kind of a risky space, as you'd get way too many false positives.


I think they must be penalized, because I see this a lot less in the results than I used to.

And btw (unless we are talking about different things) it was possible to get to the image on the target page, but it was walled off behind a login.


Do you have any example searches for the Pinterest results you're describing? I feel like I know what you're talking about but wondering what searches return this.


Curious: how is it detected in the first place, if not reported like in this case?


Sampling from non-bot IPs and non-bot UAs.
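
In other words (my own sketch of the general idea, not the actual pipeline): fetch the same URL once with a crawler user agent and once with a browser user agent, ideally from a non-datacenter IP, and compare what comes back. A large difference is only one weak signal, for the false-positive reasons mentioned above.

    # Rough illustration: only the User-Agent is varied here; a real system
    # would also vary source IPs and render JavaScript.
    import difflib
    import urllib.request

    BOT_UA = "Googlebot/2.1 (+http://www.google.com/bot.html)"
    BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"

    def fetch(url: str, user_agent: str) -> str:
        req = urllib.request.Request(url, headers={"User-Agent": user_agent})
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")

    def cloaking_suspicion(url: str) -> float:
        """Return dissimilarity between bot and browser views (0.0 = identical)."""
        bot_view = fetch(url, BOT_UA)
        user_view = fetch(url, BROWSER_UA)
        return 1.0 - difflib.SequenceMatcher(None, bot_view, user_view).ratio()

    # Dynamic ads, geo-targeting and A/B tests also cause differences,
    # so a high score alone doesn't prove cloaking.
    print(cloaking_suspicion("https://example.com"))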



