Inside Google's Secret War Against Ad Fraud (adage.com)
108 points by lxm on May 30, 2015 | 57 comments



They make it sound a lot sexier than it is. When Blekko was serving up search pages we had over 200,000 machines at any given time identified as botnet/fraud sources. Perhaps it was because we were providing search results to "tier n" search portals that we got to see more of the folks trying to scam the advertisers but we got really really good at not serving up ads or counting clicks on robots. Something ad publishers really appreciated.

If you're serving up SERPs there are lots of interesting things you can do, like make white space clickable but not go anywhere (humans click on the ads, bots click on the giant click box). Serve up an invisible "first ad" on the top left, center, and right of the page. Tons of these robots just click the top ad they think they find. When IPs in Europe, Brazil, Mexico and Ukraine all search for the exact same English string and always click on the third ad down, well, you can just put all of those addresses in the equivalent of an ad "virtual kingdom" (they think they click but they don't go anywhere useful).
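A rough sketch of the "same string, same ad slot, many countries" check described above, in Python. This is only an illustration, not Blekko's actual system: the log layout `(ip, country, query, ad_position)`, the function name `flag_coordinated_clicks`, and the `min_countries` threshold are all assumptions made up for the example.

```python
from collections import defaultdict

def flag_coordinated_clicks(click_log, min_countries=3):
    """Return IPs whose (query, ad position) pattern shows up from many countries.

    click_log: iterable of (ip, country, query, ad_position) tuples.
    The tuple layout and the threshold are hypothetical.
    """
    countries = defaultdict(set)  # pattern -> countries it was seen from
    ips = defaultdict(set)        # pattern -> IPs that produced it
    for ip, country, query, ad_position in click_log:
        # Normalize the query so trivial case/whitespace changes still match.
        key = (query.strip().lower(), ad_position)
        countries[key].add(country)
        ips[key].add(ip)

    flagged = set()
    for key, seen in countries.items():
        if len(seen) >= min_countries:
            flagged |= ips[key]
    return flagged

# Example data only; the addresses come from documentation ranges.
sample_log = [
    ("198.51.100.1", "DE", "cheap car insurance", 3),
    ("198.51.100.2", "BR", "cheap car insurance", 3),
    ("198.51.100.3", "MX", "cheap car insurance", 3),
    ("198.51.100.4", "UA", "Cheap car insurance", 3),
    ("203.0.113.9", "US", "weather boston", 1),
]
suspects = flag_coordinated_clicks(sample_log)
```

The flagged addresses would then be the ones routed to the trap ads. A real filter would presumably also look at timing and volume, since a popular query can legitimately come from several countries.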

Lots of funny stories, my favorite was sending Google traffic to Google ads and having Google tell us it was fraudulent traffic. That was pretty funny.


My guess is, given the scale, and amount of money on the line, the attacks against Google are far more sophisticated than just bots blindly clicking on ads.


I would hope so! But even a place like Ask.com is clearly a couple million dollars a day in ad revenue.


I wrote an ad system that was in part used to backfill Ask in a few countries and even with just backfill I was amazed at how much traffic they were still doing. Though I'm pretty sure most of it was toolbar traffic.


Yes, there is so much money in it that it's very likely there are bots with an AI trained on real human behavior samples, making it hard to distinguish from real clicks.


Actually easier than this, a bit of spyware that looks at your searching/clicking habits over a very long period and then sends that to a big mixer which then replays the same search/click with the same timing and mouse motion. I remember seeing a World of Warcraft automation script being demonstrated and thinking that it was probably used to abuse ad networks.


Sad about the demise of the Blekko search engine. It's now part of IBM, to build their in-house "freebase"-style data source for Watson AI, as Google closed freebase.com.

I wonder what DuckDuckGo will do, as their primary search source, the Yahoo BOSS API, got more expensive on June 1, 2015: https://developer.yahoo.com/boss/search/#pricing ($1.80 per 1000 queries; the free tier is no more)


$1.80/CPM should be absorbable with ad revenue, but I fear the average DuckDuckGoer runs ad blockers, blocks javascript, runs Ghostery and doesn't click on ads very often...

Looks like the old pricing was $0.80/CPM. Any idea if they have a direct contract with DDG?
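To put the two prices in perspective, here is a back-of-the-envelope calculation in Python of the ad click-through rate per query needed just to cover the API fee. Only the $1.80 and $0.80 per 1000 queries figures come from this thread; the $0.25 revenue per click is a made-up placeholder, not a DDG number.

```python
def breakeven_ctr(cost_per_1000_queries, revenue_per_click):
    """Fraction of queries that must end in a paid ad click to cover the API fee."""
    cost_per_query = cost_per_1000_queries / 1000
    return cost_per_query / revenue_per_click

# Hypothetical figure, in dollars; real revenue per click is not public here.
ASSUMED_REVENUE_PER_CLICK = 0.25

new_ctr = breakeven_ctr(1.80, ASSUMED_REVENUE_PER_CLICK)  # new BOSS price
old_ctr = breakeven_ctr(0.80, ASSUMED_REVENUE_PER_CLICK)  # old BOSS price
print(f"new: {new_ctr:.2%}  old: {old_ctr:.2%}")
```

Under that assumed $0.25 per click, the break-even rate goes from roughly 0.32% to roughly 0.72% of queries, which is why heavy ad blocking among users would matter.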


> When IP's in europe, brazil, mexico and the Ukraine all search for the exact same english string and always click on the third ad down, well you can just put all of that addresses in the equivalent of an Ad "virtual kingdom" (they think they click but they don't go anywhere useful).

How much of a server drain is the "virtual kingdom"? Could you point those trap ads to some host or IP address that takes a long time for the client to reach, offloading the wasted time from your server to their client?


There was a time when Google was a likeable company creating good search engines. Now it is all about battles over the ads. In what sense is that still beneficial to an ordinary web surfer? Is this an inevitable progression or is it a matter of wrong priorities?


If you want good search engines without having to pay to use them, then those search engines need to run ads...


What you say is probably true. I just sometimes feel we've been cajoled into believing a good search engine needs to be funded with huge amounts of revenue from ads, and tracking?

That said, I wouldn't mind if Google, or Bing charged my internet provider (Comcast) for the right to use their services? Yes, Comcast would turn around and tack the extra charge onto my account, but they might not?

A lot of us on here are partial to Google, but the non-tech people in my life really don't care as much as they used to. I haven't heard "Google it" for a while now? I have heard a lot of people say, "I use Bing because of the pretty pictures?", and I'm around a lot of people who still don't quite know which search engine they are on--Yahoo seems to be sneaking into browsers--like some kind of malware?

I imagine by now, Google has most of the good patents that make running a good search engine possible, but maybe they don't? And when will these patents expire?

A good search engine is so important, I wouldn't mind if government got into the search engine game? (Please don't tell me government is too inept, or ruins everything they touch. I don't know if they could do a good job? I wouldn't mind converting the NSA into one large search engine--of course without the spying?)

In summary, I don't know the true cost Google incurs running their search engine? I don't think we will ever get that number at this point? I do wish DuckDuckGo well though. While writing this comment I keep thinking about what the person above me said--something about Yahoo BOSS API taking away the free search tier, and charging $1.80 for 1000 queries? Whatever the case, I hope DuckDuckGo finds a way around this, and keeps improving.


I still hear lots of people of all sorts say "google it", young and old alike. I do see a bit more variety than recently, though, in terms of people using Yahoo and Bing.

As a side note, out of curiosity, are you a native English speaker? Your use of question marks seems a bit bizarre.


I tell people to "bing" things, and I get a lot of weird looks, and very rarely hostility.


Is there a good ad-free search engine that I can pay for?


It is the interesting question of our times isn't it? I did the calculation for Blekko at one time and a couple of million paid subscribers could have floated the base search engine. It is hard to find people to subscribe to things on the internet though unless they are MMO games :-)


Gamify the search engine experience? Give people points for each distinct search query? Have a leaderboard? Have people compete on who can find the answers to questions the fastest using a search engine? Have quests of knowledge where people have to journey through the internet solving problems for other people using search engines to gain points? Sell people power ups during the competitions/quests that allow for them to more efficiently parse through many sources of information?

Idk, the search engine experience has remained the same from what I can see for a couple of decades now. I'm not surprised people wouldn't want to pay for that. But if search engines could be combined with aspects of MMORPGs, I might pay for it, or at least sell my account after I racked up enough points to someone who wants it :P


congratulations cinquemb for being the first to search for "Thailand sex tourism" enjoy your new achievement in your profile! click here to share on Facebook and Twitter!


Even for MMO games, the freemium pay-to-win games are a lot more popular than the straight subscription games.


Even these days, getting people to pay for video games is hard.


The core problem is that while you might be willing to pay a small amount for search results, advertisers will pay a huge amount for your personal info (or equivalently, highly targeted ads).

Remember: you are the product.


You misunderstand the ad market.

Search ads are worth 25X as much as display ads.

Search ads do not get a significant boost from personal info. Display ads do get a significant boost.

Targeting companies would show up to try to buy info from Blekko all the time. We weren't interested on principle, and, they were only offering less than 1% of what we were making off of search ads.


The issue with search engines is that they tend to get very good by using clickthrough data, so the more you use something the better it gets. Indexing and ranking algorithms can only do so much.

The interesting question is: does Google own the clickthrough data, or does the user? Does Google exclusively own the fact that you clicked on the 4th link in Google leading to php.net when you searched for 'php'?

Or can the user or government get this data to alternate search engines for better competition just like they forced Microsoft to open up Office file formats and SMB protocol docs to their rivals?

If you remember the big deal about the optional Bing bar in IE uploading the search term and clicked URL in Google search results to Bing, it looks like many people on here think Google exclusively owns the clickstream data, and that it's not the user's to do with as they want.

This is the reason that a paid search engine won't do very well, just like a paid social network wouldn't, because both rely on a lot of people using it to seed it and a paywall will hurt.


Isn't this a tad presumptuous? Google is a large company - there are likely as many engineers working on ads spam as search quality.


If the thing you want were the first result for every search, I'd imagine there would be quite a big drop in revenue.


I work in adtech, specifically on the bidding side on real-time click forecasting and ad-pricing.

I have to say that compared to other ad exchanges, AdX (Google's ad exchange) is far and away ahead in minimizing the amount of fraudulent traffic that we as ad buyers get exposed to. I give them serious props.


Besides the obvious scale differences, what have you seen to be big differentiators between google and other ad exchanges?


In terms of fraud, we're doing what we can to filter it out when it hits us, but nothing particularly sophisticated yet. On other exchanges, our filters filter out some significant percentage of traffic, but the same filters applied on AdX don't get triggered often. The click and conversion numbers for the ads shown seem to suggest that this is due to Google actually doing a really good job of filtering the fraud before it hits us.


also being a volume ad middleman in the industry, I can bet what the top comment was referring to was price.

adx is the cheapest, because it's the worst. your ad will be seen by users in spots previously occupied by "the banks hate this guy" kind of click bait ads. which have ridiculous ctr!

every time I run experiments there I keep getting calls from clients to take their ads off such-and-such sites before the brand sees it. it's a nightmare.


No, not price, although they are pretty cheap and run a fair auction system (unlike other exchanges...looking at you facebook/appnexus). The CTR on adx is actually one of the better ones, and they tend to have less junk inventory.


I'm reminded of 'Google Will Eat Itself'[1], which purportedly used this technique to purchase stock in Google, and also pay for additional ad-clicks, with the goal of buying out Google entirely from received ad revenue.

I don't know (or really believe) that it ever actually operated in the way described though. More of an art/thought project.

[1] http://www.gwei.org/pages/diagram/diagram.html


I remember when that was published and commenters laughed at it. I specifically remember a sophisticated explanation about how GWEI would never work because the second derivative of their GOOG would never something or another.

Today GWEI owns $400k in GOOG.


"Sasha and a number of his fellow Google employees asked to be referred to by their first names, saying they were concerned for their safety”

"The rapport between original team members and their new peers was apparent as the crew gathered at the Craft Beer Co., a homey London pub"

and they show a picture of some of the team.

Not too worried then.


Guess how many out of 1000 visitors from my last AdWords Content Campaign loaded the favicon.ico? And how many of them do you think signed up to my service!?

Answer: 0/1000 (nada, zero)

Adwords for search worked much better though (real users).

I also tried Google AdSense for some of my websites, but the quality of the ads is very bad and no one clicks on them.

Around ten years ago, Google ads used to be "the shit" (everyone used it) and you were able to both earn money and get quality traffic. Have the publishers turned to other marketplaces or is my current experience what the world wide web ad market looks like today!?


I wish they would get off their butts and wage war against the referral spam problem that has made analytics useless for low traffic sites. The current filter system is not good enough to combat them. They need to provide a special filter with a curated blacklist that automatically removes all past data logged from the spam sites.


I would say Google's move toward HTTPS by default made analytics useless. You can't tell what search terms were used to arrive at your site.

For my low-traffic sites, I would mine them by hand to figure out what people really wanted, and then develop content around that. I don't understand why Google would not want to incentivize that.


Google removed itself from the IAB's working group on SafeFrame, which fought for user and publisher safety. among other things, like preventing the ad from knowing what's on the page it's served on, it also handles whether the ad is in view, preventing most of the botnet traffic that uses 1x1 iframes or inserts ads with a low z-index on the page. the only vulnerabilities left with SafeFrame are giving the user an executable when they click on the ads (nothing can prevent that) and flash exploits (although the most common exploits are blocked fine, it's still vulnerable when it's a full control bug in flash)

but Google got out when they realized it wouldn't give as much info from its adsense network of third party sites as they are used to with insecure adwords (when it shows display ads)


Google's ad network relies on knowing about the page the ad is embedded in both for targeting (AdSense) and apparently fraud prevention.


exactly. it is a trade-off. user privacy vs price on expanding inventory. they chose the latter


It looks like taking down botnets is important to all search engines; here's an old news article about Bing taking down a botnet[0].

Similar to Bing, does Google also pursue and do an actual takedown of these botnets? Most of the article was about showing off how nasty these exploits are, nothing about taking them down.

[0] http://www.cbc.ca/news/technology/huge-zeroaccess-botnet-dis...


WinLister, from the article..

http://www.nirsoft.net/utils/winlister.html


To me this reads as one big Google propaganda ad (by inviting a journalist).


Google style - like they do it with their webspam team.


I'm not sure why the criminals are bothering. AdSense clicks earn almost nothing in my experience. I guess they could be making money by charging businesses to click on their competitors' ads and drain their budgets, but there are only so many companies that are willing to pay to engage in clearly illegal activity like that. The primary beneficiary of the fraudulent clicks that are good enough to evade Google's detection is....Google.


Your experience is not universal. :)

There are ways to make clicks worth more. Be from a US IP address. Stuff your browser with retargeting cookies by visiting high-value brand sites (e.g. car manufacturers, insurance websites, loan products, ecommerce stores).

I've worked on the network side of ad ops and seen fraud. Amateur, easily detected and squelched fraud. It amounted to several thousand dollars a day. If you were the right combination of talented and not-overly-greedy, you could make a comfortable living by doing this fraudulently.

> The primary beneficiary of the fraudulent clicks that are good enough to evade Google's detection is....Google.

The primary beneficiary is the person getting the lion's share of the ad dollar, which often isn't Google. Don't get me wrong, Google has some perverse incentives here, too, but I suspect they'd be way more happy with a 0% fraud world than with the status quo.


If you don't provide value to your customers (Advertisers) due to fraudulent clicks, ultimately your product is less valuable, and you make less money. I think Google's incentives are alright here.


Not in the long run. If Google ad customers aren't seeing value for their ad spend, they'll take their dollars elsewhere.


I'm not so sure. I always figured the cost per click would drop accordingly. If fraud is 50%, advertisers would only bid 50% for a click/impression, and it would be a wash in the end for the advertiser and Google.
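The "it would be a wash" arithmetic can be written out as a small Python sketch. It assumes the advertiser knows the fraud rate and that fraudulent clicks are worth exactly nothing; the $2.00 value per real click is an invented number for illustration.

```python
def discounted_bid(value_per_real_click, fraud_rate):
    """Highest bid per click that still breaks even when some clicks are fake."""
    if not 0 <= fraud_rate < 1:
        raise ValueError("fraud_rate must be in [0, 1)")
    return value_per_real_click * (1 - fraud_rate)

def cost_per_real_click(bid, fraud_rate):
    """What the advertiser effectively pays for each genuine click."""
    return bid / (1 - fraud_rate)

# Hypothetical advertiser: a real click is worth $2.00, half of all clicks are fraud.
bid = discounted_bid(2.00, 0.5)
effective = cost_per_real_click(bid, 0.5)
```

With these inputs the bid halves to $1.00 while the effective cost per real click stays at $2.00, so the advertiser's spend per genuine visitor is unchanged and the lost money comes out of what each paid click is worth to the publisher side.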

I think the real hit is against content creators who aren't able to extract the full value of their work, instead they split the revenue pool with a bunch of adspam sites.


Couldn't Google keep cost-per-click artificially high, benefiting themselves at the expense of advertisers?


I think they're too ethical and just charge whatever their auction system calculates.

Google wants advertisers to be sustainable, not bankrupt them as soon as possible.


Adsense might not pay a lot for most things but there are still a lot of high paying keywords you can target and certainly enough money to be made to draw out bad actors.


Yep. I've built admittedly high-margin and low-volume sites, but I've averaged about $0.75/click on Adsense for several years. I haven't even done any A/B testing either.

Now if I could only ratchet up the volume...


see the big picture.

every ad network has fraud.

fraud causes ctr to not equal advertiser profit.

advertiser pays less and less for ads that do not create profit.

you get less and less for your site.

fraudsters get more bots and keep their earnings stable while prices go down.

back to step one


"Thanks to Google's massive size, the blueprint can then be overlaid on top of Google's wealth of impression data to find chunks of traffic that match up."

    'z00clicker vs. normal click density'

Any idea how/where the normal click density described is sourced from?


Maybe they have a sample user group who are equipped with some keyboard/mouse logging type system that can report to them?

They could even do it to all (or some subset) of visitors via Google Analytics[1] to report the (x,y) position when click occurs. It wouldn't surprise me if an awful lot of the internal use of GA is for ad-fraud detection.

I recall way back in the day some ad networks were using imagemaps[2] instead of plain links because they'd report an (x,y) position relative to the image, and could probably filter out the really dumb clickbots that would hit the precise pixel centre of the element every time.

[1] https://developers.google.com/analytics/devguides/collection...

[2] https://en.wikipedia.org/wiki/Image_map
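For what it's worth, the "dumb clickbot hits the exact centre every time" filter could look something like this in Python. This is a guess at the idea, not how any ad network actually did it: the function name, the `tolerance` and `min_clicks` parameters, and the 300x250 ad size are all assumptions for the example.

```python
def looks_like_center_clicker(clicks, width, height, tolerance=1.0, min_clicks=5):
    """Return True if every click lands within `tolerance` pixels of the element centre.

    clicks: list of (x, y) positions relative to the element's top-left corner,
    as an image map would report them. Thresholds are hypothetical.
    """
    if len(clicks) < min_clicks:
        return False  # too few samples to judge
    cx, cy = width / 2, height / 2
    return all(
        abs(x - cx) <= tolerance and abs(y - cy) <= tolerance
        for x, y in clicks
    )

# Example data for a hypothetical 300x250 ad.
bot_clicks = [(150, 125)] * 6
human_clicks = [(40, 30), (210, 180), (151, 124), (290, 12), (75, 200)]

bot_flag = looks_like_center_clicker(bot_clicks, 300, 250)
human_flag = looks_like_center_clicker(human_clicks, 300, 250)
```

This only catches the laziest bots. Anything that adds random jitter to the click position would pass, which is presumably where comparing against a real click density map comes in.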


"They could even do it to all (or some subset) of visitors via Google Analytics[1] to report the (x,y) position when click occurs. It wouldn't surprise me if an awful lot of the internal use of GA is for ad-fraud detection."

This is what I'd suspected, randomised sampling.


While fraud of any kind is certainly bad, I don't particularly feel enthused about cheering for the advertising industry as the "good guys".



