How I recorded user behaviour on my competitor’s websites (dejanseo.com.au)
766 points by lukestevens on Aug 23, 2018 | 309 comments



I’d like to defend this guy. What he is doing is testing the trust mechanism.

If he went to Google and said ‘I think the trust mechanism is broken’ Google would say: ‘We know, that’s why we are pushing to move everyone to https.’

‘That isn’t enough. The padlock on the https page gives users a false sense of security.’

‘We don’t agree with that. Where’s your data?’

Google wouldn’t have accepted this. They have pushed full HTTPS hard, and suggesting that it has a negative consequence is unacceptable to them.

His experiment has proven the problem. How else could it have been demonstrated?

Ideally this would have been a large scale study done by academics. But this guy doesn’t have those resources. Nobody is going to fund this research.

The depressing thing here is that everybody is more interested in calling this guy a jerk than dealing with the issues he has raised.

Trust on the internet is broken. This guy did it with ease. Imagine what is being done by those who want to scam millions?

But yeh, call him a jerk and then you can bury your unease beneath a big pile of outrage. It’s fine. Fine. He’s a jerk.


Thank you. I'm not having a good time at the moment. Anyway, the basis of my test hypothesis is that people are easily fooled by a URL, both by HTTPS and by brand recognition (e.g. a familiar name in a subdomain), so I conducted a survey which revealed the very real problem: https://dejanseo.com.au/trust/

Raw data: https://dejanseo.com.au/wp-content/uploads/2017/04/survey-te...


Hey man, I know how hard the hate hits when you explain something like this to a community. It happened to me here too when I talked about the mass weaponization of autonomous systems via cyber attack. One guy said I was somehow right and a crank at the same time and dismissed one of my conclusions out of hand without addressing any of the reasoning behind it. I hurt at the time, but I came to understand it wasn't really directed at me.

The thing you got to realize is that many here make their livings trying to secure systems and we're finding it hopeless. The way you did what you did was fine. In terms of proving the hack you needed to violate Google's trademarks. It's in the very nature of the hack, and as far as I'm concerned, warranted given that they have a bug bounty. Now, I probably would have disclosed it to Google, Bing, etc. ahead of time, but it's your bug. You could have sold it to blackhat scammers and you didn't. For all we know this hack could have been going on for years.

I think most people are confusing their anger at the situation with anger towards you. You're cool.


This was my experience working on election integrity issues.

No good deed goes unpunished.


Thank you! :)


Don't listen to the haters here. The same people upvoted this article 3 days ago, and then promptly forgot about it https://news.ycombinator.com/item?id=17799083


I think you'll find that a lot of people on this site—from lots of political leanings—believe there's a wide gulf between "This behavior is a bad idea and we should use social mechanisms like debate to discourage it" and "This behavior should be illegal and we should point the government's monopoly on violence in your face to make you stop."

The flip side of the defend-to-the-death quote is that caring about someone's right to speak, even caring about that position being well-represented, doesn't mean you have to agree with what they say.


The irony...


I believe that you acted ethically, unlike Google. The history API should be locked behind one of those "RandomSite.com wants to use the History API: Allow / Deny" dialogs, and the TLD and second-level domain should be clearly marked in browsers, to prevent this sort of https://google.com.search.mydomain.cz shenanigans.
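
For illustration, here is a minimal sketch of the kind of marking that comment asks for: pulling the registrable domain out of a URL so it can be shown differently from the subdomain labels in front of it. The function names are hypothetical, and the "last two labels" rule is a loudly stated simplification; it is wrong for suffixes such as .com.au, where a real implementation would have to consult the Public Suffix List.

```python
from typing import Tuple
from urllib.parse import urlsplit


def split_host(url: str) -> Tuple[str, str]:
    """Return (subdomain_part, naive_registrable_domain) for a URL.

    Assumption: the registrable domain is simply the last two labels.
    That is incorrect for multi-label suffixes such as .com.au.
    """
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    registrable = ".".join(labels[-2:])
    subdomain = ".".join(labels[:-2])
    return subdomain, registrable


def highlight(url: str) -> str:
    """Render the host with the registrable domain wrapped in brackets."""
    subdomain, registrable = split_host(url)
    prefix = f"{subdomain}." if subdomain else ""
    return f"{prefix}[{registrable}]"


print(highlight("https://google.com.search.mydomain.cz/results?q=bagels"))
# google.com.search.[mydomain.cz]
```

With the brackets standing in for whatever visual emphasis a browser might use, the leading "google.com" is visibly just a subdomain label of mydomain.cz.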


For what it's worth, I love reading about this stuff, though I specialize in InfoSec so this sort of thing is actually pretty common in our communities.

You would have definitely had a much easier time with them than you are right now.

But for what it's worth, this will blow over soon enough, the internet does not have the greatest memory (unless you actually did something horrendous, which you didn't)


I hope so, and I also hope Chrome gets a fix for this.


I read the article, but still don't get what's Chrome-specific about this vulnerability, or what a good fix would look like.

My reply to someone who proposed making the back button always go to the previous URL: https://news.ycombinator.com/item?id=17826406


The issue is that some web applications don't load what traditionally were discrete pages with their own URLs (e.g. PJAX). It's a trend you'll find in sites built to feel more like applications. Scroll to the bottom of an onion.com article and watch your URL update to the next page without a page refresh. This was done so modern sites built like this could still allow the user to navigate back and forward. It lets the site update the browser's location history and, effectively, what URL that back button will point to. I could imagine blocking this behavior if it points to a site off the TLD and its subdomains. I'm hard pressed to figure out how they could prevent this; it's definitely a flaw in the trust model, but probably worth the trade-off.
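
The mechanism described above can be sketched as a toy model of one tab's back/forward list. This is a simulation for illustration only, not a browser API: the class, method names, and example URLs are all made up, and the only behaviour modelled is that a script-added entry sits in the list exactly like a page load does.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SessionHistory:
    """Toy model of one tab's back/forward list (hypothetical, not a browser API)."""
    entries: List[str] = field(default_factory=list)
    index: int = -1

    def navigate(self, url: str) -> None:
        """Full page load: drop any forward entries, then append the new URL."""
        del self.entries[self.index + 1:]
        self.entries.append(url)
        self.index += 1

    def push_state(self, url: str) -> None:
        """Script-added entry: same bookkeeping as navigate, but no page load."""
        self.navigate(url)

    def back(self) -> str:
        """Back button: step one entry towards the start of the list."""
        if self.index > 0:
            self.index -= 1
        return self.entries[self.index]


tab = SessionHistory()
tab.navigate("https://search.example/?q=bagels")  # results page
tab.navigate("https://shop.example/article-1")    # user clicks a result
tab.push_state("https://shop.example/article-2")  # infinite scroll updates the URL
print(tab.back())  # https://shop.example/article-1
print(tab.back())  # https://search.example/?q=bagels
```

In this model the first press of Back lands on an entry the site itself created, which is the convenience for single-page apps and also the hook the thread is worried about.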


> I could imagine blocking this behavior if it points to a site off the TLD and its subdomains

This would not address the vulnerability in the article.
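
One way to read that reply: history.pushState already refuses URLs from a different origin, and a page "under your control" (as a later comment puts it) sits on the attacker's own site anyway. The sketch below is a simplified same-origin comparison under stated assumptions: the URLs are invented, and default ports are not normalised the way a browser would.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def origin(url: str) -> Tuple[str, Optional[str], Optional[int]]:
    """Simplified origin: (scheme, host, explicit port). Default ports are not filled in."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def same_origin(a: str, b: str) -> bool:
    """True when both URLs share scheme, host, and explicit port."""
    return origin(a) == origin(b)


# Hypothetical URLs: the page the user landed on, a look-alike results
# page hosted on that same site, and the genuine results page elsewhere.
current = "https://shop.example/landing"
lookalike = "https://shop.example/fake-results?q=bagels"
genuine = "https://search.example/?q=bagels"

print(same_origin(current, lookalike))  # True
print(same_origin(current, genuine))    # False
```

Because the look-alike page passes the check, a rule that only blocks off-site history entries would leave it untouched.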


Hopefully, but even then, it's good that you are making people more aware of just how sketchy it can get.

Chrome will always have nasty exploits, because it's dealing with the flexibility of the world wide web. It's more important that we the users are aware of the tricks that attackers employ, rather than having clean solutions.

I don't trust that any software is secure, and to date that mindset hasn't burned me yet!


We need experiments like this.

Clinical studies try to address factors beyond whether the drug technically works: does it work in practice, coping with everyday things like patients having dementia, drinking, or having babies?

We have a flawed obsession with responsible disclosure (which we should mandate includes public disclosure). What we need is a framework for Software Studies that allows any nature of research, including in at-risk areas, and that research should answer to ethics committees and regulators, not to disclosure terms of service from the company likely to be put in a bad light.

We need an equivalent to ICH GxP. Drugs have to deal with all the same craziness as software, they're just centuries ahead at how to do it (although they still fail at public disclosure).

Was this study appropriate? Whilst Google corrupts the security integrity of the internet with its Ad and Analytics system, it shouldn't be complaining. For the rest of us, I think we need to pressure for regulation if you want to draw lines and look to the drug industry for inspiration. At the very least we need InfoSec Trials if not the whole suite of Software.


Did we not have enough evidence already that this is true??!


I'm willing to give you the benefit of the doubt and assume you were just unaware of how things are supposed to be done (reporting exploits to the vendors privately and waiting for the fix before going public), but man, you did a fantastically dangerous thing even if it was unintentional.

I'd never condone beating up on somebody on the internet, but I dearly hope you've learned a valuable lesson here. You've put lots of people in danger of being exploited. It's not about whether or not you'd do anything malicious with it, it's about all the other people who now can because Google doesn't have a fix out there yet.


This is the misconception I can't stand, where we hold individuals responsible for a product's or company's defect. I thoroughly disagree with the idea that it's his fault people are vulnerable.

So-called responsible disclosure is just a marketing spin term. Disclosing bugs privately is a favour, not a responsibility. All this does is reduce the risk of bad software decisions. It doesn't solve anything.

How about the free market instead? If you run a multi-billion dollar company that can be hurt by issues like this, then it's on you to make it more profitable to disclose issues privately. If you can't or refuse to do that, then you're exposing your company and your customers to risk. Enough with the shunning and the "responsibility" of individuals who expose bugs.


I sympathize with THIS position. It’s the same blame shifting crap when “identity theft” becomes your fault, even though any cashier or clerk can “steal your identity”.

What this marketing spin does is give cover to those who design badly secured systems.

http://www.youtube.com/watch?v=CS9ptA3Ya9E

Also similar is the “jaywalking” idea, made by car manufacturers to make the default right of way to cars!

http://amp.charlotteobserver.com/opinion/op-ed/article650322...


[flagged]


Google? When I brought a serious issue up in 2012 https://dejanseo.com.au/hijack/ Google never fixed it:

In summary, I can take any of your (or anyone else's) content, pass more PageRank to it than the original page has, and then I become the original page. Not only that, but all your inbound links now count towards my site and I can see your links in the Search Console of my domain.

This is something link graph theory refers to as "link inversion" and is very harmful to smaller publishers.


I can't speak to that particular exploit, but no matter what you always go to the vendor privately first. Period. If they are uncooperative you can then go public. Not before.


I'm not sure how to respond to your comment (for the record I didn't downvote you). The free market point was obvious to me, but I'll elaborate.

When he chose to expose this bug, either he wasn't aware of an alternative (so-called responsible private disclosure) or that alternative just wasn't appealing enough. Since we're dealing with a company that generates income (indirectly) through the product, they risk financial consequences from this sort of exposure. It follows that doing more to incentivize and generate awareness of their disclosure policy would reduce their risk, which would have a financial impact. It's up to them to decide how much money / effort / resources to spend on reducing that risk.

My stance is that public shunning doesn't solve the problem of releasing buggy software. I'm actually a Google fanboy, but (to me) they could do better. Instead we get "The site is completely removed from their index without any notification." Maybe we need to elevate browser security to the level of Space Shuttle safety? Obviously that costs more and takes longer, slowing innovation, but IMO the market should determine that.

TL;DR: The idea that the individual is responsible for exposing a company's bugs is completely absurd to me. I'll respect you having a different opinion on it.


Please don't do this.

https://news.ycombinator.com/newsguidelines.html (see "idiotic")


I mean, Google has been told (over and over) for a long time that HTTPS doesn't fix trust on the web being broken, and that the back button shouldn't have an API. These are both well documented security problems. What has happened now is that Google is under public pressure and scrutiny to actually fix these things. A fire has been lit under their bum, and rightly so.


I believe they did remove the green lock from https sites to avoid implying trustworthiness. And removing the back button API is something Google can't decide on their own; it has to go through the standards process.


> I believe they did remove the green lock from https sites to avoid implying trustworthiness.

Nope. I'm on build 68.0.3440.106, the latest public stable build and as I'm writing this comment, little green lock and "Secure" right next to https://news.ycombinator.com.


It's a process. The green lock will eventually be removed, and instead of the "LOCK Secure" you see now, you'll see nothing, and http-only sites will be "Unsecure". This you can already see if you go to a non-https site like neverssl.com. There's a "Not Secure" banner in white.


Trying to be objective and understand my own motivations here. Obviously I didn't do anything out of malice. But yes, I could have told Google directly about the problem, but then I'd have no cool story to publish on my blog. At the end of the day, that's what it boils down to. Now that I got too much attention from it, I regret all of it.


"I could have told Google directly about the problem, but then I'd have no cool story to publish on my blog"

First of all, you definitely would. Standard practice is 1) report the bug privately, 2) wait for a fix, 3) get the go-ahead to publish your report and take credit publicly. That's how it always works; that's how security researchers build their reputations and careers. I guess you just weren't aware of that.

Second of all, even if you wouldn't get to publish it, that is horribly selfish reasoning. Putting millions of people at risk of having their information stolen for the sake of a popular blog post?


I fail to see how dejanseo put the people at risk. Exposing how a tool is dangerous and poorly conceived isn't the same as conceiving a dangerous tool.

In this case, Google put millions of people at risk, and dejanseo actually contributed saving them.


Right; but sometimes someone is the first to have an idea or realize a vulnerability, even if it seems trivial to them. Once it's public, novelty is no longer a factor, and it is a good idea to allow the vendor a chance to remove that vulnerability before the novelty is clearly eliminated. Obscurity does actually matter in the real world, even though it is a useless design principle.


That's right.

But while there are a lot of domains where I don't accept the reasoning "someone else must have thought about this before", finding vulnerabilities is one where I can't help but believe that every publicly disclosed vuln has probably been secretly exploited and sold for years.

(The only data point I have behind that is that there are nation-level agencies pretty much dedicated to finding those, and they've gotten really good at this (cf. Stuxnet!).)

So, while by conviction only, I highly doubt any independent white/gray hat vuln finder will ever be the first to find it, and I applaud any kind of disclosure.


Yes, the reveal is required. But it doesn’t have to be without the vendor’s knowledge. The rush to get it out without allowing the vendor to respond is unjustified and reckless. The TLAs using the vuln are keeping it a secret, after all, and the script kiddies enjoy public trashing of people, which I think is worse than the TLAs’ careful abuse.


FWIW, "the padlock is enough" is quite the opposite of Google's position:

https://blog.chromium.org/2018/05/evolving-chromes-security-...

and in fact one of the main reasons is that use of HTTPS is far too little information for the browser to affirmatively indicate "This site is secure and trustworthy." So they are planning to get rid of the padlock. (Use of HTTP is enough for the browser to affirmatively say it's insecure, though.)

So I think Google understands that one of the consequences of pervasive HTTPS is that the padlock is at best meaningless and at worst misleading, as we saw here.


I can agree firsthand, people think a site, any site, is safe because of the padlock... :facepalm:


So you are implying that HTTPS made this attack easier or more impactful?

I don't buy it. This same attack would work the same with or without HTTPS having existed, and the only reason it wouldn't work as well in practice is because HTTPS is a baseline of security.

It's like saying that airbags cause people to trust unsafe cars. An HTTP only site is a red flag now, but HTTPS just means it won't be instantly considered untrustworthy.

HTTPS has massive benefits, and Google is already starting to "deprecate" the green padlock (IIRC they have plans for HTTPS to be "normal" with no green padlock and HTTP to be marked as "unsafe").


There is a similar debate about making bicycle helmets mandatory. [1] One problem is basically that when cyclists wear helmets, they and the drivers around them might think that smaller safety margins are acceptable. (Both physically, as drivers pass closer to them, and in that cyclists are, e.g., more likely to ride at unsafe speeds.)

I think the argument here is the same: the green padlock makes people feel too safe. I could easily buy an argument that if HTTPS was not highlighted prominently as a SAFE thing by the browser, people would pay more attention to other indicators such as the domain when browsing the internet.

[1] https://discerningcyclist.com/2018/05/mandatory-bicycle-helm...


But as your parent pointed out we _already know_ that padlocks for HTTPS are the wrong UI here. The goal is to get to the right UI, which you can only do after getting to very high HTTPS usage rates, which we've been working on for several years already.

Tim's toy hypertext system from last century doesn't have confidentiality or integrity at all and the authentication mechanisms are garbage (which is why nobody uses them). So adding these necessary features has been a retro-fit for the past 20 years or so, and unfortunately the original attempt at the retro-fit was done by people who knew nothing about security UX. Which is understandable, this was the era when people thought PGP was usable.

So, we have to get from this cul-de-sac we were in 10+ years ago, to the correct approach, which means some U-turns and all the major browser vendors are more or less on board with that. The padlock will go away (at least from the main UI) as part of the journey, but it hasn't gone away yet because we're not finished. Notice that even going as slowly as we have, every time there's an incremental move Hacker News is full of people screaming about how awful this is, they can't be expected to handle this pace of change...


It does indeed look like Andersen's story "The Emperor's New Clothes" [https://en.m.wikipedia.org/wiki/The_Emperor%27s_New_Clothes].


Agreed. Forget a huge, third party like Google, many times (at least in my experience) even our own bosses in small companies wouldn't listen. And even if they did listen, they wouldn't act. Unless of course, it is proven with data and it is big enough to cause them a headache. I'd guess mostly it is due to laziness and not any malicious intent.


>‘That isn’t enough. The padlock on the https page gives users a false sense of security.’

>‘We don’t agree with that. Where’s your data?’

Where is your source that this is Google's position? Considering they have some of the best security employees in the business, I find that hard to believe.


Allowing sites to intercept browser actions that should make a user leave the site, and inject other operations is obviously and plainly a security issue.

I reported this to google several years ago, and it was never addressed.


Can't you do the same thing without JavaScript, by having the web page go through a brief redirect so the back button takes you to the redirect?

And if so, how do you solve this? Ban server-side redirects? Make the Google SERP iframe all sites it takes you to? I agree this is a problem but I have no idea how to solve it in a way that's not worse.


Maybe if you visit a page for only two tenths of a second, the back button skips it.

It won't fix the whole problem but it's a start.
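
The suggestion above can be sketched as a rule for choosing the Back target. This is only an illustration of the commenter's idea, not how any browser is known to behave; the `Visit` type, the function name, and the example URLs are hypothetical, and the 0.2 second threshold is the "two tenths of a second" from the comment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Visit:
    url: str
    dwell_seconds: float  # how long this entry stayed on screen


def back_target(history: List[Visit], min_dwell: float = 0.2) -> str:
    """Choose where Back goes from the last entry in `history`.

    Walks backwards from the entry before the current one and skips
    anything shown for less than `min_dwell` seconds. If every earlier
    entry was that brief, falls back to the oldest one.
    """
    previous = history[:-1]
    for visit in reversed(previous):
        if visit.dwell_seconds >= min_dwell:
            return visit.url
    return previous[0].url if previous else history[-1].url


history = [
    Visit("https://search.example/?q=bagels", 12.0),
    Visit("https://shop.example/bounce", 0.05),   # brief redirect hop
    Visit("https://shop.example/landing", 30.0),  # current page
]
print(back_target(history))  # https://search.example/?q=bagels
```

A hop that lingers just past the threshold would still be kept, which fits the caveat that this would not fix the whole problem.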


>Allowing sites to intercept browser actions that should make a user leave the site, and inject other operations is obviously and plainly a security issue.

Sure, I agree. But what does it have to do with the parent comment's claim?

I've read much of the discussions involving the early push for HTTPS, and the developers involved were very fastidious.


The point is that google should be penalizing sites that do that in search results, or at least in chrome with some kind of browser recognition - as the OP states in the article.


They should do more than penalize them in search. They should add them to their malicious site list.


Raising awareness is what he wants; being called a jerk is the price he has to pay. Defend +1.


FWIW I don't think he acted like a jerk at all. /$.02


>Google would say: ‘We know, that’s why we are pushing to move everyone to https.’

Am I presuming too much to think Google's primary motivation for pushing HTTPS is to protect their revenue model by preventing their ads from being replaced as opposed to simply being motivated by benevolence?


Google has shown time and again that they're open and enthusiastic about receiving properly reported bug reports which give them the chance to fix things before hitting the web. Usually that includes compensation. Why would you think this one would be any different? Maybe this guy just wasn't familiar with proper practice, in which case, well, what can you do. But it's extremely bad to go public with bugs without talking to the vendor first. How many sites might exploit this between the blog post going live and Google rolling out a fix?


Hi everyone! I did this. It was just a random cool idea I wanted to try. It worked a little too well and I quickly moved it to a disposable site to test if the page will get penalised by Google. I got busy with other things and forgot about it. When I bumped into it again I decided to write about it, for two reasons: 1) To me it's hard to believe that Chrome would allow for this to happen in the first place and 2) that Google wouldn't penalise a site doing this. Well, since the story was published Google tracked down my test page (most likely by using the source code I revealed on my blog) and completely de-indexed the whole domain.


You did good by publishing this. I've seen that you're updating your blog post based on what people are saying here, you don't have to do that, you don't have to answer to people attacking you on a forum.

What you have exposed has the potential to affect a large number of Google users and unfortunately the community has chosen to attack you over attacking Google.

Which could say a lot about the state of the community. So thanks again for bringing this vulnerability to our attention.


you don't have to answer to people attacking you on a forum

Maybe you don't feel like you have to, but I can tell you from experience, that when an entire community of your peers piles on to you, there is a significant emotional response that you're being rejected. That's just my personal experience, but it seems pretty common to want to respond when those you respect and work with (or might work with) respond negatively to your work.


It's sad that everyone is being so harsh to you just because you decided to post about a vulnerability that, who knows, thousands of other people are quietly exploiting for their own benefit. If anything, I am happy that instead of trying to misuse it or keeping it a secret you made it public knowledge so that something can be done about it.

Yes you could have handled it more appropriately and you probably will in the future too. I just don't understand the harsh attitude and all this legal nonsense and insults being hurled at you for no big reason.


Howdy, former Matasano pentester here.

FWIW, I would probably have done something similar to them before I'd worked in the security industry. It's an easy mistake to make, because it's one you make by default: intellectual curiosity doesn't absolve you from legal judgement, and people on the internet tend to flip out if you do something illegal and say anything but "You're right, I was mistaken. I've learned my lesson."

To the author: The reason you pattern-matched into the blackhat category instead of whitehat/grayhat (grayhat?) category is that in the security industry, whenever we discover a vuln, we PoC it and then write it up in the report and tell them immediately. The report typically includes background info, reproduction steps, and recommended actions. The whole thing is typically clinical and detached.

Most notably, the PoC is usually as simple as possible. alert(1) suffices to demonstrate XSS, for example, rather than implementing a fully-working cookie swipe. The latter is more fun, but the former is more impactful.

One interesting idea would've been to create a fake competitor -- e.g. "VirtualBagel: Just download your bagels and enjoy." Once it's ranking on Google, run this same experiment and see if you could rank higher.

That experiment would demonstrate two things: (1) the history vulnerability exists, and (2) it's possible for someone to clone a competitor and outrank them with this vulnerability, thereby raising it from sev:low to sev:hi.

So to be clear, the crux of the issue was running the exploit on a live site without their blessing.

But again, don't worry too much. I would have made similar errors without formal training. It's easy for everyone to say "Oh well it's obvious," but when you feel like you have good intent, it's not obvious at all.

I remind everyone that RTM once ran afoul of the law due to similar intellectual curiosity. (In fairness, his experiment exploded half the internet, but still.)


Thank you, I did mess up and wish I could take it back. To everyone bashing on me, I'm truly sorry to offend so many people. That was not the intention. This was purely as you describe it, intellectual curiosity.

I really appreciate your comment and hope it's OK that I added it here: https://dejanseo.com.au/competitor-hack/#shawn


The good news is, if you're ever interested in a career as a pentester, this is an excellent portfolio piece. :) (Really!)

Also, don't worry too much. I think everyone knows your heart was in the right place, and ultimately that counts for something.


Don't let it discourage you. It was a really cool finding. I've done everything right before when it comes to disclosing bugs, and I've still had people dumping on me.

You should consider security as a second career if you ever get bored with marketing.


> So to be clear, the crux of the issue was running the exploit on a live site without their blessing.

Well, he wasn't running it on someone else's site, right? All the code ran on his site, so at worst he was guilty of trademark infringement or — if he copy-pasted HTML or rendered the same text — copyright infringement (which he could have avoided by just being a proxy to them, I think).

Or did I miss something? It doesn't sound like he did anything to other sites themselves.


Ah, you’re right of course. I should have been more clear.

To the author: an alternate ending to this story could have been “competitor found out; flipped out; forwarded this to their legal department; your next two years are very unpleasant, even if the lawsuit ends up settled.”

That’s the main reason why you want to get permission and make everyone aware before doing this.

Here’s a small example: at Mtso a coworker had been running a netpen against a certain well known company. They managed to pivot into their network and eventually onto dev workstations. Last I heard, they were grepping through devs’ home dirs looking for admin keys and such, to see how far they could go.

The difference between that situation and this, is that at every single step of the way, Mtso was in constant contact with the target company and the higher ups knew exactly what was happening as it happened. The target company wanted to know how far we could get. After all, that’s what they were paying for.

(Red teaming is even cooler — it’s that, but breaking into buildings.)

But when you’re an outsider, you don’t have any institutional protection. So it’s doubly important to follow standard procedures (see Hacker One for examples).

I thought of a rule of thumb: if you’re getting information from a PoC that might benefit you / your business, it’s not merely a security PoC anymore. It’s an active exploit that you’re benefiting from.

But again, it’s an easy mistake to make without thinking carefully.


For those of us who aren't familiar with the story, the RTM exploding the internet reference is this:

https://en.wikipedia.org/wiki/Morris_worm


Interesting... I reported a variation of this issue to Google back in 2015 and they said they weren't "concerned about the premise of the attack in the bug description. You can always make the back button go to a page under your control by doing a second navigation, e.g., with pushState".


> But again, don't worry too much. I would have made similar errors without formal training.

Do you have any idea how patronizing your tone is?


Nope!

(I meant formal security training, FWIW. Also I know that feeling of "Oh boy, I just pissed off the internet, didn't I?" and wanted to remind him it'll blow over soon. It's not a huge deal, and he'll come out of it with +reputation.)


Back button hijacking has been known for ages. This isn't increasing anybody's security posture. There might be a bit more slack if this was actually new.


As a person who has wasted a lot of time trying to convince Google that a vulnerability is worth fixing, I have no sympathy for them finding out about a vulnerability via a public disclosure like this. They probably would have spent weeks/months failing to understand the implications of the vulnerability only to have the report closed with an auto generated response about phishing not being considered a vulnerability. Keep thinking like an attacker and sharing your findings. It is the best way we can make software more secure.


I don’t think anyone is objecting to what you did as much as how you did it, and how you seem to be proud of flagrantly abusing your ability to duplicate other people’s intellectual property. I’m hardly a champion of copyright laws or IP in general, but running duplicates of someone else’s site feels completely wrong to me without thinking twice. As with the suggestion from the pen tester here, which you posted on your blog, this would be a lot different if you had written the article about conduct that seemed professional, respectful and legal.


How is it different from archive.org snapshots from an IP perspective?


Great question. How about we invert that, and you tell me what IP laws justify operating a functioning duplicate of someone else’s entire website, full of copyrighted and trademarked content, for the benefit of your business?

By this logic, I could duplicate any website in the world and operate a copy for my private business. While I am not a lawyer, it seems clear that this is not legal (and as if this is the first time the concept occurred to someone!)

I assume archive.org falls under Fair Use. Check these guidelines.

https://tinytake.com/screen-capture-copyright-violation-or-f...

Duplicating your competitors website for analysis to benefit your business fails the first condition. If it were academic research or some sort of public benefit, that’s different than for-profit republishing for your SEO business.


Is Chrome the only browser this trick worked on?


Copying someone else's site and tricking their users into using your copy is a copyright violation and fraud. Nothing cool about it.


Copyright violation? You're literally just "archiving" their website. Exactly the same as Google are doing themselves.


No, you're not just "archiving". Besides the point that archiving itself is already in a legal grey zone, at the very least it has the defence that it presents the website unmodified, in exactly the same state for no other purpose than showing the web as it used to be. Like file-sharing websites, archive websites rely on the fact that it's an automated process and they can continue to host anything until they get a DMCA takedown. Not to mention organisations like Archive.org are literally run by librarians which gives their argument of preservation a lot more weight.

When you're stealing assets and adding your own tracking code, you're transforming the work, which is a definite no-no for copyright and trademark law. Not to mention that by intercepting traffic which was meant for a competitor you're literally interfering with their business and risk fraud charges.


No, you're using their content to gain financially, and in this instance, at their expense. And that's putting aside all the other possible counter-arguments, of which there are many.

I'm no fan of long copyrights, etc., but in this case to me it's a clear cut case.


It's a POC with no intention other than seeing if it would be possible, isn't it?


Except for the part where he said he moved the code to another site five years ago where it has been running since and even ranks highly for some search queries? Unless I’m misreading that paragraph.


While that might mean that it's OK ethically (I'm not sure either way), that doesn't make a difference legally.

If you go and pick the lock of a random house in your city and get caught by the police, I very much doubt that the defence "I was just doing it to see if I could" is going to help you.


If you didn't steal anything, what would the charge be?


If you get caught while doing it you would likely be charged with attempted burglary. It's up to you to convince jury/judge that you didn't intend to steal.

If you only get caught after leaving the premises it is trespassing, since it's apparent you didn't steal. Picking a lock in order to trespass might make the sentence a bit harsher than normal.


So anyone can just come walk around inside your house without your permission, and you think it’s legal and no problem as long as they don’t take anything? I could see that being the perspective in another culture but it certainly isn’t how the US works.


> you think it’s legal and no problem as long as they don’t take anything

Not only that, they can move in!

Here in Belgium a young couple left the country to do volunteering work only to hear from friends back home that gypsies had squatted their house. Official reaction of the mayor of Ghent was "I can't do anything about it ... it's complicated"

Obviously breaking & entering is a crime but if you're "living" there, only the courts can kick you out after following all the necessary legal steps.

UK has (had) similar squatting laws but afaik those were mainly (ab)used in the 90s to throw parties in abandoned warehouses.


The UK has a lot more defences if your _home_ gets squatted. The rationale is that now we're considering two parties who both want to live somewhere, and so the legitimate owner/occupier wins. Where squatters move into somewhere empty the court has to weigh up on the one hand the property rights of the owner who left it empty but on the other the squatters' desire to have a home. So these are unequal rights and the squatters may win under some circumstances.

The antidote is desirable for a community. If you don't want squatters in a building you never live in, let somebody else live there instead. Now if it comes to it (which it probably won't) any squatters will lose. Lots of places that somebody owns and might otherwise stay empty have people living in them for very little rent for this reason. If you've got a good reputation, don't care where you live, and don't mind potentially having to leave on very short notice when the real owner wants it back, you can get very, very cheap rent in crazy buildings because of this. People live in unused lighthouses, buildings that used to be part of defence systems, big factories, all sorts of stuff.


There's still valid reasons to keep an empty property though.

Maybe I don't have the money to provide safe electrical / water / heating / fire safety systems. But I also don't want a tribe of homeless people in there.

I also know someone who's kept a property empty for 10 years. He lived there together with his wife, she passed away, he moved out and never had the courage to move out all her stuff.


Breaking and entering or trespassing at the very least.


Breaking and entering. Trespass.


Not the first time either. Every now and again I get an interesting idea, test it and share it with the world. The test that was left forgotten had no commercial impact on anyone and very low traffic.


You're right, he did the ethically right thing and informed the sites he’s spoofing, and informed the users he tricked. And he only ran it for a limited period to prove it was possible... Oh no wait, he did none of those things. This is not a POC, it’s just a guy running an exploit for five years who thought he did nothing wrong because “If I shouldn’t be allowed entry, they should have used a better lock!”


It's also a big trademark violation, right?


I would say copyright violation.


Yeah, copying the content is definitely copyright violation. But I meant to say that by hosting these sites, the developer could also get sued for attempting to conduct business under the trade name of another entity. And that includes, in particular, hosting that fake Google SERP.


What's not cool at all is the fact exposed here that Google lets anyone trick their users.


Your statement is far too broad and lacks context. Where is this a violation and where is it considered fraud? There must be some countries where this isn't the case or at least where the article and non-commercial use of the technique are considered to be mitigating circumstances.

Also, who doesn't find it cool? You don't seem to be saying that what is described in the article isn't cool, you seem to be making a broader claim that copyright violation and fraud aren't cool.

Let's assume you find what is described in this article copyright violation and fraud, because after all, you said it is. Apparently some people on HN find what the author has done cool, judging by the comments. Ergo, some things that you, specifically, consider 'copyright violation and fraud' are in fact cool.


He copied Google's SERP page, AND copied all of his competitors' websites. That's definitely copyright infringement, you'd be livid if you were a competitor, and as a user you'd be pretty annoyed.

It's still an interesting hack, so good to see it being talked about. But it is not ethical and definitely illegal in almost any jurisdiction.


Copyright infringement is a civil case in almost any jurisdiction, not a criminal case.

The USA is a notable exception, perhaps due to the vested interests with deep pockets.


A civil case you would lose though.


Just to be clear: you do not endorse copyright infringement for the sake of being cool, do you?


I believe much of modern copyright law in Europe but especially the US is broken, but in general I don't find crime cool, nor do I find 'cool' a justification in itself to do certain things. Not all laws are sacred though.

I was merely reacting to the broad nature of the claims in the parent comment. There is a world beyond the US and Europe, laws are not universal truths, they are a representation of what we have come to agree upon as rules to play by. In copyright law specifically though there is often a chasm between what the people find good rules and what companies find good rules. But that is a different discussion.


I guess you don't put locks on your home because you have a 'dont come in' sign on the door right?


I'm curious, how did you generate your content for the spoof SERP page? Was it dynamic to somehow reflect the content of the user's original SERP page (which could be subject to the user's location, browsing history and other factors in G's algorithm)?


Thanks for publishing this. I guess that's not said enough with all of the butthurt people here.


[flagged]


Where did I say I'm proud of this? Everyone keeps saying "proud". I chose to share it in public because it's a serious problem that others may be using it to do real harm. I blog about many things, most harmless and often very useful. I remember one other time when I exposed something broken in Google. I got penalised as a reward.


I think in this situation it would be best to admit that it was improper behavior. You can agree that you should have either

- used your own site

- or someone that explicitly agreed to run this experiment.

Then you can go on that you regret your wrong approach in this case, you will do better next time and finally point out that very little damage was done, which you regret nonetheless.

Then we all move on,

- agree that it is an interesting hack

- and the web browser is a terrible platform security wise.


Why? Why are you siding with the big corporations?


Am I? Which big corporation? dejanseo? google?

I'm siding with dejanseo (the user) because I screwed up myself before. And I will possibly do it again. I see some recklessness but not malicious intent.

This whole branch got flagged away anyway.


> I chose to share it in public because it's a serious problem that others may be using it to do real harm.

There is a process called responsible disclosure. Next time you find a serious problem, you should probably try to follow that.

Also, Google has a Vulnerability Reward Program, so if you report your findings directly to Google, you can even get money as a reward.


You are right, you never used the word “proud”. You also did not use the words “problem” or “harm” in the post. So the “pride” thing is mostly tone, subtext or between the lines if you will. This is just my opinion so YMMV


As long as it is subtext and tone you can claim anything regarding another person's character and they have no recourse to argue against you, because it is all in your mind.

Well done.


Let's sum it up - you revealed a bug, and eventually reported it. Good. Showed some technical tricks and a creative approach. Thank you.

Bad - amoral and most likely illegal theft of copyrighted content. "Just for fun" ain't gonna cut it. You hurt real businesses, probably because you don't give a f*ck about them, fun is more important.

Is it hard to see that this would stir some controversy, to say the least?

Btw calling this "random cool idea" seems like you are proud of this and want some appreciation, hence sharing. If you would be concerned about security, you would share this bug immediately, which is definitely what you didn't do according to your own words.

Things can look significantly different from the other side. You know, the side of the rest of the world.


Surprisingly few comments about the actual attack mechanism here. IMO discussion of whether the author's PoC was ethical is interesting but far less important than the question about how to handle the actual vulnerability; this kind of attack could be used for far more damaging things than just recording user behavior. (Such as phishing.)

IMO "get rid of the browser history API" (as the article author recommends) isn't the right solution. The history API is important, as it's the only way to make the back button work as expected in single-page applications, or in multi-page applications that don't trigger a full page reload when you click a link. Rather, I'd suggest the following mitigations:

1. Require a user gesture for `History#pushState` and `History#replaceState`

2. Follow Firefox's example and highlight the most important part of the domain name in the browser UI

3. Don't label HTTPS sites as "Secure", as this can be misleading (Chrome's planning to do this starting next month https://blog.chromium.org/2018/05/evolving-chromes-security-... )

4. Give the back button a different icon when it's taking you to a different domain (maybe "Up" instead of "Back"?)

Any other ideas?


I think getting rid of the history API is really the only way to do this since it lets pages escape the browser chrome. No code from a page should affect the behavior of the chrome elements. It always ends up being used maliciously.


Changing the UI in case of a different domain is genius, would really help in enforcing the principle of least astonishment. However I don't think the up arrow symbol would work since it already has a meaning in traditional file browsers to indicate going up a directory.

I can suggest a back arrow behind a no way sign instead, but perhaps it should be something totally different.


Another possibility - if the referring page is a different domain, overriding back is ineffective (the browser just does a “real” back in these cases).


I don't understand. Are you suggesting that if you arrive at a site from Google, the history API should just not work?

For example, let's say a user arrives at a single-page application from Google, and clicks a link on that page to get more information. The site adds a history entry with pushState, but doesn't reload the entire page. Are you saying that in this case, when the user clicks back they should get sent back to Google instead of to the site's home page? If so, that seems like rather unexpected behavior. And if not, isn't the attack still viable?


Hmm touche, this wouldn’t fix much


No; you were right. The browser needs some AI to be smarter but this is all still possible.


To make 4. useful, links with the domain different than the open one should not be allowed to be added to History, otherwise you can bypass it with a different domain. And without 4. this new limitation could be bypassed with a redirect (from the same domain).


> links with the domain different than the open one should not be allowed to be added to History

This is already the case, and AFAIK it's always been this way.

From [the HTML standard for pushState][1]:

> Compare newURL to document's URL. If any component of these two URL records differ other than the path, query, and fragment components, then throw a "SecurityError" DOMException.

[1]: https://html.spec.whatwg.org/multipage/history.html#dom-hist...
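
For anyone who wants to see what that rule amounts to, here is a minimal sketch of the comparison using the standard `URL` class. `canPushState` is a made-up helper name, not a browser API, and it only approximates the quoted step rather than reproducing the full algorithm in the standard:

```javascript
// Sketch of the quoted check: a pushState target may differ from the
// document URL only in its path, query, and fragment.
// `canPushState` is a hypothetical helper, not part of any browser API.
function canPushState(documentUrl, newUrl) {
  const doc = new URL(documentUrl);
  // Relative URLs are resolved against the document URL.
  const target = new URL(newUrl, doc);
  return (
    target.protocol === doc.protocol &&
    target.username === doc.username &&
    target.password === doc.password &&
    target.host === doc.host // hostname plus port
  );
}

console.log(canPushState("https://example.com/a", "/b?x=1#top")); // true
console.log(canPushState("https://example.com/a", "https://google.com/search")); // false
```

So a page can rewrite its own path as much as it likes, but it cannot put another site's host in the address bar this way.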


You are right that they cannot be added to History, but the code used here changes the back button functionality with

  $(window).on('popstate', function() {
    window.location.href = 'https://example.com';
  });

I just tested it and it works with a different domain in the latest Firefox.


Fair point; popstate allows you to do pretty much anything when the history entry is for the current domain.

That's not really an issue for this particular attack though, which relies on the reverse scenario: the user remaining on the current domain when they expected to navigate back to the third party search engine.


I think your fourth point is brilliant. You’d instantly gain context just by looking.


Those are great ideas. Would love to see those implemented in Chrome.


This is an interesting yet disturbing case of blackhat SEO and phishing, where the site owner hijacks the back button and sends visitors to fake sites where he can observe their behaviour.

FTA:

Here’s what I did:

1. User lands on my page (referrer: google)

2. When they hit “back” button in Chrome, JS sends them to my copy of SERP

3. Click on any competitor takes them to my mirror of competitor’s site (noindex)

4. Now I generate heatmaps, scrollmaps, records screen interactions and typing.


I'm curious how many visitors did this. In my very limited sample set of myself and friends / work colleagues, we all use middle click to open a result in a new tab.


The vast majority of regular users I've seen go back and forth between search and search results. Heck, I do it from time to time. Most users are extremely "inefficient" by geek standards.


Interesting. Thanks for that. Sometimes we can get so caught up in our own tech-bubble that we don't really notice the usage patterns of the average user. So it's often nice to have a reminder like this.


> Most users are extremely "inefficient" by geek standards.

How is that inefficient? I use both interchangeably and I don't see how it's any less efficient than opening a new tab and then having to close it if it's not what you want, or having to close useless tabs if the first one is all you need... On my Macbook, I just swipe right and I'm back at the search results.


For a lot of things it helps to compare things in parallel. With this technique you only see 1 result at a time.


You assume you're back at the search results. As OP proved that's not necessarily the case.


I'm not sure my parents are even aware of the existence of tabs, to be honest.


Yet another reason to browse with JS disabled by default.


That's a reasonable course of action until you need to use the internet for pretty much anything.


Experience teaches that that is a vastly exaggerated statement. There remains quite a lot of the World Wide Web that does not require Javascript.

And of course it is pretty much not required at all for using the Internet outwith the World Wide Web.


And then classic React enters the building


The Internet works just fine without JavaScript: DNS, FTP, SSH, SMTP, NNTP — none of them have ever required JavaScript. Indeed, HTTP works just fine without JavaScript. HTTP pages perform better without it.

Granted, many broken and ill-programmed HTTP pages aren't useful without JavaScript. That's not an indication of how useful it is, but rather an indication of how poorly-skilled those webmasters are.

Then there are web apps; they indeed don't work properly without JavaScript. Fortunately, there just aren't that many important web apps. To be honest, I can't think of one web app that I regularly use, other than Google Meet.


Your distinction between web pages and web apps is entirely arbitrary. Many web pages use JS in such a way that interacting with them without JS is a lesser, if not broken, experience.


Example: Is reddit a webpage or a web app? Correct - it's both.


Works pretty well for me.


Yeah I know, personally I prefer to use paper, way safer. I plan to just do the full TCP request by hand using the ethernet wire, morse-code-like, next. /s

Sure, increasing functionality increases the risk, but that doesn't mean the risk isn't worth it or can't be mitigated. Worst case, he fakes your back button... it's not that bad, seriously. Google will probably test back buttons and similar situations in their engine now and deal with these cases one by one.


really? That seems extreme.

Might as well say:

"Yet another reason to not browse anything on the web"


It's a shame you're being downvoted, you're entirely correct.

Most of the modern web is unusable with javascript disabled.


>Most of the modern web is unusable with javascript disabled

This isn't wrong - but it assumes most of the modern web is worth using. Most of the modern web isn't worth browsing, and every site I've ever come across that is worth reading works just fine without Javascript. I'll continue to browse the internet with Javascript disabled-by-default. It's a surprisingly good filter.

With that being said - while this is "another reason" it is as minor of a reason as it gets...


> assumes most of the modern web is worth using

But it is, are you really going to ignore reading some huge breakthrough in physics because the site uses react? Also in many situations there's absolutely no other choice. Government sites, e-stores, banking.

And then there's the buildup of recorded urls. Private browsing is somewhat less useful when your script blocker whitelist is full of porn sites.

I use it for security in specific browsers but happily admit it's not an actual solution for normal people. Adding another 3 clicks, then another 2 for the inline JavaScript contained within after reload makes the internet incredibly annoying to use.


> Adding another 3 clicks, then another 2 for the inline JavaScript contained within after reload makes the internet incredibly annoying to use.

Yes, it is annoying. It reminds me each time how annoying websites are which use Javascript for things which could be done without. And it lets me search for alternatives or just abandon such websites.


> how annoying websites are which use Javascript for things which could be done without

A good example of sites which use JavaScript for things they don’t really need are those GP mentions: ‘government sites, e-stores, banking.’

Government sites: the vast majority of government sites are simply informative text. There’s absolutely no need for me to grant the government permission execute code on my computer (which is what JavaScript does) in order to read the minutes of the latest council meeting. Even when interactivity is needed (e.g. an online tax-payment system), HTML forms (the sort we’ve had for over two decades) are a perfectly good solution for ‘enter information in a box and submit it.’ JavaScript can definitely lead to more attractive, more usable solutions — but it’s completely optional. Government sites are a great example of something which should work for anyone, even someone using an old BeOS box on the other end of a modem connexion running over a bit of wet string.

E-stores: there’s simply no need for JavaScript to display pictures & descriptive text of goods in an attractive fashion. There’s simply no need for JavaScript to give me a form to enter my credit card information & mailing address. Again, JavaScript can make the experience better, but it is also a privacy and security risk. I seem to recall that Amazon made quite a lot of money before JavaScript was a thing; I imagine it could continue to do so.

Banking: there’s no need for my bank to execute code on my computer to send me a statement of my accounts, nor to give me a form to pay bills or send money. Indeed, in my experience JavaScript just makes things worse, because instead of downloading a single HTML document from my bank’s servers I get to download dozens of trackers and bugs, as well as the code necessary to hit multiple APIs and stitch the page together out of its parts on my own desktop.

I think I read something yesterday, here or elsewhere, about how client-side JavaScript really took off at the same time as server-side Ruby was a big thing, with the implication that the reason was that Ruby was so slow that websites had to offload as much computation as possible. I don’t know, now, if that was actually the case, but I do know that it’s 2018 and my desktop experience is slower than it was in 1998, thanks to JavaScript.


JavaScript is popular for the same reasons Flash was popular.

a) It objectively can make web pages more usable and convenient.

b) The fancy animations and other effects make marketers and managers happy.

c) You can use it to build interactive games, which many users like.


2FA authorization needs JS if you want to use it conveniently, otherwise you would have to refresh all the time


> 2FA authorization needs JS if you want to use it conveniently, otherwise you would have to refresh all the time

How do you mean? IME 2FA works via a CLI utility or mobile app, and JavaScript doesn’t enter into it at all.


Many modern web pages work better with Javascript disabled now, because it avoids the GDPR/cookie popup spam.


fun fact

Not browsing the web at all, also avoids the GDPR/cookie popup spam.

That solution works 100% of the time. It blocks out 100% of the GDPR popup spam.


The thing that concerns me is it's easy to wrap a JS API or unreference it if I know it's abused for ads/tracking. I don't know of good ways to go about tracking based on pointer events or scrolling.

I'd put more research into generating fake events or limiting them for an untrusted site, so mouse behavior can't be used for "where are they looking" analysis.
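
To make the "wrap a JS API" part concrete, a minimal sketch of the idea follows. `guardMethod` and the `fakeHistory` stand-in object are hypothetical names for illustration; a real content blocker would have to apply this to the page's own `history` object before any page script runs, which is not shown here:

```javascript
// Replace obj[name] with a wrapper that consults `allow` before delegating.
// `guardMethod` and `fakeHistory` are hypothetical, for illustration only.
function guardMethod(obj, name, allow) {
  const original = obj[name];
  obj[name] = function (...args) {
    if (!allow(args)) return undefined; // silently drop the call
    return original.apply(this, args);
  };
  // Hand back a function that restores the original method.
  return () => {
    obj[name] = original;
  };
}

const calls = [];
const fakeHistory = {
  pushState(state, title, url) {
    calls.push(url);
  },
};

let trusted = false;
const restore = guardMethod(fakeHistory, "pushState", () => trusted);

fakeHistory.pushState(null, "", "/blocked"); // dropped while untrusted
trusted = true;
fakeHistory.pushState(null, "", "/allowed"); // forwarded to the original
console.log(calls); // only "/allowed" was recorded
restore(); // put the original method back
```

Pointer and scroll events don't have a single method to wrap like this, which is why they are the harder case.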


I wish... even GitHub won't work properly without JavaScript enabled these days.


IME, most parts of GitHub work fine without JS enabled. (Though sometimes I have to disable CSS to get my hands on some forms…)

This is unlike major competitors (GitLab, Bitbucket), which are completely broken.


Hello everyone, GitLabber here! We had a similar issue about this [1], and we raised another one when deciding to further clarify our documentation regarding this question [2]. You can find out more about our motives behind this decision there.

[1] - https://gitlab.com/gitlab-org/gitlab-ce/issues/36754

[2] - https://gitlab.com/gitlab-org/gitlab-ce/issues/43436


Ironically, you have to enable JS to see comments for these issues…


Not really ironic, it tells you what the answer in those comments was


Completely disabling Javascript isn't feasible, but you can use uMatrix or NoScript with all scripts blocked by default, and only whitelist ones on sites/domains that you trust.


Exactly. I've been browsing script-free for twenty years. It used to be a real pain switching on and off, until Opera introduced easy by-site settings. But then, back in the day most sites were perfectly serviceable without any scripting whatsoever. And these days, uMatrix makes it all a snap.


In that case, just not using a useless thing like the "back" button will be perfectly fine.

A good website doesn't need this button.

And to navigate between websites, using a tab for each website is fine. Especially when comparing results from Google.


... Don't use the back button? What? I actually kind of like the ability to move between pages and domains with the back button.


A good website gives you the ability to move without this button.

It's like Android VS iOS.

The first one has a back button, the other doesn't.


> A good website give you the ability to move without this button.

I disagree, fairly strongly. Re-implementing behavior that the user already has in their client is at best superfluous, and at worst very confusing.

> It's like Android VS iOS. The first one has a back button, the other don't.

iOS apps implement history as part of their UI because iOS doesn't have provision for one in the default UI. This is changing, as gestures become more and more common. I don't think I've actually used a "back button" on my iPhone in months. I pretty much take its presence as communicating that it's possible to go back, not as a means to do so.


I agree. User experience is so overrated these days, right?


> noindex

So, a browser extension indicating (with big red fonts) that this site is noindex could be the simplest solution?

For non-power users who don't know about any extensions, that would not be so easy though. If that function appeared in Chrome enabled by default, that would raise questions about Google's motives, obviously.


I think noindex is nice to have but not necessary for this trick.

The only solution is to fix the back-button bug/vulnerability in Chrome.


It doesn’t seem like there is a fix, short of removing the history API.


Maybe restrict the history API to the same-origin-policy? Javascript could/should be allowed to manipulate browser-history only for the same domain. Just an idea.


That’s already the case! The history API only supports the current origin. From MDN:

> The new URL does not need to be absolute; if it's relative, it's resolved relative to the current URL. The new URL must be of the same origin as the current URL; otherwise, pushState() will throw an exception. This parameter is optional; if it isn't specified, it's set to the document's current URL.

https://developer.mozilla.org/en-US/docs/Web/API/History_API...

The exploit in this article clones the appearance of Google results and competitor websites but leaves the user on the exploiter’s domain, so users who are savvy enough to notice the URL wouldn’t be fooled.


Why should anything be able to change the behaviour of the back button? If I click back it should take me back to the previous URL. If it breaks your one page 200MB JavaScript masterpiece then tough luck, come up with your own navigation.


Suppose we did what you said, and the back button only ever took you back to the previous URL.

I could still make a JS app that, on your first interaction with the page, moved you forward from https://example.com/ to https://example.com/#home. Then it sets a variable such that when you go back to https://example.com/ it shows a fake SERP. This is not an easy problem to solve.
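
A toy model of a tab's history stack shows why. `SessionHistory` is a made-up class that only mimics the ordering of entries, not real browser behaviour:

```javascript
// Toy model of a tab's session history; `SessionHistory` is hypothetical and
// only tracks the order of entries.
class SessionHistory {
  constructor(firstUrl) {
    this.entries = [firstUrl];
    this.index = 0;
  }

  // Navigating (or adding an entry) drops any forward entries and appends.
  push(url) {
    this.entries = this.entries.slice(0, this.index + 1);
    this.entries.push(url);
    this.index += 1;
  }

  // Back moves one entry down the stack and returns the URL now shown.
  back() {
    if (this.index > 0) this.index -= 1;
    return this.entries[this.index];
  }
}

const tab = new SessionHistory("https://www.google.com/search?q=example");
tab.push("https://example.com/"); // user clicks a search result
tab.push("https://example.com/#home"); // page adds an entry on first interaction
console.log(tab.back()); // https://example.com/
console.log(tab.back()); // https://www.google.com/search?q=example
```

The first press of back lands on https://example.com/, which the site still controls, so it can render whatever it wants there. Only the second press reaches the real search page.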


This is a redirect, and would be trivial to detect and override at the browser level.

Actually, the back button should auto negate redirect pages


It just changes `window.location`.

And it would be very limited if JS can't change `window.location` to outside its current domain.


This exploit uses the history API, which allows JavaScript to change the URL in the browser URL bar to another URL with the same origin without actually causing a new full page request. The same-origin policy has always been in place, because it would obviously be a huge vulnerability to allow any web page to pretend to be a different website.

Changing window.location is different: it allows you to change the browser URL bar to any URL (including google.com, etc.), but it actually causes the browser to do a normal page load of the new URL, just like if the user had clicked a link to the new URL. Thus there is no spoofing vulnerability exposed by the window.location feature.


That's not even a solution.

User clicks on your site. You redirect to a fake search page and then redirect to your page after setting a cookie. Now back button sends them to the fake search results.


Independent of anything else, allowing the back button to take you back to a page that redirected you previously is bad UI. It is almost never the desired behavior.


Back button is not the only way to end up at noindex site.


? Sure, you can just call the URL directly in browser. Which other way do you mean?

The problem is not to end up at "noindex site" (btw: noindex is not a necessary part of this scheme). The problem is to end up at "noindex site" thinking that the "noindex site" is a competitor site. And I don't see how such deception is possible without the backbutton-bug.


> The problem is to end up at "noindex site" thinking that the "noindex site" is a competitor site

And there is no solution for that, it seems. The solution for the problem <red in address line>'hey, that's noindex site!'</red> is obvious and simple.

> And I don't see how such deception is possible without the backbutton-bug

You told it yourself - 'you can just call the URL directly in browser'. And there are many ways and scenarios how that clicking on a link could happen.


Somewhat related, Google AMP is also destroying the ability for users to trust URLs. In fact it’s kind of the inverse problem; the URL bar says google.com when the user expects to be on another website. I wouldn’t be surprised if observing the AMP pattern subconsciously made users less suspicious of the trick in OP.

It’s also a bit rich to see all the outrage here and deranking by google, since hijacking/proxying to sites in search results is exactly what AMP does.


As far as I know, site owners essentially have to opt-in to AMP by restricting themselves to a subset of not-exactly-standard web design methods (and may need to explicitly opt-in, I forget).

So I don't see a way to call AMP hijacking, since it's done with developer permission.


My blog was built when AMP was new and looked cool. It was done with my permission - but the content was not vetted by Google in any way.

People have told me they asked Google to take down my blog, because they thought it was hosted there.


Wouldn't be a hn thread without someone mentioning how AMP is literally the worst thing ever.


I don't understand why you would have been expected to report this to Google. It's not an issue or bug with Google, it's a simple gray hat social engineering trick.

People linking to fake sites as a dark pattern is nothing novel, you just did so to capture analytics instead of, say, installing a virus or taking someone's credentials. That said, you certainly could have done the latter and gotten views into your competitors' user portals. In my head that's not fundamentally different or more unethical from what you ended up doing.

I don't necessarily begrudge you for trying it, but I don't think it's for a noble reason nor do I think it was particularly innovative and the end result is Google doing something unsurprising.


The expectation isn't to report to Google. The expectation is to not do this on live sites affecting real people.


The reference is to text in the article, not comments from HN.

From the second paragraph:

> Many are suggesting the right way is to approach Google directly with security flaws like this instead of writing about it publicly.


For context: Firefox greys out anything that is not the "real" domain, which remains black. So:

google.com.fakesite.io/foobar

becomes:

(grey "google.com.")(black "fakesite.io")(grey "/foobar")

This makes it at least a little more obvious you're not on Google.
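As a rough illustration of that greying-out, here is a minimal Python sketch. It is not Firefox's actual implementation: real browsers consult the Public Suffix List to find the registrable domain, whereas this sketch simply assumes it is the last two labels of the host (which would be wrong for suffixes such as .com.au). The function name `split_for_highlight` is made up for this example.

```python
from typing import Tuple
from urllib.parse import urlsplit


def split_for_highlight(url: str) -> Tuple[str, str, str]:
    """Split a URL into (dimmed host prefix, emphasised domain, dimmed rest)."""
    # urlsplit only recognises the host when a scheme or "//" is present.
    if "//" not in url:
        url = "//" + url
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    # ASSUMPTION: the registrable domain is the last two labels.
    # A real implementation would look this up in the Public Suffix List.
    registrable = ".".join(labels[-2:])
    prefix = host[: len(host) - len(registrable)]
    rest = parts.path
    if parts.query:
        rest += "?" + parts.query
    return prefix, registrable, rest


# The example from above: only "fakesite.io" would be shown in black.
print(split_for_highlight("google.com.fakesite.io/foobar"))
# ('google.com.', 'fakesite.io', '/foobar')
```

The point of the split is that the part a phisher controls freely (the subdomain prefix and the path) is visually separated from the part they actually had to register.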

Although that's still a tricky one for non-technical users to protect against. Aside from EV, I can't immediately think of anything else a browser could systematically do to guard against this, to be honest. Blacklists etc, but that's very unsatisfying.

It's a pretty old problem, to be fair. I remember almost being phished this way myself back on Myspace, were it not for Firefox's blacklists catching the form submission.

Domain names being little endian has been one of the most expensive web sec mistakes in history.



I never noticed this but now that you point it out it's really nifty!


> Domain names being little endian has been one of the most expensive web sec mistakes in history.

Can you clarify what you mean by this?


Presumably that authority works from right-to-left. .com, then domain, then subdomain. It would be easier to gauge trust if it were left-to-right.
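A small sketch of what "left-to-right" could look like, assuming nothing more than reversing the dot-separated labels so the most authoritative part comes first. The helper `to_big_endian` is hypothetical and only meant to make the idea concrete.

```python
def to_big_endian(host: str) -> str:
    """Reverse the labels of a hostname so the TLD comes first."""
    # Drop a trailing root dot and normalise case before splitting.
    labels = host.lower().rstrip(".").split(".")
    return ".".join(reversed(labels))


# The phishing host from earlier in the thread no longer starts with "google".
print(to_big_endian("google.com.fakesite.io"))  # io.fakesite.com.google
print(to_big_endian("www.google.com"))          # com.google.www
```

Read this way, the first thing a user sees is the part that determines who controls the name, rather than an arbitrary subdomain chosen by the site owner.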


Ah I see, thank you.


> Record actual sessions (mouse movement, clicks, typing)

> I gasped when I realised I can actually capture all form submissions and send them to my own email.

How many bad actors have been doing the same and for how long? This doesn't sound like something Google should just brush under the carpet and expect no one else is doing it. Although I wish the author had reached out to Google first to see how they would have handled it, I thank him for publishing it.


You're welcome.


The big issue here is: Who does our browser work for?

People worry that self-driving cars will take us to "promoted" coffee, if we're not specific. More generally, software agents as a rule are loyal to their creators, not to us. That we put up with this is absurd.

Browsers should be intelligent agents that are entirely loyal to the person browsing. For example, no site should be able to tell whether we see ads or not. As one site-by-site option, process the ads exactly as if they're reaching our senses, but don't actually render them so they reach our senses.

Not even having a back button loyal to us? That's obscene. Copyright infringement is the MacGuffin in this movie; the real story is that we're wusses for having totally lost this balance of power struggle in our personal software.


OSS is our best bet because the users can be the creators. Firefox is a mixed bag on this.

What I want is like the equivalent of fiduciary duty [0] but for AI and software. This is why I don’t like the idea of “free” agents driven by ad revenue.

Currently I have to manually review and build my own stuff. Not sustainable.

[0] https://en.wikipedia.org/wiki/Fiduciary


Disturbing, fascinating, obvious in hindsight.

Here’s another angle: a “bounce” back to Google that happens too quickly is a negative ranking signal. Keeping users from going back to Google, by making them think they in fact did, also makes this black hat SEO.


The slashdot trick. Still works and is actively used right now.


What exactly is the slashdot trick?


But Google doesn't see the bounce back. The site bounces back to a copy of Google's result page.


That is what the parent comment points out. He benefits from the fact that users can't return to google.


That's the point: Google doesn't see the bounce back, so you won't be negatively affected.


That's what they said.


Why do browsers allow changing the back button history before the visitor arrived at the domain? Seems like a subtle cross origin attack if that is truly what's happening.


I can imagine you could work around that issue just by once redirecting onto your own site first.

On the surface, it sounds like a difficult problem to solve safely. On a related note, I often have the back button not work because I hit back and Chrome cached a redirect to some other page and it immediately redirects again before I can even spam back again. I need to long press back to get a longer history to go back further.

This is a really interesting "attack" to see.



Because there's a 300-comment thread on Hacker News[1], where people complain that modern web-apps don't respect the back button. (Those people want it to go 'back' in state, inside the webapp, instead of bouncing you back to the previous website they visited.)

They say that it's easy to build a webapp that correctly uses the back button, to go back in state inside the application.

What they don't realize is that it opens up the security hole outlined here. When you allow the page you're on to overwrite your back button's behaviour, you get shit like this.

[1] https://news.ycombinator.com/item?id=17767260


Most people are just going to hit back until it looks like they're at the right site, so I don't think you'd have to change the previous history, just add your own entry after the one for the site they came from.


That would require storing where they came from as well as waiting for--or forcing--them to visit at least one more page on the attacker's site. So it'd still make backjacking harder.

Maybe pages that immediately redirect on first arrival should also not count toward back history.

Now a more perfect solution would require browsers to snapshot where the user came from, then block or warn about pages at the destination that look too similar. Though that seems unnecessarily complex for most users.


Easy, just push multiple history items.


I'm surprised he's willing to put his real name to this. I can't immediately see that it's actually illegal, but it still screams red flag for unethical behavior.


Not just his name, his whole company even!


How does a person get so much flak for hacking - on Hacker News?


Maybe because we're talking about Google? Seems like whenever Google is called into question on HN I've noticed a lot of appeals to authority and people defending them to a fault.


I'm not sure if you've noticed, but this isn't actually a website about hacking.


In many ways this is malicious deception. In any instance where a login form is included in the scraped mirror, that represents an attacked user, and a phishing attempt.

If someone did this in the wild, in an uncontrolled situation involving random strangers, it risks serious misinterpretation, and worse.


> I had this implemented for a very brief period of time (and for ethical reasons took it down almost immediately, realising that this may cause trouble)

The author did this in the wild, involving random strangers.


While we're on this topic, I have a related situation and wonder if my case is common:

I built a brochure site for a mom-and-pop business a decade ago. The domain expired some time ago, and it was snapped up by someone who repopulated it with the original content scraped from the Internet Archive. It looks and behaves exactly like it did when I controlled it, except that a phrase in the frontpage content now links to some supplement sale site.

Is there a name for this SEO bullshitery? What can someone do who isn't American and who therefore can't file a DMCA notice?


Sounds like that old site got bought by someone building a PBN (Private Blog Network).

They buy old domains, get the old content from archive.org, and then add a link in somewhere to their "money" site, or to another site in their tiered linking structure.

It's a BS tactic that can sometimes still work, but it's a LOT of effort to really keep up with it. TBH it's much easier to just actually make a site people want to use and reach out to people who might be interested in sharing it.

Hosting/managing 100's of sites just to prop up 1-2 money sites is too labor & time intensive for most of us. That said, there are some people making good money still using these tactics, as shady as they may be.


I've seen Dejan speak and I'd recommend following his work because he does very interesting black hat things like this in SEO. He has so many out of the box ideas like this which are brilliant.


Long time ago I wrote about this technique: http://mixedbit.org/referer.html Besides back button navigation, I also had ideas to use a fake malware warning or just take a victim directly to fake search engine results.


Update from site "Google’s team has tracked down my test site, most likely using the source code I shared and de-indexed the whole domain."


They don't up/down ranking individual sites for stuff like this. They've probably implemented back-hijack detection for the whole web


I just added a screenshot from search console: https://dejanseo.com.au/competitor-hack/

There's no manual penalty notice.


They de-indexed the domain. I think they do do this on an individual basis from time to time, especially with sites distributing (links to) illegally obtained content.


They didn't down rank it, they delisted it just like they would delist child porn or copyright material after receiving a DMCA notice.


I've read otherwise but couldn't find anything with a couple min free time. Anyone else read/know about this?


That would be marvellous.


This is easy to hate on, and certainly ethically dubious....but man do I love it.


[flagged]


Hey, that's not very nice.


This seems to have some fairly scary security implications if used maliciously, but I can't think of a good way to protect against this.

Does anyone know of a browser extension to limit access to the history API?


I started using NoScript a while back, just to see what the web is like without Javascript. My plan was to uninstall it when it got too annoying, but to my surprise it's actually not bad at all. I'm quite lax in whitelisting domains I actually trust, but even then it's nice that it doesn't load Javascript from all other umpteen domains, which is often the case.

Of course it's a very blunt weapon for blocking abuse like what's described in this blog post, but for sure it works.


In my mind, not giving the user opt-in control over this setting is a bug in the browser.


A couple of years back I was talking to someone who did SEO for a popular education network. The company was spending millions of dollars every month on SEO and advertising.

Their modus operandi went like this:

1. Offer money to license or buy a smaller competitor's content

2. If that doesn't work, crawl and clone the site

3. Pump a lot of money into Google Ads, so that the cloned site now appears as an ad above the legitimate site. Google makes such scams easier now by making the ads look like organic results - a non-technical user would hardly notice.

4. The legitimate site just dies.

I was asked to build a tool which crawls sites, which I refused. But I learned how professional SEO works.


This is somewhat risky, no? It's a clear case of copyright theft, and would be trivial to sue unless they hid everything behind cutouts to the extent they weren't traceable.

One would think the owner of the cloned site would notice lower traffic, search, and notice the ad scam. This strategy sounds like it would take months to execute before the competing site died, if not longer.

Am I missing something? I've read lots of seo scams that seem very hard to undo. This one seems....slow, very avoidable, with large legal and reputational risks for the perpetrator.


It’s never trivial to sue.


No, but in a case with a clear paper trail it wouldn't be hard to send a very clearly worded cease and desist showing exactly what damages are expected and how easy it is to prove.

Depending on the size of the company, it would also be possible to raise a big PR fuss, get on top of Hacker News, etc.

Plus there is DMCA takedown, Google's tools, etc. Those are trivial to use.


This is on the same level (IMO) as when Site A pays an SEO company to make Site B appear as spam (usually by spamming Site B's links on commenting systems with some vague texts).

Had read this on discussion forums a lot when "Scrapebox" and the likes were used (2010-12ish)


Sorry, calling BS on this one. By simply copying a site you get flagged as having duplicate content.

Then there's DMCA. I've seen an e-commerce site's homepage get de-indexed, killing the business, due to 1 single image being used for which the site owner didn't have copyright.

SEO undoubtedly has many shady practises, but "professional SEO" is actually really difficult and involves much more than cloning competitor sites and somehow getting away with it.


"Duplicate content" is a problem for both the original site and the copycat. But the copycat doesn't rely on SEO in this example. It just buys traffic in AdWords. So the duplicate content penalty would harm only the original site.


Google must surely be able to tell the difference between the original site and copying site because of timestamps. How would an Adwords campaign change that?


Maybe Google can tell the difference, but when I did SEO some years ago, we didn't rely on it. Duplicate content was considered a problem regardless of who published first.

Duplicate content is a problem for organic rankings. In paid search it may be a problem for the quality factor (not sure). But even if it impacts the quality factor you just have to pay more to achieve the same result.


It's the Golden Rule: He who pays the gold makes the rules.


1. You don't do this (plagiarism) on your main site.

2. DMCA is a US law. It may or may not apply, depending on the company I referred to. Also, going to court depends on a lot of factors.

3. You don't necessarily need to clone verbatim. You could generate content automatically (or with manual help) targeting the same keywords, but based on parsed content.

4. This is not trying to be in organic search results. Promoted solely via ads.

Sorry, my experience has been that SEO and content generation (as it happens today in Google and FB) rank high among shady and manipulative practices. Add: Of course, there are many good companies too.


How much is "duplicate content" penalised in reality? How many times have you searched for an error message or technical issue and got a link to StackOverflow, and also on the front page some ad-laden site that's just a direct scrape of the exact same SO page? It's a common occurrence for me.


Is this kind of blatant censorship, where Google delists information it doesn't like, common? It's not like the experiment was ongoing, is it?


Fun fact: you can do the same thing again, but use the AMP version and call yourself an AMP provider, just like Google does.

Technically they won't be able to complain because you can say providing AMP content assumes they want to be served by you, and you can fiddle as much as you want (e.g. adding tracking code) just like Google does when it serves someone else's content as AMP.


I just realized that it is not necessary to hijack the back button!

1) Watch out for users coming from Google (or Bing) using the referrer field.

2) Randomly choose 5% of them and redirect them to your shady domain using a temporary 303 redirect. [If Google notices this, they will hate you.]

3) Host a copy of your competitors page in the shady domain, with all the tracking enabled. [This is illegal! You may get a lawyer C&D, nastygram or more.]

I guess that when the user finds your site in Google and clicks the link, they will most of the time not be sure which link they chose, so they will not notice the change. And if they realize that they went to the wrong site, they will click the back button and click the search result again, and get the normal page like the other 95% of people.

This is probably more credible if the search field in the referrer doesn't have your site in it, so the user is looking for any generic site that includes you and your competitors.

As I said before, this is shady and some parts are illegal, so don't do it. Google may demote your site, and also you can get some legal problems.


This is really a good example of why it is so difficult for security experts to do research and experiments where real users are involved.

What Mr. Petrovic did is illegal in most developed countries: copyright violation (copying web pages) and monitoring and storing user behavior without their consent (and, even worse, by phishing). It doesn't matter that he did it for a "very brief period of time (for ethical reasons)". LOL. If I tried this kind of stuff where I work, I would have a long unpleasant talk with our legal/ethics department afterwards. I cannot even do a network scan on the Internet without first notifying God and a couple of lesser gods.

I am also wondering whether that's good publicity for the author's company. The author is basically saying: "We are doing things without being fully aware (or without caring) of the legal consequences. Are you sure you want to be our customer?".


When you do security work, that's an important part of your job. Sure, in many scenarios like traditional pentesting you can probably do fine within the legal boundaries in most jurisdictions, but as soon as you do serious security research where you actually test your ideas in practice, you're likely to cross the line sooner or later. It's the difference between "it should probably work" and "yes, it worked, I tried it." If you're afraid of the latter, don't get involved in security as you'll get burned sooner rather than later.


> you're likely to cross the line sooner or later.

That's basically the opposite of what security researchers working for companies and research institutes are doing. Document everything, get written consent of involved parties and sometimes even inform the police about a planned action. Make sure that you (a) don't cross the line or (b) move the line legally further away.

Of course, there are security experts who don't care about that. But they usually don't publish their results on a website with their real name.


I'd argue the most interesting and important research is done in this way. It's not that these security experts "don't care", it's just the very nature of certain problems that you need to test them against real users (as opposed to, say, testing an exploit against a system). Consider, for example, honeypot research: the very nature of such scenarios is that you can't even hint that users are tracked, let alone ask for their consent.


The legal aspects of honeypots were discussed a lot when they became popular. Just two examples:

https://www.symantec.com/connect/articles/honeypots-are-they...

https://www.researchgate.net/profile/William_Yurcik/publicat...

And they are still discussed, for example in the light of the new EU laws:

https://jis-eurasipjournals.springeropen.com/articles/10.118...


And they show quite well you can't have the cake and eat it. For example, Spitzner suggests displaying a banner... With all due respect, it's ridiculous. The whole point of this game is to make the attacker believe they're attacking the real system, not to make sure they "waive their privacy rights." I don't think anyone serious about really analyzing the behavior of attackers would ever care about these things. What is more dangerous is if a honeypot is used to attack another resource and you're sued by the owner, for example. It's really hard to avoid breaking a couple of eggs, no matter how you try.


> I'd argue the most interesting and important research is done in this way

Links please :)


I'm sorry, I don't think I can provide any links. As an example, imagine you found out about Stuxnet much earlier than the general public.


> But they usually don't publish their results on a website with their real name.

I can name a few who do, but I personally despise them after previous interactions with them and thus don’t want to inflate their ego with a mention.


In this particular example there's absolutely nothing that required touching or duplicating the sites of others, the same PoC would work just as well without using a "competitor's website", it could be tested in practice simply by using multiple domains/sites that you fully own.


He hasn't done it for a brief period of time, the article says he's run it for five years on one of his disposable domains?


What's concerning is that the post author seems not to see the problem with trying to sit on both sides of the fence at once.

As others have said, the way this was done is likely to be against numerous laws in most major jurisdictions. If you wish to do this as a PoC then simply put a notice up on the page that initiates it and use dummy "competitor" content, so you've got some semblance of user consent/transparency without copyright infringement. That would work just as well for flagging it up as a concern to others.

Or if being up-front about it is not the side you are on, do this fully admitting that it's wrong and face any consequences (it doesn't sound like this was the post author's aim, especially given follow-up comments).

For a "very brief period of time" doesn't cut the mustard here, just as it wouldn't with briefly stealing something from a bank or briefly kidnapping someone (both crimes where one could sometimes argue there may not be permanent damage, although even that likely isn't true in many cases).


The problem I'm seeing is not that the author did something unethical (there are plenty of black hats out there with no such concerns), but that the content can modify the browser chrome behaviors, and that the users trust the browser chrome a lot more than the content (as they should).

As a workaround, I recommend using separate Firefox containers for big sites, as the big sites are the main attack surface of a lot of people. I.e. Firefox containers for Google, Facebook, Microsoft. This attack would be stopped by using a Google container as the back button will not work once you step out of the container to go to the result page.

Sure, it won't help you on a targeted attack, but will help a lot with this kind of drag-net attack.


I don’t think your proposed workaround would help. The user stays on the malicious domain. Unless your container is clearly marked visually, only the url shows the savvy user that something is off.


Not sure what you read, I read about a guy who was aware of it and handled it well enough.


There's no need for "Not sure what you read" - it seems likely we read the same article, and simply that your opinion differs from mine, which is perfectly reasonable.

What causes me to consider that the post author's handling was not "good enough" is that this demonstration needn't have gone ahead with what seems to have been content copied without permission and then served up to people without their knowledge when they rightly expected it to be genuine content that had not been interfered with.

I didn't for a moment suggest he wasn't aware of it, I suggested he seemed to think he was able to do a bad thing whilst being good. This wasn't a scenario where the only option was to test it for real.


1. Years ago when I was learning web development I bought a TLD and just copy-pasted Amazon’s log in page to just check how it works. Amazon somehow found out about this and Google punished that TLD after that incident and it just couldn’t go up in rankings after that.

If I remember correctly they had even put that TLD on sites that report/list “phishing” sites so if you Googled about that TLD you would also get the “they are fraud” results.

2. I think that most pro users just New-Tab everything and go from there. Seems to me that going in and out of search results all in one tab is kind of slow too.


Pretty simple to find out by logging the referring site that is requesting assets from your server. I used to do this all the time about a decade ago when it was common practice to steal entire site designs.
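A minimal sketch of that kind of check in Python, assuming you already have the `Referer` header value of each asset request (for example, from your access log). The host names in `OWN_HOSTS` and the function name `is_foreign_referer` are placeholders invented for this example.

```python
from urllib.parse import urlsplit

# ASSUMPTION: hypothetical list of hosts that legitimately embed your assets.
OWN_HOSTS = {"example.com", "www.example.com"}


def is_foreign_referer(referer: str, own_hosts=OWN_HOSTS) -> bool:
    """Return True if an asset was requested from a page on someone else's host."""
    if not referer:
        # No header at all: direct request, or the browser stripped it.
        return False
    host = urlsplit(referer).hostname
    if host is None:
        # Malformed or relative value; nothing to compare against.
        return False
    return host not in own_hosts


# A copied login page pulling your stylesheet shows up as a foreign referer.
print(is_foreign_referer("https://copycat.example.net/login.html"))  # True
print(is_foreign_referer("https://www.example.com/products"))        # False
```

This only catches clones that still load assets from the original server, and the header can be suppressed by the copying site, so it is a cheap signal rather than a reliable defence.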

My favourite moment was when we changed a picture that someone was hot linking to as part of their own website and they emailed us with a rant saying "how dare you change the image"


I think you mean domain, not TLD.

The TLD is the last part of the domain, the .com, .eu, .in, ...


You bought a TLD when learning web development? That seems extreme.


Rookie mistake. They wanted just a domain, but accidentally the whole TLD.


This must be sarcasm.


This is certainly sarcasm.


Yesterday I was learning about processors, so I bought a foundry.

Doesn't everyone do this?


Misread "TLD" as "domain." Whoops.


You meant domain name, not TLD, right?


Uhh no, I created a country and ICANN assigned me a two-letter TLD. Was this not a part of your webdev bootcamp?


One more reason to kill the referer HTTP header, I guess.



> I had this implemented for a very brief period of time (for ethical reasons) and then moved to one of my disposable domains where it still runs after five years and ranks really well, though for completely different search terms.

Am I reading this correctly? He's been doing this since 2013 and still wants to use the white hat card?


Instead of downvoting I'd love if you could reply instead...


Interesting hack. Sorry about the whole google de-indexing thing. My question would be, did you really gain any useful insights? From competitors, you can normally figure out which page is their most viewed and then figure out how they merchandise it on their homepage. Without "hacking" it.


Just remove the push state history api. Set state is fine.

Push state is totally unnecessary since we already had a technology for this: anchors!

Instead of site.com/my/page, it's site.com/#/my/page.

What is wrong with this? It does literally everything you need and is supported by most routers out of the box!
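To make the anchor scheme concrete, here is a minimal sketch of how a router could read the route out of the fragment. It is written in Python purely for illustration (a real client-side router would do this in the browser), and `route_from_hash` is a hypothetical helper, not part of any framework.

```python
from urllib.parse import urlsplit


def route_from_hash(url: str) -> str:
    """Extract an application route such as "/my/page" from a "#/my/page" URL."""
    fragment = urlsplit(url).fragment
    # Treat a missing fragment, or one that is not a path, as the root route.
    return fragment if fragment.startswith("/") else "/"


print(route_from_hash("https://site.com/#/my/page"))  # /my/page
print(route_from_hash("https://site.com/"))           # /
```

Note that the fragment is never sent to the server in the HTTP request, so with this scheme the server always sees a request for site.com/ and all routing happens on the client.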


If Chrome outright disables JavaScript's ability to alter the "Back" path, it may break some (poorly designed) applications. A compromise is to prompt with a warning.


Break them then, with no warning. Letting Javascript trump browser controls (or spam confirmation popups) is a problem that never should have been allowed to live beyond the 90s.


But changing the behavior suddenly can make existing applications outright not function. That results in angry customers. A prompt is a decent compromise. Example prompt: "This website has altered the web address of the Back button. This can be risky. Do you want to use the application's version of the web address, or the original address? [Altered address, Original address, Cancel 'Back', More-info]"


Will Google be pushing out a fix for this vulnerability?


It seems my habit of opening Google links in new tabs with right click has more meaning now. I initially used this to avoid referral information.


It doesn't change the referer, but it keeps you from falling into this particular trap.

Anyway, using a new tab for each new website you visit is the way to go I think.


Maybe it's a good trade off for this to become default behavior in browsers (in the background unseen by users).


If the user does not see that she is operating in a new tab, she can still click "back" and would still be vulnerable to the "Fake Google SERP" trick.


Google even has a settings option where all SERP links open in a new tab. I personally use it myself.


That's not going to stop the referrer from being available to the page. In Chrome, hit F12 and in the console type:

    document.referrer


Bonus: you steal some ranking clout from the competitor since they don't get that precious click on Google search results.


That huge fixed navbar on mobile is just horrible. Can't read the article because of it.


That’s genius


[flagged]


We ban accounts that post like this so please stop and read the guidelines: https://news.ycombinator.com/newsguidelines.html.


And people still think javascript is a good idea...


Presenting yourself as someone else is called fraud in my book.

Changing the back button might be clever but all the rest is just simple. But people don't do this because I think in a lot of countries this is illegal.

There is a way you can protect your site a little from this: add canonical tags to all your pages. When an attacker updates the back button to Google they will have a hard time getting the cloned pages up in the results.


In this case he NoIndexed the clone pages anyway.


Honestly, it doesn't shock me in the slightest that someone who markets themselves as an SEO expert would not only do something as unethical as this, but also brag about it, as though they think they've done something they should be proud of.


Is this that different than publicizing bugs? He tested it for a small amount of time, noticed a real security vulnerability (he could collect leads), and publicized his findings knowing Google would likely punish him for it.

It's mildly unethical at worst, considering he could have happily done profitable leadgen at scale and it would have likely never been caught if he kept quiet.


That's exactly right. Google has already punished me once for exposing a flaw (which they still haven't fixed): https://news.ycombinator.com/item?id=4748094


Except he would have been caught by a few of his users.

If I notice a scummy page impersonating Google, I'm gonna alert Google so they can do something about it. (For example add the page to the safebrowsing list)


Google very very rarely does anything with individual complaints sent in with their automated systems. You'd have to get a post to the top of HN.


"Years"

"Small amount of time"

?!?


FWIW, Dan (the author) has an outstanding reputation for professionalism and integrity in the marketing world. If he says he did something for ethical reasons, to those who know him, he's earned the right to be believed. (If you don't know him, you'd be forgiven for being suspicious.)

And credit should be given to him for educating everyone on this exploit.


It'd be very easy to make a proof of concept of this exploit which didn't breach copyright or record people's personal information and then to publicise the problem immediately. Instead he chose to operate on real sites, collect real personal data and then forget about it for 5 years. It's this general lax attitude that gives everyone working in the SEO sector -- and by extension the tech sector as a whole -- a bad reputation. The whole experiment doesn't feel like it was conducted in good faith or with any consideration for the ethics beyond 'hey this is cool'. Grow up!


Thank you Cyrus. I thought it would be obvious that this isn't a practical tactic a reputable brand could risk doing.


While it is clear that you did not have any bad intentions, you should never have published it on the web. Based on your earlier comment "It worked a little too well" it becomes clear that multiple users were tricked by your site and that you possibly even intercepted submitted forms ("I gasped when I realised I can actually capture all form submissions and send them to my own email.").

You misled people and breached their privacy. This is as simple as it gets, even if it was for an experiment (though leaving the site online in some other form still raises a lot of question marks..).

My advice for you is to perform future experiments locally, not on the web and make sure people participating in your experiment are aware.


The point of the experiment was the social engineering aspect. The fact that it would work technologically was obvious. The fact that it would work practically was what he set out to prove.



